mobvista-dmp
Commit e4336377
Authored Oct 13, 2021 by fan.jiang

fix bug rtdmp_normal

Parent: 68629eef
Showing 1 changed file with 17 additions and 8 deletions

src/main/scala/mobvista/dmp/datasource/dm/RtdmpNormal.scala (+17 -8)
@@ -121,15 +121,24 @@ class RtdmpNormal extends CommonSparkJob with Serializable {
         val package_name: String = array(index)._4
         val country_code: String = array(index)._5
         println(inputPath)
-        inputDataRdd = inputDataRdd.union(spark.sparkContext.textFile(inputPath).map(row => {
-          if (row.length == 32) {
-            DmpDailyDataInformation(row, device_type_md5, platform, package_name, country_code)
-          }
-          else {
-            DmpDailyDataInformation(row, device_type_not_md5, platform, package_name, country_code)
-          }
-        }
+        val pathUri = new URI(inputPath)
+        // Filter out S3 paths that do not exist, such as s3://mob-emr-test/dataplatform/rtdmp_request/2021/07/10/dsp_req/com.taobao.idlefish_bes/*/,
+        if (FileSystem.get(new URI(s"${pathUri.getScheme}://${pathUri.getHost}"), sc.hadoopConfiguration)
+          .exists(new Path(pathUri.toString.replace("*", "")))){
+          inputDataRdd = inputDataRdd.union(spark.sparkContext.textFile(inputPath).map(row => {
+            if (row.length == 32) {
+              DmpDailyDataInformation(row, device_type_md5, platform, package_name, country_code)
+            }
+            else {
+              DmpDailyDataInformation(row, device_type_not_md5, platform, package_name, country_code)
+            }
+          }
         ))
+        }
+        else {
+          println(inputPath + " not existed!")
+          inputDataRdd = inputDataRdd.union(spark.sparkContext.emptyRDD[DmpDailyDataInformation])
+        }
       }

       val df: DataFrame = inputDataRdd.toDF().persist(StorageLevel.MEMORY_AND_DISK_SER)
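
The existence check in this commit derives two values from `inputPath`: the filesystem root (`scheme://host`) and the path with every `*` removed. The sketch below isolates only that string derivation so it can be run without Spark or Hadoop. The object name `PathCheckSketch`, the helper `checkTargets`, and the example bucket are hypothetical and not part of the commit.

```scala
import java.net.URI

// Hypothetical helper object, not part of RtdmpNormal.scala.
object PathCheckSketch {
  // Returns (filesystem root URI, input path with all "*" removed),
  // mirroring the two expressions used in the commit's if-condition.
  def checkTargets(inputPath: String): (String, String) = {
    val pathUri = new URI(inputPath)
    val fsRoot = s"${pathUri.getScheme}://${pathUri.getHost}"
    // URI.toString gives back the original string, so this strips
    // the wildcard segments but leaves the surrounding slashes.
    val concretePath = pathUri.toString.replace("*", "")
    (fsRoot, concretePath)
  }

  def main(args: Array[String]): Unit = {
    // Example path is an assumption chosen for illustration.
    val (root, path) = checkTargets("s3://example-bucket/rtdmp_request/2021/07/10/dsp_req/com.example.app/*/")
    println(root)
    println(path)
  }
}
```

In the commit itself, the first value is passed to `FileSystem.get(..., sc.hadoopConfiguration)` and the second is wrapped in `new Path(...)` for the `exists` call. Note that removing `*` checks only the parent directory, so a directory that exists but has no matching children would still pass the check.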