Project: 王金锋 / mobvista-dmp
Commit e4336377
Authored Oct 13, 2021 by fan.jiang

fix bug rtdmp_normal

Parent: 68629eef
Showing 1 changed file with 9 additions and 0 deletions

src/main/scala/mobvista/dmp/datasource/dm/RtdmpNormal.scala (+9 -0)
@@ -121,6 +121,10 @@ class RtdmpNormal extends CommonSparkJob with Serializable {
         val package_name: String = array(index)._4
         val country_code: String = array(index)._5
         println(inputPath)
+        val pathUri = new URI(inputPath)
+        // Filter out S3 paths like the following that do not exist: s3://mob-emr-test/dataplatform/rtdmp_request/2021/07/10/dsp_req/com.taobao.idlefish_bes/*/
+        if (FileSystem.get(new URI(s"${pathUri.getScheme}://${pathUri.getHost}"), sc.hadoopConfiguration)
+          .exists(new Path(pathUri.toString.replace("*", "")))){
         inputDataRdd = inputDataRdd.union(spark.sparkContext.textFile(inputPath).map(row => {
           if (row.length == 32) {
             DmpDailyDataInformation(row, device_type_md5, platform, package_name, country_code)
@@ -130,6 +134,11 @@ class RtdmpNormal extends CommonSparkJob with Serializable {
           }
         }))
+        }else{
+          println(inputPath + " not existed!")
+          inputDataRdd = inputDataRdd.union(spark.sparkContext.emptyRDD[DmpDailyDataInformation])
+        }
       }
       val df: DataFrame = inputDataRdd.toDF().persist(StorageLevel.MEMORY_AND_DISK_SER)
...
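
Note on the existence check added in this commit: it builds two strings from `inputPath`, a file-system root (scheme plus host) passed to `FileSystem.get`, and a probe path with every `*` removed, which is passed to `exists`. The sketch below reproduces only that string handling with `java.net.URI`, so it runs without Spark or Hadoop. The object name `PathCheckSketch`, the helpers `fsRoot` and `probePath`, and the bucket `example-bucket` are hypothetical and are not part of the commit.

```scala
import java.net.URI

// Hypothetical helper object, for illustration only; not code from RtdmpNormal.
object PathCheckSketch {
  // Root URI of the file system that owns the path (scheme + host),
  // mirroring s"${pathUri.getScheme}://${pathUri.getHost}" in the diff.
  def fsRoot(inputPath: String): String = {
    val uri = new URI(inputPath)
    s"${uri.getScheme}://${uri.getHost}"
  }

  // Literal path handed to exists(): every '*' is dropped, nothing else changes.
  def probePath(inputPath: String): String =
    new URI(inputPath).toString.replace("*", "")

  def main(args: Array[String]): Unit = {
    // Assumed example path with a trailing wildcard segment.
    val inputPath = "s3://example-bucket/rtdmp_request/2021/07/10/dsp_req/com.example.app/*/"
    println(fsRoot(inputPath))    // s3://example-bucket
    println(probePath(inputPath)) // s3://example-bucket/rtdmp_request/2021/07/10/dsp_req/com.example.app//
  }
}
```

For a trailing `*/` the probe is effectively the wildcard's parent directory, so the check answers "does the parent exist", not "does the glob match anything". A wildcard in the middle of a path would probe a different location once the `*` is removed. If that case matters, matching the glob itself with Hadoop's `FileSystem.globStatus` may be a closer fit, though that is a suggestion and not something this commit does.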