王金锋 / mobvista-dmp
Commit 9bc44af8 authored Sep 22, 2021 by fan.jiang
fix bug cn_good_channel
parent ca633a12
Showing 1 changed file with 13 additions and 4 deletions
src/main/scala/mobvista/dmp/datasource/dm/CnGoodChannel.scala
...
@@ -36,6 +36,16 @@ class CnGoodChannel extends CommonSparkJob with Serializable {
     options
   }
+  def buildRes(row: String): Array[Row] = {
+    var data = new ArrayBuffer[Row]()
+    val length = row.split("\t", -1).length
+    if (length == 4) {
+      val package_name = row.split("\t", -1)(3)
+      data += Row(row.split("\t", -1)(0), row.split("\t", -1)(1), package_name.substring(2, package_name.lastIndexOf("\"")))
+    }
+    data.toArray
+  }
   override protected def run(args: Array[String]): Int = {
     val commandLine = commParser.parse(options, args)
     if (!checkMustOption(commandLine)) {
...
@@ -81,10 +91,9 @@ class CnGoodChannel extends CommonSparkJob with Serializable {
     FileSystem.get(new URI(s"s3://mob-emr-test"), spark.sparkContext.hadoopConfiguration).delete(new Path(output5), true)
     try {
-      val old_data: RDD[Row] = sc.textFile(old_data_path).map(row => {
-        val package_name = row.split("\t", -1)(3)
-        Row(row.split("\t", -1)(0), row.split("\t", -1)(1), package_name.substring(2, package_name.lastIndexOf("\"")))
-      })
+      // Some non-conforming oiad-type device IDs contain the special character ^M, which corrupts the \t-separated data, so records are now filtered by the condition in buildRes. Example: fddeee78^Mb6ff-f118-9ddf-ef9ed3ffcac0
+      val old_data: RDD[Row] = sc.textFile(old_data_path).flatMap(buildRes(_))
       val schema: StructType = StructType(Array(
         StructField("device_id", StringType),
...
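The change swaps `map` for `flatMap` so that a malformed line yields zero rows instead of failing on `row.split("\t", -1)(3)`. The sketch below illustrates that pattern in plain Scala, without Spark on the classpath. It rests on several assumptions: `BuildResSketch` and `parseLine` are hypothetical names, a tuple stands in for Spark's `Row`, a `Seq` stands in for the `RDD`, and the sample lines are invented for illustration. It also assumes that the stray ^M leaves a fragment with fewer than four tab-separated fields. Unlike the commit, it splits each line once and adds a guard for a missing closing quote.

```scala
// Hypothetical stand-in for the commit's buildRes: plain Scala, no Spark dependency.
object BuildResSketch {

  // Returns Some((field0, field1, packageName)) for a well-formed line, None otherwise.
  def parseLine(line: String): Option[(String, String, String)] = {
    // Split once; limit -1 keeps trailing empty fields, as in the commit.
    val fields = line.split("\t", -1)
    if (fields.length == 4) {
      val pkg = fields(3)
      val end = pkg.lastIndexOf("\"")
      // Same slice as the commit: skip the first two characters, stop at the last quote.
      // The end >= 2 guard is an addition of this sketch, not part of the commit.
      if (end >= 2) Some((fields(0), fields(1), pkg.substring(2, end))) else None
    } else {
      None // wrong field count: drop the line instead of throwing
    }
  }

  def main(args: Array[String]): Unit = {
    // Invented sample data: one well-formed line and one fragment like the
    // one a stray carriage return might leave behind.
    val lines = Seq(
      "dev1\toaid\tx\t[\"com.example.app\"]",
      "fddeee78"
    )
    // flatMap flattens the Options, so the malformed fragment simply disappears.
    val parsed = lines.flatMap(parseLine)
    parsed.foreach(println)
  }
}
```

Returning an `Option` (or, as in the commit, an array of zero or one `Row`) lets `flatMap` act as a combined filter and transform, so no separate `filter` pass is needed.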