Commit df772ee5 (王金锋 / mobvista-dmp)
authored Aug 30, 2021 by WangJinfeng
set spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true
parent 3397f569
Showing 2 changed files with 2 additions and 1 deletion
azkaban/rtdmp/rtdmp_pre.sh (+1 -0)
...in/scala/mobvista/dmp/datasource/rtdmp/RTDmpMainPre.scala (+1 -1)
azkaban/rtdmp/rtdmp_pre.sh
@@ -17,6 +17,7 @@ spark-submit --class mobvista.dmp.datasource.rtdmp.RTDmpMainPre \
   --conf spark.kryoserializer.buffer.max=256m \
   --conf spark.sql.adaptive.enabled=true \
   --conf spark.sql.adaptive.advisoryPartitionSizeInBytes=134217728 \
+  --conf spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true \
   --master yarn --deploy-mode cluster --executor-memory 12g --driver-memory 8g --executor-cores 5 --num-executors 20 \
   ../${JAR} -time "${date_time}" -data_utime "${date_time}" -output ${OUTPUT} -coalesce 100
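The added line relies on Spark forwarding any property whose key starts with `spark.hadoop.` into the Hadoop `Configuration`, with that prefix removed, so `mapreduce.input.fileinputformat.input.dir.recursive=true` reaches the input format. The following is a minimal sketch of how the `--conf` list could be assembled and inspected without submitting a job. `conf_args` is a hypothetical helper that is not part of this commit, and the command is only printed, not run.

```shell
#!/bin/sh
# Sketch only: print the spark-submit command rather than running it.
# Property names and values are taken from rtdmp_pre.sh;
# conf_args is a hypothetical helper, not part of the repository.

# Emit one " --conf key=value" pair for each argument.
conf_args() {
  for kv in "$@"; do
    printf ' --conf %s' "$kv"
  done
}

# The last entry is the setting this commit adds. Spark strips the
# "spark.hadoop." prefix and places the rest in the Hadoop Configuration.
SPARK_CONF=$(conf_args \
  "spark.kryoserializer.buffer.max=256m" \
  "spark.sql.adaptive.enabled=true" \
  "spark.sql.adaptive.advisoryPartitionSizeInBytes=134217728" \
  "spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true")

echo "spark-submit${SPARK_CONF} --master yarn --deploy-mode cluster"
```

Printing the command first makes it easy to confirm that every `--conf` pair is present before the trailing backslashes in the real script hide a missing line.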
src/main/scala/mobvista/dmp/datasource/rtdmp/RTDmpMainPre.scala
@@ -99,7 +99,7 @@ class RTDmpMainPre extends CommonSparkJob with Serializable {
         val pathUri = new URI(list.get(0)._1)
         val newAudience = if (FileSystem.get(new URI(s"${pathUri.getScheme}://${pathUri.getHost}"), sc.hadoopConfiguration)
           .exists(new Path(pathUri.toString.replace("*", "")))) {
-          val rdd = sc.newAPIHadoopFile(list.get(0)._1, fc, kc, vc, sc.hadoopConfiguration)
+          val rdd = sc.newAPIHadoopFile(list.get(0)._1.replace("*", ""), fc, kc, vc, sc.hadoopConfiguration)
           val linesWithFileNames = rdd.asInstanceOf[NewHadoopRDD[LongWritable, Text]]
             .mapPartitionsWithInputSplit((inputSplit, iterator) => {
               val file = inputSplit.asInstanceOf[FileSplit]
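With this change, the path passed to `newAPIHadoopFile` is the same glob-free path that the `exists` check already used. A bare directory path presumably only picks up files in nested subdirectories when recursive listing is enabled, which would explain why this edit lands together with the new `--conf` line in rtdmp_pre.sh. The sketch below mirrors the string transformation in shell. `strip_glob` is a hypothetical helper and the input path is made up for illustration.

```shell
#!/bin/sh
# strip_glob is a hypothetical helper mirroring Scala's
# path.replace("*", ""): it deletes every literal asterisk.
strip_glob() {
  printf '%s' "$1" | tr -d '*'
}

# Made-up input path, for illustration only.
INPUT='s3://example-bucket/rtdmp/audience/*'
DIR_PATH=$(strip_glob "$INPUT")

# The result names the directory itself rather than a glob over its children.
echo "$DIR_PATH"
```

Note that the replacement removes every asterisk, not only a trailing one, so a pattern with a wildcard in the middle of the path would be rewritten to a different directory rather than expanded.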