王金锋 / mobvista-dmp
Commit 7a8e01b8, authored Aug 12, 2021 by WangJinfeng

fix rtdmp, adn_sdk_daily

parent 55b0991b

Showing 3 changed files with 14 additions and 14 deletions
azkaban/adn_sdk/adn_sdk_daily.sh  +3 -3
azkaban/rtdmp/rtdmp.sh  +6 -4
...n/scala/mobvista/dmp/datasource/adn_sdk/AdnSdkDaily.scala  +5 -7
azkaban/adn_sdk/adn_sdk_daily.sh
...
@@ -26,8 +26,8 @@ spark-submit --class mobvista.dmp.datasource.adn_sdk.AdnSdkDaily \
   --conf spark.app.loadTime=${LOG_TIME} \
   --conf spark.app.input_path=${INPUT_PATH} \
   --conf spark.app.output_path=${OUTPUT_PATH} \
-  --conf spark.sql.shuffle.partitions=5000 \
-  --conf spark.default.parallelism=5000 \
+  --conf spark.sql.shuffle.partitions=2000 \
+  --conf spark.default.parallelism=2000 \
   --conf spark.shuffle.memoryFraction=0.4 \
   --conf spark.storage.memoryFraction=0.4 \
   --conf spark.driver.maxResultSize=8g \
...
@@ -35,7 +35,7 @@ spark-submit --class mobvista.dmp.datasource.adn_sdk.AdnSdkDaily \
   --conf spark.app.coalesce=60000 \
   --files ${HIVE_SITE_PATH} \
   --jars ${JARS} \
-  --master yarn --deploy-mode cluster --name adn_sdk_daily --executor-memory 18g --driver-memory 6g --executor-cores 5 --num-executors 80 \
+  --master yarn --deploy-mode cluster --name adn_sdk_daily --executor-memory 8g --driver-memory 6g --executor-cores 4 --num-executors 200 \
   ../${JAR}
 if [[ $? -ne 0 ]]; then
...
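Reviewer note: the second hunk of adn_sdk_daily.sh moves from 80 executors with 5 cores and 18g each to 200 executors with 4 cores and 8g each. The sketch below works out the aggregate executor request before and after. It is illustration only: the helper names `total_cores` and `total_memory_gb` are hypothetical and not part of the repository, and the arithmetic ignores the driver and any YARN memory overhead.

```shell
#!/usr/bin/env bash
# Aggregate executor resources implied by spark-submit flags.
# Hypothetical helpers for illustration; driver and YARN overhead are ignored.

# total_cores NUM_EXECUTORS EXECUTOR_CORES
total_cores() {
  echo $(( $1 * $2 ))
}

# total_memory_gb NUM_EXECUTORS EXECUTOR_MEMORY_GB
total_memory_gb() {
  echo $(( $1 * $2 ))
}

# Values taken from the adn_sdk_daily.sh diff.
echo "before: $(total_cores 80 5) cores, $(total_memory_gb 80 18) GB"
echo "after:  $(total_cores 200 4) cores, $(total_memory_gb 200 8) GB"
```

Under these assumptions the job goes from 400 to 800 executor cores while total executor memory rises only from 1440 GB to 1600 GB, so memory per core drops from 3.6 GB to 2 GB.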
azkaban/rtdmp/rtdmp.sh
...
@@ -18,18 +18,20 @@ OLD_MERGE_INPUT="s3://mob-emr-test/dataplatform/DataWareHouse/data/dwh/audience_
 check_await ${OLD_MERGE_INPUT}/_SUCCESS
 sleep 120
 OUTPUT="s3://mob-emr-test/dataplatform/DataWareHouse/data/dwh/audience_merge/${date_path}"
 spark-submit --class mobvista.dmp.datasource.rtdmp.RTDmpMain \
   --name "RTDmpMain.${date_time}" \
-  --conf spark.sql.shuffle.partitions=1000 \
-  --conf spark.default.parallelism=1000 \
+  --conf spark.sql.shuffle.partitions=2000 \
+  --conf spark.default.parallelism=2000 \
   --conf spark.kryoserializer.buffer.max=512m \
   --conf spark.kryoserializer.buffer=64m \
   --master yarn --deploy-mode cluster \
-  --executor-memory 18g --driver-memory 6g --executor-cores 5 --num-executors 40 \
+  --executor-memory 8g --driver-memory 6g --executor-cores 4 --num-executors 100 \
   .././DMP.jar \
-  -datetime ${date_time} -old_datetime ${old_date_time} -input ${INPUT} -output ${OUTPUT} -coalesce 200
+  -datetime ${date_time} -old_datetime ${old_date_time} -input ${INPUT} -output ${OUTPUT} -coalesce 400
 if [[ $? -ne 0 ]]; then
   exit 255
...
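Reviewer note: in rtdmp.sh the partition count and the executor layout change together (1000 partitions on 40 executors with 5 cores each, then 2000 partitions on 100 executors with 4 cores each). A rough way to compare the two settings is the number of shuffle partitions per executor core. The helper `partitions_per_core` below is a hypothetical name used only for this sketch, and it uses integer division, so it is a coarse estimate rather than a scheduling model.

```shell
#!/usr/bin/env bash
# Shuffle partitions per executor core, as a coarse sizing check.
# Hypothetical helper for illustration; uses integer division.

# partitions_per_core PARTITIONS NUM_EXECUTORS EXECUTOR_CORES
partitions_per_core() {
  echo $(( $1 / ($2 * $3) ))
}

# Values taken from the rtdmp.sh diff.
echo "before: $(partitions_per_core 1000 40 5) partitions per core"
echo "after:  $(partitions_per_core 2000 100 4) partitions per core"
```

Both settings come out at 5 partitions per core, which suggests the partition count was scaled in step with the doubled core count; the commit message itself does not state this intent.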
src/main/scala/mobvista/dmp/datasource/adn_sdk/AdnSdkDaily.scala
 package mobvista.dmp.datasource.adn_sdk
-import java.net.URI
-import java.text.SimpleDateFormat
-import java.util.{Date, Properties}
 import com.alibaba.fastjson.{JSON, JSONArray, JSONObject}
 import mobvista.dmp.common.MobvistaConstant
 import mobvista.dmp.datasource.apptag.Constant
 import mobvista.dmp.datasource.datatory.ConstantV2
 import org.apache.commons.lang3.StringUtils
 import org.apache.hadoop.fs.{FileSystem, Path}
 import org.apache.spark.broadcast.Broadcast
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.types._
-import org.apache.spark.sql.{SparkSession, _}
+import org.apache.spark.sql._
 import org.apache.spark.storage.StorageLevel
+import java.net.URI
+import java.text.SimpleDateFormat
+import java.util.Date
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
 import scala.util.control.Breaks._
...
@@ -94,7 +92,7 @@ object AdnSdkDaily extends Serializable {
         }
         linesArr.toIterator
       })).repartition(2000)
     filter_rdd.persist(StorageLevel.MEMORY_AND_DISK_SER)
     /*{
...