Posted to user@spark.apache.org by li...@itri.org.tw on 2019/02/01 02:53:47 UTC

Task skew or Data skew problem for Spark Standalone (2.3.1)

Dear all,

I have a problem with task skew and data skew when processing real-time data (via Kafka) with Spark Streaming.

When one of the executors crashes, task skew and data skew occur in my project, as shown in the figures.

For example, on ubuntu8, because 3 executors crashed (I am not sure about this), the 50,000 records are placed on one executor: ubuntu8:34168.
Figure 1: Executors crash

Performance is normal (no crashed executors) for most streaming windows:
Figure 2: Normal performance

Figure 3: Poor performance



The experimental setup of my project is as follows.
Real-time data rate (via Kafka): 100,000 records per second
Read one topic: kafkasink2
Kafka Broker: 2.10-0.10.1.1 (Scala 2.10, Kafka 0.10.1.1)
  Broker node at ubuntu7
    One topic: kafkasink2 (number of partitions: 8)

The runtime environment is as follows:
OS: Ubuntu 14.04.4 LTS
Versions of related tools:
java version: "1.8.0_151"
Spark version: 2.3.1 Standalone mode
  Execution condition:
  Master/Driver node: ubuntu7
  Worker nodes: ubuntu8 (4 Executors); ubuntu9 (4 Executors)
Number of executors: 8
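
As a rough consistency check on the numbers above, here is a minimal sketch of the expected per-executor load. It assumes a 1-second batch interval (not stated in this post), records spread evenly over the 8 Kafka partitions, and one Spark task per Kafka partition per batch; the function name is hypothetical.

```python
def records_per_executor(rate_per_sec, num_partitions, batch_seconds, partitions_on_executor):
    """Estimate records handled by one executor in one batch.

    Assumes records are spread evenly over the Kafka partitions and that
    each partition is processed by exactly one task per batch.
    """
    per_partition = rate_per_sec * batch_seconds / num_partitions
    return per_partition * partitions_on_executor


# Healthy case: 8 executors, each processing 1 of the 8 partitions.
normal_load = records_per_executor(100_000, 8, 1, 1)

# Degraded case (assumption): 3 of the 4 executors on one worker are gone,
# and the surviving executor processes all 4 partitions of that worker.
skewed_load = records_per_executor(100_000, 8, 1, 4)

print(normal_load, skewed_load)
```

Under these assumptions the surviving executor would handle 50,000 records per batch, four times the normal 12,500. That would be consistent with the 50,000 records seen on ubuntu8:34168, although it is only one possible explanation.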

Driver settings (spark-defaults.conf):
spark.cores.max=8

spark.executor.instances=8
spark.executor.cores=1
spark.executor.memory=2048m

spark.default.parallelism=8

spark.driver.cores=4
spark.driver.memory=2048m

spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC
spark.executor.extraJavaOptions=-Xss100M

spark.shuffle.consolidateFiles=true
spark.streaming.unpersist=true
spark.streaming.stopGracefullyOnShutdown=true

spark.blacklist.enabled=true
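
Note that spark.executor.extraJavaOptions appears twice in the settings above. The sketch below illustrates what would happen if the file is loaded with last-value-wins semantics for a repeated key (as java.util.Properties behaves); whether Spark 2.3.1 treats this file exactly that way is an assumption here, and the parser is a hypothetical simplification that only handles key=value lines.

```python
def load_properties(text):
    """Parse key=value lines; a repeated key keeps only its last value."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


conf_text = """
spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC
spark.executor.extraJavaOptions=-Xss100M
"""

effective = load_properties(conf_text)
print(effective["spark.executor.extraJavaOptions"])

# Combining both JVM flags on a single line avoids the ambiguity.
merged_line = "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC -Xss100M"
print(load_properties(merged_line)["spark.executor.extraJavaOptions"])
```

If that assumption holds, only -Xss100M takes effect and the CMS garbage collector flag is silently dropped, so it may be worth putting both flags on one line when checking the executor crashes.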



If anyone can point us in a direction that helps us overcome this problem, we would appreciate it.
Thanks.

Rick


