Posted to user@spark.apache.org by shishu <sh...@zamplus.com> on 2014/09/18 10:46:39 UTC

Spark run slow after unexpected repartition

Hi dear all~

My Spark application sometimes runs much slower than it used to, and I
wonder why this happens.

I found that after an unexpected repartition at stage 17, all tasks go to one
executor. But in my code, I only use repartition at the very beginning.
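
Roughly, this is the pattern I mean (a simplified sketch with placeholder
names such as inputPath and parseRecord, not my actual code):

import org.apache.spark.{SparkConf, SparkContext}

// Placeholder setup and names (inputPath, parseRecord); not my actual code.
val sc = new SparkContext(new SparkConf().setAppName("example"))
val lines = sc.textFile(inputPath)

// The only explicit repartition is here, right after loading the input,
// to spread the records over the cluster before the later stages run.
val records = lines.map(parseRecord).repartition(48).cache()

// Every later stage only applies ordinary transformations to `records`,
// so I would not expect another repartition to show up around stage 17.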

In my application, before stage 17, every stage finishes successfully within 1
minute, but after stage 17, every stage takes more than 10 minutes.
Normally my application runs successfully and finishes within 9 minutes.

My Spark version is 0.9.1, and my program is written in Scala.

 

I took some screenshots; you can see them in the archive.

 

Great thanks if you can help~

 

Shi Shu

 


Re: Spark run slow after unexpected repartition

Posted by matthes <md...@sensenetworks.com>.
I have the same problem! I run the same job 3 or 4 times in a row; how quickly
it shows up depends on how big the data and the cluster are. Performance goes
down in the following jobs, and at the end I get the fetch failure error; at
that point I have to restart the Spark shell and everything works well again.
And I don't use the caching option!

By the way, I see the same behavior with different jobs!




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-run-slow-after-unexpected-repartition-tp14542p15416.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark run slow after unexpected repartition

Posted by Tan Tim <un...@gmail.com>.
I also encountered a similar problem: after some stages, all the tasks
are assigned to one machine, and the stage execution gets slower and slower.

*[the spark conf setting]*
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val conf = new SparkConf()
  .setMaster(sparkMaster)
  .setAppName("ModelTraining")
  .setSparkHome(sparkHome)
  .setJars(List(jarFile))
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "LRRegistrator")
conf.set("spark.storage.memoryFraction", "0.7")
conf.set("spark.executor.memory", "8g")
conf.set("spark.cores.max", "150")
conf.set("spark.speculation", "true")
conf.set("spark.storage.blockManagerHeartBeatMs", "300000")

val sc = new SparkContext(conf)
val lines = sc.textFile("hdfs://xxx:52310" + inputPath, 3)
// Repartition to 50 partitions and keep the training set in memory.
val trainset = lines.map(parseWeightedPoint)
  .repartition(50)
  .persist(StorageLevel.MEMORY_ONLY)
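
To check whether the repartition actually spreads records evenly over the 50
partitions, I can count the records per partition like this (just a quick
diagnostic sketch, not part of the training code):

// Count how many records end up in each partition of the repartitioned RDD.
val partitionSizes = trainset
  .mapPartitionsWithIndex((idx, it) => Iterator((idx, it.size)))
  .collect()
partitionSizes.sortBy(_._1).foreach { case (idx, n) =>
  println("partition " + idx + ": " + n + " records")
}

If one partition holds almost all the records, that would explain why all the
tasks land on a single executor.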

*[the warn log from the spark]*
14/09/19 10:26:23 WARN TaskSetManager: Loss was due to fetch failure from
BlockManagerId(45, TS-BH109, 48384, 0)
14/09/19 10:27:18 WARN TaskSetManager: Lost TID 726 (task 14.0:9)
14/09/19 10:29:03 WARN SparkDeploySchedulerBackend: Ignored task status
update (737 state FAILED) from unknown executor
Actor[akka.tcp://sparkExecutor@TS-BH96:33178/user/Executor#-913985102] with
ID 39
14/09/19 10:29:03 WARN TaskSetManager: Loss was due to fetch failure from
BlockManagerId(30, TS-BH136, 28518, 0)
14/09/19 11:01:22 WARN BlockManagerMasterActor: Removing BlockManager
BlockManagerId(47, TS-BH136, 31644, 0) with no recent heart beats: 47765ms
exceeds 45000ms

Any suggestions?

On Thu, Sep 18, 2014 at 4:46 PM, shishu <sh...@zamplus.com> wrote:

>  Hi dear all~
>
> My Spark application sometimes runs much slower than it used to, and I
> wonder why this happens.
>
> I found that after an unexpected repartition at stage 17, all tasks go to one
> executor. But in my code, I only use repartition at the very beginning.
>
> In my application, before stage 17, every stage finishes successfully within 1
> minute, but after stage 17, every stage takes more than 10 minutes.
> Normally my application runs successfully and finishes within 9 minutes.
>
> My Spark version is 0.9.1, and my program is written in Scala.
>
>
>
> I took some screenshots; you can see them in the archive.
>
>
>
> Great thanks if you can help~
>
>
>
> Shi Shu
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>