Posted to issues@spark.apache.org by "Kevin (Sangwoo) Kim (JIRA)" <ji...@apache.org> on 2014/05/29 04:02:01 UTC

[jira] [Comment Edited] (SPARK-1112) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution

    [ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011968#comment-14011968 ] 

Kevin (Sangwoo) Kim edited comment on SPARK-1112 at 5/29/14 2:01 AM:
---------------------------------------------------------------------

Hi all, 

I'm very new to Spark, and while doing some tests I've experienced a similar issue.
(tested with the Spark shell, 0.9.1, on an r3.8xlarge EC2 instance: 32 cores / 244 GiB RAM)

I was trying to broadcast 700 MB of data, and Spark hangs when I run the collect() method on the data.

Here are the strange things:
1) When I tried
{code}val userInfo = sc.textFile("file:///spark/logs/user_sign_up2.csv").map{line => val split = line.split(","); (split(1), split)}{code}
it runs well.
2) When I tried
{code}val userInfo = sc.textFile("file:///spark/logs/user_sign_up2.csv").map{line => val split = line.split(","); (split(1), split(5))}{code}
Spark hangs.
3) When I slightly reduce the data size, either with the sample() method or by cutting down the data file, it runs well (see the sketch below).
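
For reference, here is a minimal sketch of the workaround in point 3; the file path comes from the snippets above, while the sample fraction and seed are illustrative assumptions:
{code}
// Hypothetical sketch: shrink the collected result by sampling before collect().
// Fraction and seed are arbitrary choices, not values from our actual test.
val userInfo = sc.textFile("file:///spark/logs/user_sign_up2.csv")
  .map { line => val split = line.split(","); (split(1), split(5)) }
  .sample(false, 0.1, 42)  // withReplacement, fraction, seed

val collected = userInfo.collect()  // smaller result, no hang observed
{code}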

Our team investigated the logs from the master and the workers, and we found that the workers finished all their tasks, but the master couldn't retrieve the result from any task whose result was larger than 10 MB.

We applied the workaround of setting spark.akka.frameSize to 9, and it works like a charm.
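
Here is a minimal sketch of applying that setting, assuming a standalone driver program on Spark 0.9.x (the app name is a placeholder; in the shell we set it through the launch configuration instead):
{code}
// Minimal sketch (assumption: standalone app using the 0.9.x SparkConf API).
// spark.akka.frameSize is in MB; with 9, results above it go through the
// BlockManager instead of Akka, as described in the issue below.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("frame-size-workaround")  // placeholder name
  .set("spark.akka.frameSize", "9")
val sc = new SparkContext(conf)
{code}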

I guess it might be hard to reproduce this issue; please contact me if you need any further testing or logs.

Thanks!


> When spark.akka.frameSize > 10, task results bigger than 10MiB block execution
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-1112
>                 URL: https://issues.apache.org/jira/browse/SPARK-1112
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Guillaume Pitel
>            Priority: Blocker
>             Fix For: 0.9.2
>
>
> When I set spark.akka.frameSize to something over 10, a message sent from an executor to the driver completely blocks execution if it is bigger than 10 MiB and smaller than the frameSize (if it's above the frameSize, it's OK).
> The workaround is to set spark.akka.frameSize to 10. In this case, since 0.8.1, the BlockManager deals with the data to be sent. It seems slower than Akka direct messages, though.
> The configuration seems to be correctly read (see actorSystemConfig.txt), so I don't see where the 10 MiB limit could come from.



--
This message was sent by Atlassian JIRA
(v6.2#6252)