Posted to issues@spark.apache.org by "Ziv Huang (JIRA)" <ji...@apache.org> on 2014/09/25 17:08:34 UTC

[jira] [Issue Comment Deleted] (SPARK-3687) Spark hangs while processing more than 100 sequence files

     [ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ziv Huang updated SPARK-3687:
-----------------------------
    Comment: was deleted

(was: Just a few minutes ago I ran the job twice, processing 203 sequence files.
Both times the job hung, with different behavior than before:
1. the Spark master web UI shows that the job finished with state "failed" after 3.x minutes
2. the job stage web UI still hangs, and the execution duration keeps accumulating.
Hope this information helps debugging :))

> Spark hangs while processing more than 100 sequence files
> --------------------------------------------------------
>
>                 Key: SPARK-3687
>                 URL: https://issues.apache.org/jira/browse/SPARK-3687
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Ziv Huang
>
> In my application, I read more than 100 sequence files into a JavaPairRDD, apply flatMap to get a JavaRDD, and then use takeOrdered to get the result.
> Quite often (but not always), Spark hangs while executing some of the 120th-150th tasks.
> In 1.0.2, the job can hang for several hours, maybe forever (I can't wait for it to complete).
> When the job hangs, I can't kill it from the web UI.
> In 1.1.0, the job hangs for a couple of minutes (3.x minutes, to be exact),
> and then the Spark master web UI shows that the job finished with state "FAILED".
> In addition, the job stage web UI still hangs, and the execution duration keeps accumulating.
> In both 1.0.2 and 1.1.0, the job hangs with no error messages anywhere.
> The current workaround is to use coalesce to reduce the number of partitions to be processed (a minimal sketch follows below).
> I have never seen a job hang when the number of partitions to be processed is at most 100.
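
Below is a minimal Java sketch of the pipeline and the coalesce workaround described above, written against the Spark 1.x Java API (where a FlatMapFunction lambda returns an Iterable). The input path, the Text key/value classes, the whitespace-splitting flatMap logic, and the takeOrdered count are placeholder assumptions for illustration; the issue does not specify any of them.

    import java.util.Arrays;
    import java.util.List;

    import org.apache.hadoop.io.Text;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SequenceFileJob {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("SPARK-3687 sketch");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Read every sequence file under the input directory into one pair RDD.
            // The Text key/value classes are an assumption; the issue does not name them.
            JavaPairRDD<Text, Text> input =
                sc.sequenceFile("hdfs:///data/seqfiles", Text.class, Text.class);

            // Reported workaround: coalesce to at most 100 partitions before
            // processing, since the hang was never observed at 100 or fewer partitions.
            JavaPairRDD<Text, Text> coalesced = input.coalesce(100);

            // flatMap each record to zero or more strings (placeholder logic).
            // In Spark 1.x the FlatMapFunction lambda returns an Iterable.
            JavaRDD<String> values = coalesced.flatMap(
                kv -> Arrays.asList(kv._2().toString().split("\\s+")));

            // takeOrdered pulls the 10 smallest elements back to the driver.
            List<String> result = values.takeOrdered(10);
            for (String s : result) {
                System.out.println(s);
            }

            sc.stop();
        }
    }

To reproduce the reported hang, the coalesce step would be removed (or its partition cap raised above 100); with it in place, the job matches the reporter's working configuration.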



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org