Posted to issues@spark.apache.org by "Ran Haim (JIRA)" <ji...@apache.org> on 2018/09/26 08:12:00 UTC

[jira] [Updated] (SPARK-25527) Job stuck waiting for last stage to start

     [ https://issues.apache.org/jira/browse/SPARK-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ran Haim updated SPARK-25527:
-----------------------------
    Attachment: threaddumpjob.txt

> Job stuck waiting for last stage to start
> -----------------------------------------
>
>                 Key: SPARK-25527
>                 URL: https://issues.apache.org/jira/browse/SPARK-25527
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Ran Haim
>            Priority: Major
>         Attachments: threaddumpjob.txt
>
>
> Occasionally a job gets stuck waiting for its last stage to start.
> There are no tasks waiting for completion, and the job just hangs.
> Executors are available for the job to run on.
> I do not know how to reproduce this; all I know is that it happens randomly after a couple of days under heavy load.
> One detail that might help: it seems to happen when some tasks fail because one or more executors were killed (due to memory issues or something similar).
> Those tasks do eventually get finished by other executors through retries, but the next stage hangs.



