Posted to issues@spark.apache.org by "KaiXinXIaoLei (JIRA)" <ji...@apache.org> on 2015/07/16 14:58:05 UTC

[jira] [Comment Edited] (SPARK-9097) Tasks are not completed but the number of executor is zero

    [ https://issues.apache.org/jira/browse/SPARK-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629673#comment-14629673 ] 

KaiXinXIaoLei edited comment on SPARK-9097 at 7/16/15 12:57 PM:
----------------------------------------------------------------

I ran a big job. While its tasks were running, five tasks failed and the executors were then killed, even though many tasks still remained to run. The log shows:

2015-07-08 15:03:30,583 | WARN  | [sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1568.0 in stage 167.0 (TID 25557, linux-174): ExecutorLostFailure (executor 52 lost) 
2015-07-08 15:03:30,584 | WARN  | [sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1549.0 in stage 167.0 (TID 25538, linux-174): ExecutorLostFailure (executor 52 lost) 
2015-07-08 15:03:30,584 | WARN  | [sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1552.0 in stage 167.0 (TID 25541, linux-174): ExecutorLostFailure (executor 52 lost) 
2015-07-08 15:03:30,584 | WARN  | [sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1569.0 in stage 167.0 (TID 25558, linux-174): ExecutorLostFailure (executor 52 lost) 
2015-07-08 15:03:30,584 | WARN  | [sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1548.0 in stage 167.0 (TID 25537, linux-174): ExecutorLostFailure (executor 52 lost) 
2015-07-08 15:03:30,584 | INFO  | [dag-scheduler-event-loop] | Executor lost: 52 (epoch 29)
2015-07-08 15:03:30,584 | INFO  | [kill-executor-thread] | Requesting to kill executor(s) 52
2015-07-08 15:03:30,585 | INFO  | [sparkDriver-akka.actor.default-dispatcher-30] | Trying to remove executor 52 from BlockManagerMaster.
2015-07-08 15:03:30,585 | INFO  | [sparkDriver-akka.actor.default-dispatcher-30] | Removing block manager BlockManagerId(52, 9.91.8.174, 23424)
2015-07-08 15:03:30,585 | INFO  | [dag-scheduler-event-loop] | Removed 52 successfully in removeExecutor
2015-07-08 15:03:30,585 | INFO  | [dag-scheduler-event-loop] | Host added was in lost list earlier: hostname

After that, I cannot find anything in the log showing executors being added or the failed tasks being resubmitted. Thanks.
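
For anyone triaging a longer driver log, the lost tasks can be tallied per executor with a short script. This is only a sketch: the regular expression is inferred from the WARN lines in the excerpt above, and the function name lost_tasks_by_executor is made up for illustration. Other Spark versions or log layouts may format these messages differently.

```python
import re
from collections import Counter

# Pattern assumed from the WARN lines quoted above; it is not taken from
# Spark's source and may not match other versions or log4j layouts.
LOST_TASK = re.compile(
    r"Lost task (?P<task>\d+\.\d+) in stage (?P<stage>\d+\.\d+) "
    r"\(TID (?P<tid>\d+), (?P<host>[^)]+)\): "
    r"ExecutorLostFailure \(executor (?P<executor>\d+) lost\)"
)


def lost_tasks_by_executor(lines):
    """Count 'Lost task ... ExecutorLostFailure' warnings per (executor id, host)."""
    counts = Counter()
    for line in lines:
        match = LOST_TASK.search(line)
        if match:
            counts[(match.group("executor"), match.group("host"))] += 1
    return counts
```

Passing the lines of the driver log (for example, an open file object) returns a Counter keyed by executor id and host, which makes it easier to see whether all of the failures trace back to a single lost executor.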


> Tasks are not completed but the number of executor is zero
> ----------------------------------------------------------
>
>                 Key: SPARK-9097
>                 URL: https://issues.apache.org/jira/browse/SPARK-9097
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: KaiXinXIaoLei
>         Attachments: number of executor is zero.png, tasks are not completed.png
>
>
> I set "spark.dynamicAllocation.enabled" to true and submitted tasks to run. The tasks have not completed, but the number of executors is zero.
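
For readers reproducing the setup, the sketch below shows one way such settings could be passed to spark-submit. Only "spark.dynamicAllocation.enabled" comes from the report. The other two properties and their values are assumptions added for illustration, not the reporter's configuration, and the helper to_submit_args is hypothetical.

```python
# Only spark.dynamicAllocation.enabled is stated in the report; the other
# entries are illustrative assumptions, not the reporter's actual settings.
DYNAMIC_ALLOCATION_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    # Dynamic allocation relies on the external shuffle service.
    "spark.shuffle.service.enabled": "true",
    # Lower bound on the number of executors (assumed value).
    "spark.dynamicAllocation.minExecutors": "1",
}


def to_submit_args(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

Calling to_submit_args(DYNAMIC_ALLOCATION_CONF) yields a flat list of "--conf" and "key=value" pairs that can be appended to a spark-submit command line.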


