Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:23:21 UTC

[jira] [Updated] (SPARK-14890) DAGScheduler should not accept the result of a previous task attempt, since its stage attempt has been completed.

     [ https://issues.apache.org/jira/browse/SPARK-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-14890:
---------------------------------
    Labels: bulk-closed  (was: )

> DAGScheduler should not accept the result of a previous task attempt, since its stage attempt has been completed.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-14890
>                 URL: https://issues.apache.org/jira/browse/SPARK-14890
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.1
>         Environment: spark1.6.1 hadoop-2.6.0-cdh5.4.2
>            Reporter: yinqiang
>            Priority: Major
>              Labels: bulk-closed
>
> ......
> 16/04/14 17:07:28 INFO TaskSetManager: Starting task 109.0 in stage 46.0 (TID 18023, cnsz033569.app.paic.com.cn, partition 109,RACK_LOCAL, 2316 bytes)
> ......
> 16/04/14 17:08:32 WARN TaskSetManager: Lost task 109.0 in stage 46.0 (TID 18023, cnsz033569.app.paic.com.cn): ExecutorLostFailure (executor 23 exited caused by one of the running tasks) Reason: Container marked as failed: container_1460459369308_5903_01_000035 on host: cnsz033569.app.paic.com.cn. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
> ......
> 16/04/14 17:08:37 INFO TaskSetManager: Starting task 109.1 in stage 46.0 (TID 20237, cnsz033561.app.paic.com.cn, partition 109,RACK_LOCAL, 2316 bytes)
> ......
> 16/04/14 17:08:54 WARN TaskSetManager: Lost task 109.1 in stage 46.0 (TID 20237, cnsz033561.app.paic.com.cn): ExecutorLostFailure (executor 6 exited caused by one of the running tasks) Reason: Container marked as failed: container_1460459369308_5903_01_000007 on host: cnsz033561.app.paic.com.cn. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
> ......
> 16/04/14 17:09:38 INFO TaskSetManager: Starting task 109.2 in stage 46.0 (TID 21034, cnsz033580.app.paic.com.cn, partition 109,RACK_LOCAL, 2316 bytes)
> ......
> 16/04/14 17:10:41 INFO YarnScheduler: Removed TaskSet 46.0, whose tasks have all completed, from pool
> ......
> 16/04/14 17:10:41 INFO DAGScheduler: Ignoring possibly bogus ShuffleMapTask(46, 109) completion from executor 23
> ......
> 16/04/14 17:10:46 INFO TaskSetManager: Ignoring task-finished event for 109.1 in stage 46.0 because task 109 has already completed successfully
> 16/04/14 17:10:46 INFO DAGScheduler: Ignoring possibly bogus ShuffleMapTask(46, 109) completion from executor 6
> ......
> 16/04/14 17:10:47 INFO TaskSetManager: Ignoring task-finished event for 109.2 in stage 46.0 because task 109 has already completed successfully
> ......
> 16/04/14 17:10:47 ERROR DAGSchedulerEventProcessLoop: DAGSchedulerEventProcessLoop failed; shutting down SparkContext
> java.lang.IllegalStateException: more than one active taskSet for stage 46: 46.2,46.1
>         at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:173)
>         at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1052)
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1214)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1637)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
>         at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
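
The behaviour asked for in the summary (ignore a task result when the stage attempt it belongs to has already completed) might be sketched roughly as follows. This is a toy Python model and not Spark code: ToyStage, finish_attempt, new_attempt and handle_task_completion are hypothetical names chosen for illustration, and the report does not show how the actual DAGScheduler would implement such a check.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStage:
    """Hypothetical stand-in for a shuffle map stage; not Spark's Stage class."""
    stage_id: int
    latest_attempt: int = 0
    finished_attempts: set = field(default_factory=set)
    completed_partitions: set = field(default_factory=set)

    def finish_attempt(self, attempt: int) -> None:
        # Record that a stage attempt is over (for example, its task set was removed).
        self.finished_attempts.add(attempt)

    def new_attempt(self) -> int:
        # Resubmitting the stage closes the current attempt and opens the next one.
        self.finish_attempt(self.latest_attempt)
        self.latest_attempt += 1
        return self.latest_attempt

    def handle_task_completion(self, partition: int, stage_attempt: int) -> bool:
        """Return True if the result is accepted, False if it is ignored as stale."""
        if stage_attempt in self.finished_attempts:
            # Late result from a stage attempt that has already completed: drop it
            # instead of letting it influence scheduling decisions.
            return False
        self.completed_partitions.add(partition)
        return True


stage = ToyStage(stage_id=46)
accepted_first = stage.handle_task_completion(partition=1, stage_attempt=0)
stage.new_attempt()  # attempt 0 is now finished, attempt 1 is current
accepted_stale = stage.handle_task_completion(partition=109, stage_attempt=0)
accepted_current = stage.handle_task_completion(partition=109, stage_attempt=1)
```

In this model the late result for partition 109 from attempt 0 is dropped, while the same partition reported by the current attempt is accepted.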