Posted to issues@spark.apache.org by "SuYan (JIRA)" <ji...@apache.org> on 2017/07/04 08:52:00 UTC

[jira] [Created] (SPARK-21301) Should abort active taskSets or kill all running Tasks when that stage success.

SuYan created SPARK-21301:
-----------------------------

             Summary: Should abort active taskSets or kill all running Tasks when that stage success.
                 Key: SPARK-21301
                 URL: https://issues.apache.org/jira/browse/SPARK-21301
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.1
            Reporter: SuYan
            Priority: Minor


I am not sure whether this problem applies to the master branch, and I may have missed some logic when I looked into it.

stage 1 -> stage 2 -> stage 3

stage 1.0: done OK
stage 2.0: fetch failed; taskset 2.0 becomes a zombie but still has running tasks
stage 1.1: done OK
stage 2.1: running; taskset 2.1 is active
After some time...
For stage 2, pendingPartitions becomes empty and outputLocs becomes full,
but taskset 2.0 and taskset 2.1 are not marked as removed, so 2.1 is still active and 2.0 is still a zombie.

So stage 2 is marked as done OK.

Stage 3: fetch failed.
Stage 2 is resubmitted as taskset 2.2, which is active and conflicts with the still-active taskset 2.1.
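
To make the sequence above concrete, here is a minimal toy model in Python. It is not Spark code: ToyScheduler, TaskSet, submit, fetch_failed and replay are hypothetical names for illustration only, and the conflict check is a simplified assumption about how the scheduler rejects a second non-zombie taskset for the same stage.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSet:
    stage_id: int
    attempt: int
    is_zombie: bool = False   # a zombie taskset launches no new tasks
    running_tasks: int = 0    # tasks that are still running


@dataclass
class ToyScheduler:
    # stage id -> {attempt number -> TaskSet}; hypothetical bookkeeping
    task_sets: dict = field(default_factory=dict)

    def submit(self, stage_id, attempt, running_tasks=1):
        attempts = self.task_sets.setdefault(stage_id, {})
        # Assumed rule: at most one non-zombie taskset per stage.
        active = [ts.attempt for ts in attempts.values() if not ts.is_zombie]
        if active:
            raise RuntimeError(
                f"more than one active taskSet for stage {stage_id}: "
                f"attempt {attempt} conflicts with {active}"
            )
        ts = TaskSet(stage_id, attempt, running_tasks=running_tasks)
        attempts[attempt] = ts
        return ts

    def fetch_failed(self, stage_id, attempt):
        # The taskset becomes a zombie; its running tasks are left alone.
        self.task_sets[stage_id][attempt].is_zombie = True


def replay():
    sched = ToyScheduler()
    sched.submit(2, 0)          # taskset 2.0
    sched.fetch_failed(2, 0)    # 2.0 is now a zombie with running tasks
    sched.submit(2, 1)          # taskset 2.1 is active
    # Stage 2 is considered done here, but nothing retires 2.0 or 2.1.
    try:
        sched.submit(2, 2)      # stage 3 fetch failed, stage 2 resubmitted
    except RuntimeError as err:
        return str(err)
    return "no conflict"


print(replay())
```

In this sketch the third submit fails only because taskset 2.1 was never retired when stage 2 finished, which is the situation described above.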

I have the impression that there is a JIRA about not resubmitting tasks that are still running in a previous taskset? If that JIRA is resolved, this will be fixed too.

Alternatively, we could abort the running tasks that belong to stage 2, so that those tasksets finish correctly.
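
A rough sketch of that second option, again as a toy model in Python rather than Spark code. The on_stage_success hook, the has_active helper and the TaskSet fields are hypothetical; setting running_tasks to 0 merely stands in for killing the tasks.

```python
from dataclasses import dataclass


@dataclass
class TaskSet:
    stage_id: int
    attempt: int
    is_zombie: bool = False
    running_tasks: int = 0


def on_stage_success(task_sets, stage_id):
    """Hypothetical hook: retire every taskset of a stage that has finished."""
    retired = []
    for ts in task_sets:
        if ts.stage_id == stage_id:
            ts.is_zombie = True      # no new tasks will be launched
            ts.running_tasks = 0     # stands in for killing running tasks
            retired.append(ts.attempt)
    return retired


def has_active(task_sets, stage_id):
    # True if any taskset of the stage could still block a resubmission.
    return any(ts.stage_id == stage_id and not ts.is_zombie for ts in task_sets)


task_sets = [
    TaskSet(2, 0, is_zombie=True, running_tasks=3),   # zombie taskset 2.0
    TaskSet(2, 1, running_tasks=5),                   # active taskset 2.1
]
print(on_stage_success(task_sets, 2))
print(has_active(task_sets, 2))
```

With every taskset of stage 2 retired once the stage succeeds, a later resubmission as taskset 2.2 would find no active taskset to conflict with.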

[~irashid][~jinxing6042@126.com]


