Posted to issues@spark.apache.org by "liupengcheng (JIRA)" <ji...@apache.org> on 2019/01/16 07:10:00 UTC

[jira] [Created] (SPARK-26634) OutputCommitCoordinator may allow task of FetchFailureStage commit again

liupengcheng created SPARK-26634:
------------------------------------

             Summary: OutputCommitCoordinator may allow task of FetchFailureStage commit again
                 Key: SPARK-26634
                 URL: https://issues.apache.org/jira/browse/SPARK-26634
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.0, 2.1.0
            Reporter: liupengcheng


In our production Spark cluster, we encountered a case in which a task of a stage that was retried due to FetchFailure was denied permission to commit, even though the task was the first attempt in that retry stage.

After careful investigation, we found that a call to canCommit of OutputCommitCoordinator can allow a task of the stage attempt that hit FetchFailure (one with the same partition number as the new task of the retry stage) to commit. This results in TaskCommitDenied for all the tasks of the retry stage. This is a correctness bug.
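
To make the sequence of events concrete, here is a minimal toy model in Python. It is not Spark's actual OutputCommitCoordinator code. The class ToyCommitCoordinator, its can_commit method, and the stage and partition numbers are all hypothetical, and the model simply assumes that commit authorization is tracked per (stage id, partition) and shared by every attempt of that stage.

```python
class ToyCommitCoordinator:
    """Toy model only (assumption): one commit slot per (stage_id, partition),
    shared by every attempt of that stage."""

    def __init__(self):
        # (stage_id, partition) -> (stage_attempt, task_attempt) that holds the slot
        self._authorized = {}

    def can_commit(self, stage_id, stage_attempt, partition, task_attempt):
        key = (stage_id, partition)
        requester = (stage_attempt, task_attempt)
        holder = self._authorized.get(key)
        if holder is None:
            # First requester wins, even if it belongs to the failed stage attempt.
            self._authorized[key] = requester
            return True
        # Only the current holder may ask again and still be allowed.
        return holder == requester


coordinator = ToyCommitCoordinator()

# A task left over from stage attempt 0 (the attempt that hit FetchFailure)
# asks to commit partition 3 first.
zombie_allowed = coordinator.can_commit(
    stage_id=1, stage_attempt=0, partition=3, task_attempt=0)

# The first task attempt of the retry stage (stage attempt 1) then asks
# for the same partition.
retry_allowed = coordinator.can_commit(
    stage_id=1, stage_attempt=1, partition=3, task_attempt=0)

print(zombie_allowed, retry_allowed)  # True False
```

In this sketch the leftover task from the failed stage attempt takes the commit slot, so the first attempt of the retry stage is denied, which matches the symptom described above.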



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
