Posted to issues@spark.apache.org by "Andrew Or (JIRA)" <ji...@apache.org> on 2016/04/08 00:10:25 UTC

[jira] [Created] (SPARK-14468) Always enable OutputCommitCoordinator

Andrew Or created SPARK-14468:
---------------------------------

             Summary: Always enable OutputCommitCoordinator
                 Key: SPARK-14468
                 URL: https://issues.apache.org/jira/browse/SPARK-14468
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Andrew Or
            Assignee: Andrew Or


The OutputCommitCoordinator was originally introduced in SPARK-4879 because speculation causes the output of some partitions to be deleted. However, as we can see in SPARK-10063, speculation is not the only case where this can happen.

More specifically, when we retry a stage we're not guaranteed to kill the tasks that are still running (we don't even interrupt their threads), so we may end up with multiple concurrent task attempts for the same task. This leads to problems like SPARK-8029; this fix alone is necessary but not sufficient to address them.

In general, when we run into situations like these, we need the OutputCommitCoordinator because we don't control what the underlying file system does. Enabling it doesn't incur heavy performance costs, so there's little reason not to always enable it to ensure correctness.
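
To illustrate why concurrent attempts need an arbiter, here is a minimal sketch of a "first committer wins" rule. It is not Spark's actual OutputCommitCoordinator; the class name, the method names, and the (stage, partition, attempt) key are all assumptions made for illustration. The idea is only that the first attempt to ask for permission is authorized, and any other attempt for the same partition is denied unless the authorized one fails.

```python
import threading


class CommitCoordinator:
    """Hypothetical first-committer-wins arbiter (not Spark's real class)."""

    def __init__(self):
        self._lock = threading.Lock()
        # Maps (stage, partition) to the attempt number allowed to commit.
        self._authorized = {}

    def can_commit(self, stage, partition, attempt):
        """Return True only for the attempt that holds the commit permission."""
        with self._lock:
            # The first attempt to ask becomes the authorized committer.
            winner = self._authorized.setdefault((stage, partition), attempt)
            return winner == attempt

    def attempt_failed(self, stage, partition, attempt):
        """Release the permission if the authorized attempt failed."""
        with self._lock:
            if self._authorized.get((stage, partition)) == attempt:
                del self._authorized[(stage, partition)]


coordinator = CommitCoordinator()
# Two concurrent attempts for the same partition: only the first may commit.
first = coordinator.can_commit(stage=1, partition=0, attempt=0)
second = coordinator.can_commit(stage=1, partition=0, attempt=1)
```

Under these assumptions, a retried or speculative attempt that loses the race skips its commit instead of overwriting or deleting output that another attempt already wrote.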


