Posted to issues@spark.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/10/07 18:37:00 UTC

[jira] [Commented] (SPARK-33019) Use spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 by default

    [ https://issues.apache.org/jira/browse/SPARK-33019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209766#comment-17209766 ] 

Steve Loughran commented on SPARK-33019:
----------------------------------------

Related to this, I'm proposing we add a method that will let the MR engine and the Spark driver work out whether a committer can be recovered from, and choose how to react if the answer is "no": fail, or warn and commit another attempt.

That way, if you want full due diligence you can still use the v2 committer (or the EMR committer), but gain the ability to treat a failure during the commit phase as something that fails the job. Most of the time, it won't come to that.
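The proposal above could be sketched roughly as follows. This is purely illustrative: the interface, method, and enum names are hypothetical, not an actual Hadoop or Spark API, and the real design would live in the OutputCommitter class hierarchy.

```java
// Hypothetical sketch of the proposed recoverability probe.
// All names here are illustrative, not a real Hadoop/Spark API.
public class CommitterRecoverySketch {

    /** How an engine might be configured to react to a commit-phase failure. */
    enum OnCommitFailure { FAIL, WARN_AND_RETRY }

    /** The probe a committer would expose to the MR engine / Spark driver. */
    interface RecoveryAwareCommitter {
        /** Can a failed attempt be recovered from during the commit phase? */
        boolean isCommitRecoverable();
    }

    /** What the engine would do once it knows the committer's answer. */
    static String reactToCommitFailure(RecoveryAwareCommitter committer,
                                       OnCommitFailure policy) {
        if (committer.isCommitRecoverable()) {
            return "recover the attempt";
        }
        // Committer says "no": the engine chooses fail vs. warn-and-retry.
        return policy == OnCommitFailure.FAIL
            ? "fail the job"
            : "warn and commit another attempt";
    }

    public static void main(String[] args) {
        // A v2-style committer cannot be recovered from mid-commit.
        RecoveryAwareCommitter v2Style = () -> false;
        System.out.println(reactToCommitFailure(v2Style, OnCommitFailure.FAIL));
    }
}
```

The point of the probe is that the policy decision moves to the engine: a committer only reports whether recovery is safe, and the driver decides whether that is fatal.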


> Use spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 by default
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-33019
>                 URL: https://issues.apache.org/jira/browse/SPARK-33019
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Blocker
>              Labels: correctness
>             Fix For: 3.0.2, 3.1.0
>
>
> By default, Spark should use a safe file output committer algorithm to avoid MAPREDUCE-7282.
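For reference, the setting this issue makes the default can also be pinned explicitly per application; a minimal sketch (the application jar name is a placeholder):

```
# Pin the safe v1 commit algorithm for one Spark application.
# my-app.jar is a placeholder; the conf key is the one named in this issue.
spark-submit \
  --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 \
  my-app.jar
```

The same key can instead be set once in spark-defaults.conf to cover every job on a cluster.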



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org