Posted to issues@spark.apache.org by "Robert Kruszewski (JIRA)" <ji...@apache.org> on 2017/07/27 03:17:00 UTC

[jira] [Closed] (SPARK-21400) Spark shouldn't ignore user defined output committer in append mode

     [ https://issues.apache.org/jira/browse/SPARK-21400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kruszewski closed SPARK-21400.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

> Spark shouldn't ignore user defined output committer in append mode
> -------------------------------------------------------------------
>
>                 Key: SPARK-21400
>                 URL: https://issues.apache.org/jira/browse/SPARK-21400
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Robert Kruszewski
>             Fix For: 2.3.0
>
>
> In https://issues.apache.org/jira/browse/SPARK-8578 we decided to override user-defined output committers in append mode. The reasoning was that some output committers can lead to correctness issues. Since then we have removed DirectParquetOutputCommitter (the biggest known offender) from the codebase and rely on the default implementations.
> I believe we shouldn't restrict this anymore; users who override this configuration should be expected to have tested their committer for correctness. This unblocks the use of more sophisticated and performant output committers without the need to override file format implementations.
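
The behaviour change requested above can be illustrated with a small model. This is a minimal sketch in plain Python, not Spark's actual code: the names WriteJob, committer_before, and committer_after, and the user committer class com.example.MyCommitter, are hypothetical, and Hadoop's FileOutputCommitter stands in for whatever default a given file format uses.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for the default committer; the real default depends on the file format.
DEFAULT_COMMITTER = "org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter"


@dataclass(frozen=True)
class WriteJob:
    """Hypothetical description of a write, not a Spark class."""
    append_mode: bool               # True when the job appends to existing output
    user_committer: Optional[str]   # committer class configured by the user, if any


def committer_before(job: WriteJob) -> str:
    """Model of the SPARK-8578 behaviour: the user's committer is ignored in append mode."""
    if job.append_mode or job.user_committer is None:
        return DEFAULT_COMMITTER
    return job.user_committer


def committer_after(job: WriteJob) -> str:
    """Model of the behaviour requested here: the user's committer is always honoured."""
    return job.user_committer or DEFAULT_COMMITTER


if __name__ == "__main__":
    job = WriteJob(append_mode=True, user_committer="com.example.MyCommitter")
    print("before:", committer_before(job))
    print("after: ", committer_after(job))
```

Under these assumptions, the only case where the two functions differ is an append with a user-configured committer, which is exactly the case this issue is about.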



