Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/07/13 11:49:00 UTC

[jira] [Assigned] (SPARK-21400) Spark shouldn't ignore user defined output committer in append mode

     [ https://issues.apache.org/jira/browse/SPARK-21400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-21400:
------------------------------------

    Assignee: Apache Spark

> Spark shouldn't ignore user defined output committer in append mode
> -------------------------------------------------------------------
>
>                 Key: SPARK-21400
>                 URL: https://issues.apache.org/jira/browse/SPARK-21400
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Robert Kruszewski
>            Assignee: Apache Spark
>
> In https://issues.apache.org/jira/browse/SPARK-8578 we decided to override user-defined output committers in append mode. The reasoning was that some output committers can lead to correctness issues. Since then we have removed DirectParquetOutputCommitter (the biggest known offender) from the codebase and rely on the default implementations.
> I believe we shouldn't restrict this anymore; users who override this configuration should be expected to have tested their committer for correctness. This unblocks using more sophisticated and performant output committers without needing to override file format implementations.
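
The behaviour change proposed above can be sketched as a small decision function. This is a simplified model written for illustration, not Spark's actual code: the function name `choose_committer`, the flag `honor_user_in_append`, and the committer names are all assumptions introduced here.

```python
from typing import Optional

# Stand-in name for whatever committer the file format uses by default (assumed).
DEFAULT_COMMITTER = "DefaultFormatCommitter"


def choose_committer(user_committer: Optional[str],
                     is_append: bool,
                     honor_user_in_append: bool) -> str:
    """Return the name of the committer a write would use in this model.

    honor_user_in_append=False models the restriction from SPARK-8578;
    honor_user_in_append=True models the behaviour this issue asks for.
    """
    if user_committer is None:
        # Nothing configured by the user: always fall back to the default.
        return DEFAULT_COMMITTER
    if is_append and not honor_user_in_append:
        # Restricted behaviour: the user's committer is ignored on append.
        return DEFAULT_COMMITTER
    return user_committer


# Hypothetical committer class name, used only for illustration.
custom = "com.example.MyCommitter"
restricted = choose_committer(custom, is_append=True, honor_user_in_append=False)
proposed = choose_committer(custom, is_append=True, honor_user_in_append=True)
```

In this model, `restricted` falls back to the default committer while `proposed` keeps the user's choice. Writes that are not appends use the configured committer in both cases.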



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
