Posted to issues@spark.apache.org by "rameshkrishnan muthusamy (Jira)" <ji...@apache.org> on 2020/03/30 09:31:00 UTC

[jira] [Commented] (SPARK-27093) Honor ParseMode in AvroFileFormat

    [ https://issues.apache.org/jira/browse/SPARK-27093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070832#comment-17070832 ] 

rameshkrishnan muthusamy commented on SPARK-27093:
--------------------------------------------------

Is anyone working on this? I see that the PR has been closed.

> Honor ParseMode in AvroFileFormat
> ---------------------------------
>
>                 Key: SPARK-27093
>                 URL: https://issues.apache.org/jira/browse/SPARK-27093
>             Project: Spark
>          Issue Type: Improvement
>          Components: Input/Output
>    Affects Versions: 3.1.0
>            Reporter: Tim Cerexhe
>            Priority: Major
>
> The Avro reader lacks the JSON reader's ability to handle malformed or truncated files. Currently it throws an exception when it encounters any bad or truncated record in an Avro file, so a single dodgy file causes the entire Spark job to fail.
> Ideally the AvroFileFormat would accept a Permissive or DropMalformed ParseMode, as Spark's JSON format does. This would enable the Avro reader to drop bad records and continue processing the good records rather than abort the entire job.
> The default could remain FailFastMode, which is the current effective behavior, so this wouldn’t break any existing users.
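
For readers unfamiliar with the three modes named in the description, here is a minimal sketch of the behavior being requested. It is plain Python, not Spark code and not the closed PR: `read_with_mode` and the `decode` callback are hypothetical names, `int` stands in for a real Avro decoder, and yielding `None` in permissive mode is a simplification of how Spark's JSON reader handles a malformed row (nulls for the fields it could not parse).

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")

# Mode names mirror the ones Spark's JSON reader accepts.
FAILFAST = "FAILFAST"
DROPMALFORMED = "DROPMALFORMED"
PERMISSIVE = "PERMISSIVE"


def read_with_mode(
    raw_records: Iterable[bytes],
    decode: Callable[[bytes], T],
    mode: str = FAILFAST,
) -> Iterator[Optional[T]]:
    """Hypothetical helper: decode each record, handling failures per mode."""
    if mode not in (FAILFAST, DROPMALFORMED, PERMISSIVE):
        raise ValueError(f"unknown parse mode: {mode}")
    for raw in raw_records:
        try:
            record = decode(raw)
        except Exception:
            if mode == FAILFAST:
                # Current effective behavior: one bad record aborts the read.
                raise
            if mode == DROPMALFORMED:
                # Skip the bad record and keep going.
                continue
            # PERMISSIVE: keep a placeholder row for the bad record.
            yield None
        else:
            yield record


if __name__ == "__main__":
    raw = [b"1", b"oops", b"3"]  # the middle record cannot be decoded
    print(list(read_with_mode(raw, int, DROPMALFORMED)))  # [1, 3]
    print(list(read_with_mode(raw, int, PERMISSIVE)))     # [1, None, 3]
```

Keeping FAILFAST as the default in the sketch matches the point in the description that existing users would see no change unless they opt in to another mode.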



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
