Posted to issues@spark.apache.org by "tonydoen (Jira)" <ji...@apache.org> on 2022/03/23 19:29:00 UTC
[jira] [Created] (SPARK-38639) Support ignoreCorruptRecord flag parallel to ignoreCorruptFiles
tonydoen created SPARK-38639:
--------------------------------
Summary: Support ignoreCorruptRecord flag parallel to ignoreCorruptFiles
Key: SPARK-38639
URL: https://issues.apache.org/jira/browse/SPARK-38639
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.2.1, 3.1.2
Reporter: tonydoen
Fix For: 3.2.1
The existing flags "spark.sql.files.ignoreCorruptFiles" and "spark.sql.files.ignoreMissingFiles" quietly skip attempted reads from files that are corrupted or missing, but they still allow a query over sequence files to fail.
Being able to ignore corrupt records is useful when users want a query to succeed against dirty data (for example, mixed schemas in one table).
We would like to add a "spark.sql.hive.ignoreCorruptRecord" flag to fill this gap.
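
To make the intended semantics concrete, here is a minimal plain-Python sketch of record-level skipping. It is not Spark code and not a proposed implementation: the names scan, decode_record, CorruptRecordError and the ignore_corrupt_record parameter are hypothetical, and "corrupt" is assumed here to mean a record that is not valid JSON or that lacks a required field (a stand-in for a schema mismatch).

```python
import json
from typing import Dict, Iterable, Iterator, Tuple


class CorruptRecordError(ValueError):
    """Hypothetical error: one record could not be decoded as expected."""


def decode_record(raw: str, required_fields: Tuple[str, ...]) -> Dict:
    # Assumption: a record is "corrupt" if it is not a JSON object
    # or if it is missing any of the required fields.
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise CorruptRecordError(f"undecodable record: {raw!r}") from exc
    if not isinstance(record, dict) or any(f not in record for f in required_fields):
        raise CorruptRecordError(f"schema mismatch: {raw!r}")
    return record


def scan(
    raw_records: Iterable[str],
    required_fields: Tuple[str, ...],
    ignore_corrupt_record: bool = False,
) -> Iterator[Dict]:
    # With the flag off, the first bad record fails the whole scan.
    # With the flag on, bad records are dropped and the scan continues.
    for raw in raw_records:
        try:
            yield decode_record(raw, required_fields)
        except CorruptRecordError:
            if not ignore_corrupt_record:
                raise
```

With the flag off, a single bad record aborts the whole scan, which is how the failure described above looks to the user. With the flag on, only the records that decode cleanly are returned, so the caller gets a partial result and no error.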
--
This message was sent by Atlassian Jira
(v8.20.1#820001)