Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2023/10/26 05:44:00 UTC
[jira] [Created] (SPARK-45671) Implement an option similar to corrupt record column in State Data Source Reader
Jungtaek Lim created SPARK-45671:
------------------------------------
Summary: Implement an option similar to corrupt record column in State Data Source Reader
Key: SPARK-45671
URL: https://issues.apache.org/jira/browse/SPARK-45671
Project: Spark
Issue Type: Sub-task
Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Jungtaek Lim
Querying the state will most likely fail if the underlying state file is corrupted. Another possible case is that the raw binary data the state store reads from the state file does not fit the state schema, ending up with an exception or fatal error at runtime.
(We cannot catch the case where the data is loaded with an incorrect schema if it does not throw an exception, since we cannot attach a schema to every piece of data.)
To handle the above cases without failing the query, we want to provide state rows for valid entries while also providing the raw binary data for corrupted rows (as we do for CSV/JSON), if users specify an option.
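To illustrate the semantics being proposed, here is a minimal pure-Python sketch of how a "corrupt record column" behaves in the CSV/JSON readers' PERMISSIVE mode: rows that parse cleanly yield typed fields with a null corrupt column, while rows that fail to parse are kept, with the raw payload placed in a dedicated column instead of failing the whole query. This is not Spark code; the function and column names are illustrative only.

```python
# Sketch of "corrupt record column" semantics, analogous to PERMISSIVE
# mode in Spark's CSV/JSON readers. All names are illustrative.
import json

def read_with_corrupt_column(raw_rows, corrupt_col="_corrupt_record"):
    """Parse each raw record; on failure, keep the raw data under corrupt_col."""
    out = []
    for raw in raw_rows:
        try:
            rec = json.loads(raw)
            rec[corrupt_col] = None          # valid row: corrupt column is null
        except (ValueError, TypeError):
            rec = {corrupt_col: raw}         # corrupted row: preserve raw payload
        out.append(rec)
    return out

rows = read_with_corrupt_column(['{"key": 1, "value": "a"}', '\x00\x01garbled'])
```

The option proposed for the State Data Source Reader would behave analogously: valid state rows come back decoded against the state schema, and rows whose binary payload cannot be decoded surface their raw bytes in a corrupt-record column rather than aborting the read.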
--
This message was sent by Atlassian Jira
(v8.20.10#820010)