Posted to issues@spark.apache.org by "Denis Bolshakov (JIRA)" <ji...@apache.org> on 2018/08/16 06:29:00 UTC

[jira] [Commented] (SPARK-23194) from_json in FAILFAST mode doesn't fail fast, instead it just returns nulls

    [ https://issues.apache.org/jira/browse/SPARK-23194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582025#comment-16582025 ] 

Denis Bolshakov commented on SPARK-23194:
-----------------------------------------

[~cloud_fan], [~hyukjin.kwon], do you have any updates on this?

 

The Javadoc for the `options` parameter of from_json says:
{code:java}
@param options options to control how the json is parsed. accepts the same options and the
*                json data source.
{code}

That is not entirely accurate: from_json does not support the `columnNameOfCorruptRecord` and `mode` options.
The `mode` option is not supported because it is overridden in the source code, so the user's value is simply ignored.
`columnNameOfCorruptRecord` is not supported because there is no way to set PERMISSIVE mode.

See:
http://apache-spark-user-list.1001560.n3.nabble.com/from-json-function-td33209.html
and
https://github.com/apache/spark/blob/e2ab7deae76d3b6f41b9ad4d0ece14ea28db40ce/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L568
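
To make the two effects concrete, here is a small stand-alone model in plain Java. It is only a sketch of the behaviour described in this comment and in the issue description, not Spark's actual code: the class `FromJsonOptionsModel`, its methods, and the toy "parser" are all hypothetical names invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the reported behaviour. NOT Spark code:
// every name in this class is hypothetical.
public class FromJsonOptionsModel {

    // Models the override: the hard-coded "mode" entry is written last,
    // so whatever the caller passed for "mode" is discarded.
    static Map<String, String> effectiveOptions(Map<String, String> userOptions) {
        Map<String, String> merged = new HashMap<>(userOptions);
        merged.put("mode", "FAILFAST");
        return merged;
    }

    // Toy stand-in for a fail-fast parser: anything that does not look
    // like a JSON object is treated as malformed and rejected.
    static String parseFailFast(String json) {
        String trimmed = json.trim();
        if (!(trimmed.startsWith("{") && trimmed.endsWith("}"))) {
            throw new IllegalArgumentException("malformed record: " + json);
        }
        return trimmed;
    }

    // Models the issue description: the fail-fast error is caught
    // and turned into null instead of reaching the caller.
    static String fromJsonModel(String json) {
        try {
            return parseFailFast(json);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Map<String, String> user = new HashMap<>();
        user.put("mode", "PERMISSIVE");
        user.put("columnNameOfCorruptRecord", "_corrupt");

        System.out.println(effectiveOptions(user).get("mode")); // prints FAILFAST
        System.out.println(fromJsonModel("not json"));          // prints null
    }
}
```

In this model the caller asks for PERMISSIVE but always gets FAILFAST, and a malformed record still yields null rather than an error, which matches the combination reported in this ticket.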

It would be very nice to fix this, or at least to document clearly which options the from_json function actually supports.

Kind regards,
Denis

> from_json in FAILFAST mode doesn't fail fast, instead it just returns nulls
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-23194
>                 URL: https://issues.apache.org/jira/browse/SPARK-23194
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Burak Yavuz
>            Priority: Major
>
> from_json accepts Json parsing options such as being PERMISSIVE to parsing errors or failing fast. It seems from the code that even though the default option is to fail fast, we catch that exception and return nulls.
>  
> In order to not change behavior, we should remove that try-catch block and change the default to permissive, but allow failfast mode to indeed fail.


