Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/12/07 10:12:00 UTC

[jira] [Assigned] (SPARK-26303) Return partial results for bad JSON records

     [ https://issues.apache.org/jira/browse/SPARK-26303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-26303:
------------------------------------

    Assignee:     (was: Apache Spark)

> Return partial results for bad JSON records
> -------------------------------------------
>
>                 Key: SPARK-26303
>                 URL: https://issues.apache.org/jira/browse/SPARK-26303
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Maxim Gekk
>            Priority: Minor
>
> Currently, the JSON datasource and JSON functions return a row with all fields set to null for a malformed JSON string in PERMISSIVE mode when the specified schema has a struct type. All nulls are returned even if some of the fields were parsed and converted to the desired types successfully. This ticket aims to solve the problem by returning the already-parsed fields. The corrupt-record column, specified via the JSON option `columnNameOfCorruptRecord` or the corresponding SQL config, should contain the whole original JSON string.
> For example, if the input has one JSON string:
> {code:json}
> {"a":0.1,"b":{},"c":"def"}
> {code}
> and specified schema is:
> {code:sql}
> a DOUBLE, b ARRAY<INT>, c STRING, _corrupt_record STRING
> {code}
> the expected output of `from_json` in PERMISSIVE mode is:
> {code}
> +---+----+---+--------------------------+
> |a  |b   |c  |_corrupt_record           |
> +---+----+---+--------------------------+
> |0.1|null|def|{"a":0.1,"b":{},"c":"def"}|
> +---+----+---+--------------------------+
> {code}
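
The behavior requested above can be sketched in plain Python without a Spark runtime. This is a minimal, hypothetical illustration of the proposed partial-result semantics, not Spark's implementation: fields that convert successfully keep their values, a field that fails conversion (here, `b`, because `{}` is not an array) becomes null, and the whole original string lands in `_corrupt_record`. The function and converter names are invented for this sketch.

```python
import json

def to_int_array(v):
    """Converter standing in for the ARRAY<INT> type in the schema."""
    if not isinstance(v, list):
        raise TypeError("not an array")
    return [int(x) for x in v]

def parse_permissive(json_str, schema):
    """Sketch of PERMISSIVE-mode parsing with partial results.

    schema: mapping of field name -> converter callable.
    Returns a dict row; failed fields are None, and _corrupt_record
    holds the whole original string whenever anything failed.
    """
    row = {name: None for name in schema}
    row["_corrupt_record"] = None
    try:
        data = json.loads(json_str)
    except ValueError:
        row["_corrupt_record"] = json_str  # entirely unparseable
        return row
    for name, convert in schema.items():
        try:
            row[name] = convert(data[name])
        except (KeyError, TypeError, ValueError):
            row[name] = None
            row["_corrupt_record"] = json_str  # partially bad record
    return row

schema = {"a": float, "b": to_int_array, "c": str}
row = parse_permissive('{"a":0.1,"b":{},"c":"def"}', schema)
# a and c parse; b fails and stays None; _corrupt_record keeps the input.
```

Under the current (pre-SPARK-26303) behavior, the analogous result would instead have all of `a`, `b`, and `c` null.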



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org