Posted to issues@spark.apache.org by "Jingwei Lu (JIRA)" <ji...@apache.org> on 2016/03/08 21:16:40 UTC

[jira] [Updated] (SPARK-13752) JSON array type parsing error

     [ https://issues.apache.org/jira/browse/SPARK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingwei Lu updated SPARK-13752:
-------------------------------
    Attachment: sparkissue.scala

This is a repro case. 

> JSON array type parsing error
> -----------------------------
>
>                 Key: SPARK-13752
>                 URL: https://issues.apache.org/jira/browse/SPARK-13752
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.2
>            Reporter: Jingwei Lu
>         Attachments: sparkissue.scala
>
>
> Due to SPARK-3308, the Spark SQL JSON parser is not able to handle an invalid payload field in air_events. This is how the payload schema is defined:
>                         }, {
>                             "name" : "payload",
>                             "type" : {
>                                 "type" : "array",
>                                 "elementType" : {
>                                     "type" : "struct",
>                                     "fields" : [ {
>                                         "name" : "type",
>                                         "type" : "string",
>                                         "nullable" : true,
>                                       }, {
>                                         "name" : "name",
>                                         "type" : "string",
>                                         "nullable" : true,
>                                       }, {
>                                         "name" : "duration",
>                                         "type" : "string",
>                                         "nullable" : true,
>                                       } ]
>                                 },
>                             "containsNull" : false
>                           },
>                           "nullable" : true,
>                         } ]
> Some invalid payloads, for example
> "payload":[[],[],[],[],[]] or "payload":[[[js, ...], []], pass the schema validation and generate rows. However, those rows are not compatible with Spark SQL when it tries to access them in a filter, and Spark throws an internal ClassCastException.
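
To make the mismatch concrete, here is a minimal sketch in plain Python. It is not Spark code and is not taken from the attached sparkissue.scala. It only checks the top-level shape that the schema above declares for payload (an array whose elements are structs, with containsNull false). The function name payload_matches_schema is hypothetical, as is the idea of pre-checking records this way.

```python
import json


def payload_matches_schema(record_json):
    """Check the shape of the "payload" field of one JSON record.

    Hypothetical helper, not part of Spark. The schema declares payload as
    a nullable array of structs with containsNull = false, so every element
    must be a JSON object. Field values inside each object are not checked.
    """
    record = json.loads(record_json)
    payload = record.get("payload")
    if payload is None:
        # The payload field itself is declared nullable.
        return True
    if not isinstance(payload, list):
        return False
    # Each element must be an object (struct); arrays and nulls do not fit.
    return all(isinstance(element, dict) for element in payload)


if __name__ == "__main__":
    good = '{"payload":[{"type":"t","name":"n","duration":"1"}]}'
    bad = '{"payload":[[],[],[],[],[]]}'
    print(payload_matches_schema(good))
    print(payload_matches_schema(bad))
```

Under these assumptions, the first example payload from the report fails the check because its elements are arrays where the schema expects structs. This is the kind of row that later fails when a filter reads the struct fields.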


