Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/09/06 08:59:21 UTC

[jira] [Commented] (DRILL-4824) JSON with complex nested data produces incorrect output with missing fields

    [ https://issues.apache.org/jira/browse/DRILL-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466876#comment-15466876 ] 

ASF GitHub Bot commented on DRILL-4824:
---------------------------------------

GitHub user KulykRoman opened a pull request:

    https://github.com/apache/drill/pull/580

    DRILL-4824: JSON with complex nested data produces incorrect output w…

    …ith missing fields

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KulykRoman/drill DRILL-4824

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/580.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #580
    
----
commit 43919c0f8e447a8d76d8726fe72cd3c10edf78b4
Author: Roman Kulyk <ro...@gmail.com>
Date:   2016-08-03T12:08:40Z

    DRILL-4824: JSON with complex nested data produces incorrect output with missing fields
    
    - Added changes to skip empty Lists or Maps.
    - Changed TestJsonReader.testSplitAndTransferFailure() and TestJsonReader.testFieldSelectionBug() according to new output logic.

----


> JSON with complex nested data produces incorrect output with missing fields
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-4824
>                 URL: https://issues.apache.org/jira/browse/DRILL-4824
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>    Affects Versions: 1.0.0
>            Reporter: Roman
>            Assignee: Roman
>             Fix For: 1.9.0
>
>
> Drill produces incorrect output when querying a JSON file with complex nested data.
> _JSON:_
> {code:none|title=example.json|borderStyle=solid}
> {
>         "Field1" : {
>         }
> }
> {
>         "Field1" : {
>                 "InnerField1": {"key1":"value1"},
>                 "InnerField2": {"key2":"value2"}
>         }
> }
> {
>         "Field1" : {
>                 "InnerField3" : ["value3", "value4"],
>                 "InnerField4" : ["value5", "value6"]
>         }
> }
> {code}
> _Query:_
> {code:sql}
> select Field1 from dfs.`/tmp/example.json`
> {code}
> _Incorrect result:_
> {code:none}
> +---------------------------+
> |          Field1           |
> +---------------------------+
> {"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{},"InnerField2":{},"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +---------------------------+
> {code}
> There is no need to output missing fields. For a deeply nested structure, the result becomes unreadable for the user.
> _Correct result:_
> {code:none}
> +--------------------------+
> |         Field1           |
> +--------------------------+
> {}
> {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"}}
> {"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--------------------------+
> {code}
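
The difference between the incorrect and the correct result above can be illustrated with a small standalone sketch. This is not Drill's code and not the change made in the pull request; it is a hypothetical Python helper (prune_empty is an invented name) that assumes the desired rule is "drop any map entry whose value is an empty map or an empty list, but keep the top-level value even if it ends up empty". It is applied to the three rows of the incorrect result, with the missing colon in the second row restored.

```python
import json


def prune_empty(value):
    """Recursively drop map entries whose value is an empty map or empty list.

    Hypothetical helper for illustration only; it is not part of Drill.
    """
    if isinstance(value, dict):
        pruned = {}
        for key, child in value.items():
            child = prune_empty(child)
            # Skip containers that are empty after pruning; scalars are kept.
            if isinstance(child, (dict, list)) and not child:
                continue
            pruned[key] = child
        return pruned
    if isinstance(value, list):
        # List elements are pruned in place but never removed from the list.
        return [prune_empty(item) for item in value]
    return value


# The three Field1 values from the "Incorrect result" table.
incorrect_rows = [
    {"InnerField1": {}, "InnerField2": {}, "InnerField3": [], "InnerField4": []},
    {"InnerField1": {"key1": "value1"}, "InnerField2": {"key2": "value2"},
     "InnerField3": [], "InnerField4": []},
    {"InnerField1": {}, "InnerField2": {},
     "InnerField3": ["value3", "value4"], "InnerField4": ["value5", "value6"]},
]

for row in incorrect_rows:
    print(json.dumps(prune_empty(row), separators=(",", ":")))
```

Under that assumption, the printed rows match the three rows of the correct result: the first row collapses to {}, and the other two keep only the fields present in their source records.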



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)