You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2014/07/26 04:28:39 UTC

[jira] [Updated] (DRILL-1195) Querying nested array elements in Json returns only null values

     [ https://issues.apache.org/jira/browse/DRILL-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-1195:
----------------------------------

    Assignee: Aditya Kishore

> Querying nested array elements in Json returns only null values
> ---------------------------------------------------------------
>
>                 Key: DRILL-1195
>                 URL: https://issues.apache.org/jira/browse/DRILL-1195
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>            Reporter: Amit Katti
>            Assignee: Aditya Kishore
>             Fix For: 0.4.0
>
>
> If a Json obejct has an array which is present at the 1st(uppermost) level it can be queried in Drill successfully as follows
> {code}
> {"rownum":1, "arrayval": [ "a1", "a2", "a3" ]}
> {"rownum":2, "arrayval": [ "b1", "b2", "b3" ]}
> {"rownum":3, "arrayval": [ "c1", "c2", "c3" ]}
> {"rownum":4, "arrayval": [ "d1", "d2", "d3" ]}
> {"rownum":5, "arrayval": [ "e1", "e2", "e3" ]}
> select tbl.arrayval[0] from `nested_working.json` tbl;
> +------------+
> |   EXPR$0   |
> +------------+
> | a1         |
> | b1         |
> | c1         |
> | d1         |
> | e1         |
> +------------+
> 5 rows selected (0.157 seconds)
> {code}
> However if the array is present in the 2nd level or below (inner level), directly querying the array's elements throws only null values. 
> {code}
> {"rownum":1, "a":{"arrayval": [ "a1", "a2", "a3" ]}}
> {"rownum":2, "a":{"arrayval": [ "b1", "b2", "b3" ]}}
> {"rownum":3, "a":{"arrayval": [ "c1", "c2", "c3" ]}}
> {"rownum":4, "a":{"arrayval": [ "d1", "d2", "d3" ]}}
> {"rownum":5, "a":{"arrayval": [ "e1", "e2", "e3" ]}}
> select tbl.a.arrayval[0] from `nested_failing.json` tbl;
> +------------+
> |   EXPR$0   |
> +------------+
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> +------------+
> 5 rows selected (0.144 seconds)
> {code}
> Select * as well as selecting the whole array works fine though.
> {code}
> select tbl.a.arrayval from `/user/root/nested_failing.json` tbl;
> +------------+
> |   EXPR$0   |
> +------------+
> | ["a1","a2","a3"] |
> | ["b1","b2","b3"] |
> | ["c1","c2","c3"] |
> | ["d1","d2","d3"] |
> | ["e1","e2","e3"] |
> +------------+
> 5 rows selected (0.151 seconds)
> {code}
> On the other hand if the array's elements are json objects again, we are able to successfully query them. The issue seems to be centric only to single element objects.
> {code}
> {"rownum":1, "a":{"arrayval": [ {"val1":"a1"}, {"val2":"a2"}, {"val3":"a3"} ]}}
> {"rownum":2, "a":{"arrayval": [ {"val1":"b1"}, {"val2":"b2"}, {"val3":"b3"} ]}}
> {"rownum":3, "a":{"arrayval": [ {"val1":"c1"}, {"val2":"c2"}, {"val3":"c3"} ]}}
> {"rownum":4, "a":{"arrayval": [ {"val1":"d1"}, {"val2":"d2"}, {"val3":"d3"} ]}}
> {"rownum":5, "a":{"arrayval": [ {"val1":"e1"}, {"val2":"e2"}, {"val3":"e3"} ]}}
> select tbl.a.arrayval[0] from `nested_working2.json` tbl;
> +------------+
> |   EXPR$0   |
> +------------+
> | {"val1":"a1"} |
> | {"val1":"b1"} |
> | {"val1":"c1"} |
> | {"val1":"d1"} |
> | {"val1":"e1"} |
> +------------+
> 5 rows selected (0.186 seconds)
> select tbl.a.arrayval[0].val1 from `nested_working2.json` tbl;
> +------------+
> |   EXPR$0   |
> +------------+
> | a1         |
> | b1         |
> | c1         |
> | d1         |
> | e1         |
> +------------+
> 5 rows selected (0.157 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)