You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Amit Katti (JIRA)" <ji...@apache.org> on 2014/07/25 07:58:38 UTC
[jira] [Commented] (DRILL-1151) JSON nested data returns empty for
all rows except the first when using selections
[ https://issues.apache.org/jira/browse/DRILL-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074113#comment-14074113 ]
Amit Katti commented on DRILL-1151:
-----------------------------------
Managed to narrow down the cause of the Issue:
=========================================
While querying a nested Json which has an array of Json objects, if the query has a LIMIT keyword, all rows after the 1st one return an empty array
{code}
{"rownum":1, "arrayval": [ {"val1":"a1"}, {"val2":"a2"}, {"val3":"a3"} ]}
{"rownum":2, "arrayval": [ {"val1":"b1"}, {"val2":"b2"}, {"val3":"b3"} ]}
{"rownum":3, "arrayval": [ {"val1":"c1"}, {"val2":"c2"}, {"val3":"c3"} ]}
{"rownum":4, "arrayval": [ {"val1":"d1"}, {"val2":"d2"}, {"val3":"d3"} ]}
{"rownum":5, "arrayval": [ {"val1":"e1"}, {"val2":"e2"}, {"val3":"e3"} ]}
select * from `failing.json`;
+------------+------------+
| rownum | arrayval |
+------------+------------+
| 1 | [{"val1":"a1"},{"val2":"a2"},{"val3":"a3"}] |
| 2 | [{"val1":"b1"},{"val2":"b2"},{"val3":"b3"}] |
| 3 | [{"val1":"c1"},{"val2":"c2"},{"val3":"c3"}] |
| 4 | [{"val1":"d1"},{"val2":"d2"},{"val3":"d3"}] |
| 5 | [{"val1":"e1"},{"val2":"e2"},{"val3":"e3"}] |
+------------+------------+
5 rows selected (0.205 seconds)
select * from `failing.json` limit 3;
+------------+------------+
| rownum | arrayval |
+------------+------------+
| 1 | [{"val1":"a1"},{"val2":"a2"},{"val3":"a3"}] |
| 2 | [] |
| 3 | [] |
+------------+------------+
3 rows selected (0.137 seconds)
{code}
However if the array's element are simple values (primitive) instead of Json objects, the query with LIMIT works perfectly fine.
{code}
{"rownum":1, "arrayval": [ "a1", "a2", "a3"]}
{"rownum":2, "arrayval": [ "b1", "b2", "b3"]}
{"rownum":3, "arrayval": [ "c1", "c2", "c3"]}
{"rownum":4, "arrayval": [ "d1", "d2", "d3"]}
{"rownum":5, "arrayval": [ "e1", "e2", "e3"]}
select * from `working.json`;
+------------+------------+
| rownum | arrayval |
+------------+------------+
| 1 | ["a1","a2","a3"] |
| 2 | ["b1","b2","b3"] |
| 3 | ["c1","c2","c3"] |
| 4 | ["d1","d2","d3"] |
| 5 | ["e1","e2","e3"] |
+------------+------------+
5 rows selected (0.145 seconds)
select * from `working.json` limit 3;
+------------+------------+
| rownum | arrayval |
+------------+------------+
| 1 | ["a1","a2","a3"] |
| 2 | ["b1","b2","b3"] |
| 3 | ["c1","c2","c3"] |
+------------+------------+
3 rows selected (0.141 seconds)
{code}
> JSON nested data returns empty for all rows except the first when using selections
> ----------------------------------------------------------------------------------
>
> Key: DRILL-1151
> URL: https://issues.apache.org/jira/browse/DRILL-1151
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Norris Lee
> Assignee: Jinfeng Ni
> Priority: Critical
>
> Without the limit clause, everything gets returned:
> {code}
> 0: jdbc:drill:zk=localhost:5181> SELECT * FROM `dfs`.`optdrill`.`./raw-files/donuts/moredonuts.json`;
> +------------+------------+------------+------------+------------+------------+------------+------------+
> | id | type | name | ppu | sales | batters | topping | filling |
> +------------+------------+------------+------------+------------+------------+------------+------------+
> | 0001 | donut | Cake | 0.55 | 35 | {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"100 |
> | 0002 | donut | Raised | 0.69 | 145 | {"batter":[{"id":"1001","type":"Regular"}]} | [{"id":"5001","type":"None"},{"id":"500 |
> | 0003 | donut | Old Fashioned | 0.55 | 300 | {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}]} | [{" |
> | 0004 | donut | Filled | 0.69 | 14 | {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"100 |
> | 0005 | donut | Apple Fritter | 1.0 | 700 | {"batter":[{"id":"1001","type":"Regular"}]} | [{"id":"5002","type":"Glazed"}] | [] |
> +------------+------------+------------+------------+------------+------------+------------+------------+
> 5 rows selected (0.785 seconds)
> {code}
> Using the limit clause, the batters column returns [] for rows 2-5:
> {code}
> dbc:drill:zk=localhost:5181> SELECT * FROM `dfs`.`optdrill`.`./raw-files/donuts/moredonuts.json` LIMIT 100;
> +------------+------------+------------+------------+------------+------------+------------+------------+
> | id | type | name | ppu | sales | batters | topping | filling |
> +------------+------------+------------+------------+------------+------------+------------+------------+
> | 0001 | donut | Cake | 0.55 | 35 | {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"100 |
> | 0002 | donut | Raised | 0.69 | 145 | {"batter":[]} | [] | [] |
> | 0003 | donut | Old Fashioned | 0.55 | 300 | {"batter":[]} | [] | [] |
> | 0004 | donut | Filled | 0.69 | 14 | {"batter":[]} | [] | [] |
> | 0005 | donut | Apple Fritter | 1.0 | 700 | {"batter":[]} | [] | [] |
> +------------+------------+------------+------------+------------+------------+------------+------------+
> 5 rows selected (0.543 seconds)
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)