Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2017/10/18 15:26:02 UTC

[jira] [Commented] (DRILL-5822) Select * on directory containing multiple json files (one or more empty) with same schema doesn't preserve column order

    [ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209525#comment-16209525 ] 

Vitalii Diravka commented on DRILL-5822:
----------------------------------------

[~prasadns14] I am not sure how to reproduce the issue. Please attach the data sources you used.
{code}
vitalii@vitalii-pc:/tmp/json$ ls -l
total 8
-rw-rw-r-- 1 vitalii vitalii   0 Oct 18 15:20 scan_json_test_1.json
-rw-rw-r-- 1 vitalii vitalii   0 Oct 18 15:21 scan_json_test_2_1 .json
-rw-rw-r-- 1 vitalii vitalii 258 Oct 12  2016 scan_json_test_2.json
-rw-rw-r-- 1 vitalii vitalii 221 Oct 12  2016 scan_json_test_3.json
{code}
{code}
apache drill 1.12.0-SNAPSHOT 
"drill baby drill"
0: jdbc:drill:zk=local> select * from dfs.`/tmp/json`;
+--------+-------+-------+--------+--------+-------+--------+
|  test  |   b   |   c   |  bool  |  str1  |   d   |  str2  |
+--------+-------+-------+--------+--------+-------+--------+
| 123    | 1     | 2.15  | true   | test1  | null  | null   |
| 1234   | 3     | null  | false  | test2  | 4     | null   |
| 12345  | null  | 5.16  | true   | null   | 6     | test3  |
| 123    | null  | null  | null   | null   | null  | null   |
| 1234   | null  | null  | null   | null   | null  | null   |
+--------+-------+-------+--------+--------+-------+--------+
5 rows selected (1.641 seconds)
0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
+-------+--------------------------------+
|  ok   |            summary             |
+-------+--------------------------------+
| true  | planner.slice_target updated.  |
+-------+--------------------------------+
1 row selected (0.127 seconds)
0: jdbc:drill:zk=local> select * from dfs.`/tmp/json`;
+--------+-------+-------+--------+--------+-------+--------+
|  test  |   b   |   c   |  bool  |  str1  |   d   |  str2  |
+--------+-------+-------+--------+--------+-------+--------+
| 123    | 1     | 2.15  | true   | test1  | null  | null   |
| 1234   | 3     | null  | false  | test2  | 4     | null   |
| 12345  | null  | 5.16  | true   | null   | 6     | test3  |
| 123    | null  | null  | null   | null   | null  | null   |
| 1234   | null  | null  | null   | null   | null  | null   |
+--------+-------+-------+--------+--------+-------+--------+
{code}
However, it is possible that this issue will go away once DRILL-5845 is fixed.
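
For anyone who wants to try the repro steps from the description without the original data, here is a minimal sketch of a setup script. The directory name, file names, and field values are assumptions made up for illustration (only the column names are borrowed from the description), and it is not confirmed that such a small data set triggers the reordering.

```shell
#!/bin/sh
# Hypothetical repro setup: directory name, file names, and field values are
# illustrative assumptions, not the reporter's actual data.
set -eu

JSON_DIR="${JSON_DIR:-/tmp/json_dir}"
mkdir -p "$JSON_DIR"

# Two non-empty files sharing the same schema (same keys in the same order).
printf '%s\n' '{"row_key": 1, "p_partkey": 1, "p_name": "goldenrod lace", "p_size": 7}' > "$JSON_DIR/part_1.json"
printf '%s\n' '{"row_key": 2, "p_partkey": 2, "p_name": "blush rosy", "p_size": 1}' > "$JSON_DIR/part_2.json"

# One zero-byte file, matching step 2 of the repro steps.
: > "$JSON_DIR/empty.json"

ls -l "$JSON_DIR"
```

After running it, the two scenarios from the description can be tried against the new directory: first select * from dfs.`/tmp/json_dir`; with default settings, then again after alter session set `planner.slice_target`=1; and compare the column order.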

> Select * on directory containing multiple json files (one or more empty) with same schema doesn't preserve column order
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5822
>                 URL: https://issues.apache.org/jira/browse/DRILL-5822
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>    Affects Versions: 1.11.0
>            Reporter: Prasad Nagaraj Subramanya
>            Assignee: Vitalii Diravka
>             Fix For: 1.12.0
>
>
> Repro steps
> 1) Have multiple json files in a directory having the same schema
> 2) Also have one or more empty files 
> Scenarios
> 1) Only one minor fragment{code}select * from dfs.`/json_dir`;{code}
> Result:
> {code}
> +----------+------------+------------------------------------------+-----------------+-----------+----------------------------+---------+--------------+----------------+------------------------+
> | row_key  | p_partkey  |                  p_name                  |     p_mfgr      |  p_brand  |           p_type           | p_size  | p_container  | p_retailprice  |       p_comment        |
> +----------+------------+------------------------------------------+-----------------+-----------+----------------------------+---------+--------------+----------------+------------------------+
> | 1        | 1          | goldenrod lace spring peru powder        | Manufacturer#1  | Brand#13  | PROMO BURNISHED COPPER     | 7       | JUMBO PKG    | 901.0          | ly. slyly ironi        |
> | 2        | 2          | blush rosy metallic lemon navajo         | Manufacturer#1  | Brand#13  | LARGE BRUSHED BRASS        | 1       | LG CASE      | 902.0          | lar accounts amo       |
> {code}
>  2) One minor fragment per file
> {code}alter session set `planner.slice_target`=1;
> select * from dfs.`/json_dir`;{code}
> Result:
> {code}
> +-----------+------------------------+--------------+-----------------+------------------------------------------+------------+----------------+---------+----------------------------+----------+
> |  p_brand  |       p_comment        | p_container  |     p_mfgr      |                  p_name                  | p_partkey  | p_retailprice  | p_size  |           p_type           | row_key  |
> +-----------+------------------------+--------------+-----------------+------------------------------------------+------------+----------------+---------+----------------------------+----------+
> | Brand#13  | ly. slyly ironi        | JUMBO PKG    | Manufacturer#1  | goldenrod lace spring peru powder        | 1          | 901.0          | 7       | PROMO BURNISHED COPPER     | 1        |
> | Brand#13  | lar accounts amo       | LG CASE      | Manufacturer#1  | blush rosy metallic lemon navajo         | 2          | 902.0          | 1       | LARGE BRUSHED BRASS        | 2        |
> {code}
>  


