Posted to issues@drill.apache.org by "Abhishek Girish (JIRA)" <ji...@apache.org> on 2015/05/07 21:40:00 UTC

[jira] [Commented] (DRILL-2221) CTAS (JSON) creates unreadable files when writing empty arrays

    [ https://issues.apache.org/jira/browse/DRILL-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533264#comment-14533264 ] 

Abhishek Girish commented on DRILL-2221:
----------------------------------------

Verified on Git.Commit.ID d12bee0 (May 7 build):

{code:sql}
> select * from dfs.tmp.`a.json` ;
+------------+------------+------------+
|     a      |     b      |     c      |
+------------+------------+------------+
| A1         | ["B1","B2"] | ["C1","C2"] |
| A2         | ["B3"]     | []         |
+------------+------------+------------+
2 rows selected (1.397 seconds)

> alter session set `store.format` = 'json';
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | store.format updated. |
+------------+------------+
1 row selected (0.163 seconds)

> create table file_json1 as select * from dfs.tmp.`a.json` ;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 0_0        | 2                         |
+------------+---------------------------+
1 row selected (0.645 seconds)

> select * from file_json1;
+------------+------------+------------+
|     a      |     b      |     c      |
+------------+------------+------------+
| A1         | ["B1","B2"] | ["C1","C2"] |
| A2         | ["B3"]     | []         |
+------------+------------+------------+
2 rows selected (0.16 seconds)
{code}

The table written by CTAS is readable again, and the empty array in column c round-trips as []. The issue is now resolved.
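
For anyone who wants to check a CTAS output file outside Drill, here is a minimal sketch in Python that parses a file of back-to-back JSON objects, which is the layout shown in the description below. The helper name parse_concatenated_json is hypothetical, BROKEN is copied from the original report, and FIXED is only an assumption about what a well-formed file would contain (the exact whitespace the fixed writer emits is not shown in this thread).

```python
import json


def parse_concatenated_json(text):
    """Parse a string that holds several JSON objects one after another.

    Hypothetical helper for illustration; raises json.JSONDecodeError
    if any object in the stream is malformed.
    """
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    end = len(text)
    while pos < end:
        # Skip whitespace between consecutive objects.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed object and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# File contents from the original report: "c" has no value in record 2.
BROKEN = """{
  "a" : "A1",
  "b" : [ "B1", "B2" ],
  "c" : [ "C1", "C2" ]
} {
  "a" : "A2",
  "b" : [ "B3" ],
  "c"
}"""

# Assumed well-formed equivalent, with the empty array written out.
FIXED = """{
  "a" : "A1",
  "b" : [ "B1", "B2" ],
  "c" : [ "C1", "C2" ]
} {
  "a" : "A2",
  "b" : [ "B3" ],
  "c" : [ ]
}"""
```

Calling parse_concatenated_json(FIXED) yields both records, while parse_concatenated_json(BROKEN) raises json.JSONDecodeError on the second record, the same kind of failure the Drill JSON reader reported.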

> CTAS (JSON) creates unreadable files when writing empty arrays
> --------------------------------------------------------------
>
>                 Key: DRILL-2221
>                 URL: https://issues.apache.org/jira/browse/DRILL-2221
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>            Reporter: Abhishek Girish
>            Assignee: Jason Altekruse
>            Priority: Critical
>             Fix For: 1.0.0
>
>         Attachments: DRILL-2221.1.patch.txt, drillbit_empty_array.log
>
>
> Source JSON file:
> {
>   "a": "A1",
>   "b": [ "B1", "B2" ],
>   "c": [ "C1", "C2" ]
> }
> {
>   "a": "A2",
>   "b": [ "B3" ],
>   "c": []
> }
> > select * from `file.json` ;
> +------------+------------+------------+
> |     a      |     b      |     c      |
> +------------+------------+------------+
> | A1         | ["B1","B2"] | ["C1","C2"] |
> | A2         | ["B3"]     | []         |
> +------------+------------+------------+
> 2 rows selected (0.039 seconds)
> > alter session set `store.format` = 'json';
> > create table file_json as select * from `file.json` ;
> [Success]
> Created file contents:
> {
>   "a" : "A1",
>   "b" : [ "B1", "B2" ],
>   "c" : [ "C1", "C2" ]
> } {
>   "a" : "A2",
>   "b" : [ "B3" ],
>   "c"
> }
> Drill cannot read this file because the field "c" in the second record is written without a value (the empty array is dropped), so it is not a valid key:value pair:
> > select * from file_json ;
> Query failed: Query stopped., Unexpected character ('}' (code 125)): was expecting a colon to separate field name and value
>  at [Source: org.apache.drill.exec.vector.complex.fn.JsonReader@67b4040a; line: 8, column: 2] [ 64bb2b5d-5d83-4ae3-a7dc-7a7e44b28bf5 on abhi8.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> Log attached (drillbit_empty_array.log).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)