Posted to issues@drill.apache.org by "Daniel Barclay (Drill) (JIRA)" <ji...@apache.org> on 2015/04/30 21:16:08 UTC

[jira] [Comment Edited] (DRILL-1816) Scan Error with JSON on large no of records with Complex Types

    [ https://issues.apache.org/jira/browse/DRILL-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522053#comment-14522053 ] 

Daniel Barclay (Drill) edited comment on DRILL-1816 at 4/30/15 7:16 PM:
------------------------------------------------------------------------

Investigation note:

Hmm.  Something about this is intermittent or state-dependent:

A first try at reproducing this consistently returned wrong numbers of rows (around 15,000 rather than 100,000).  (In one session with SQLLine and Drill in embedded mode, multiple queries returned fewer than 100,000 records, per SQLLine's record count at the end; the number of records varied from query to query but stayed around 15,000.)

However, subsequent tries returned the correct number of rows.  (In several sessions, and for several queries, the number of records was correct.)


was (Author: dsbos):
Investigation note:

Hmm.  Something about this is intermittent or state-dependent:

A first try at reproducing this consistently returned wrong numbers of rows (around 15,000 rather than 100,000).

However, subsequent tries returned the correct number of rows.

> Scan Error with JSON on large no of records with Complex Types
> --------------------------------------------------------------
>
>                 Key: DRILL-1816
>                 URL: https://issues.apache.org/jira/browse/DRILL-1816
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types, Storage - JSON
>            Reporter: Rahul Challapalli
>            Assignee: Daniel Barclay (Drill)
>             Fix For: 1.0.0
>
>         Attachments: complex.log
>
>
> git.commit.id.abbrev=4a4f54a
> Memory Settings
> {code}
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> Dataset :
> {code}
> {
>   "data" : {
>     "col1" : {
>       "one" : [1,2,3,4],
>       "two" : [{"a":"b"},{"c":"d"}]
>     }
>   }
> }
> {code}
> The query below works fine for the above record. However, if we copy the same record 100,000 times, it fails with an IOOB exception.
> {code}
> select data from `json_kvgenflatten/kvgen-complex-large.json`;
> {code}
> Attached the logs. Let me know if you need anything more.


