Posted to dev@drill.apache.org by Ted Dunning <te...@gmail.com> on 2015/05/31 00:16:24 UTC

known issue? Problem reading JSON

This seems wrong.  I can count the records in a JSON table, but select *
doesn't work.

Is this a known issue?



ted:apache-drill-1.0.0$ bin/drill-embedded
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
May 31, 2015 12:14:52 AM org.glassfish.jersey.server.ApplicationHandler initialize
INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26...
apache drill 1.0.0
"got drill?"
0: jdbc:drill:zk=local> select count(*) from cp.`sales_fact_1997_collapsed.json`;
+---------+
| EXPR$0  |
+---------+
| 86837   |
+---------+
1 row selected (1.316 seconds)
0: jdbc:drill:zk=local> select * from cp.`sales_fact_1997_collapsed.json` limit 3;
Error: DATA_READ ERROR: Error parsing JSON - You tried to write a BigInt type when you are using a ValueWriter of type NullableFloat8WriterImpl.

File  /sales_fact_1997_collapsed.json
Record  3
Fragment 0:0

[Error Id: 8a9ac2c1-9764-42fd-bdeb-ec0b5e408438 on 192.168.1.38:31010] (state=,code=0)
0: jdbc:drill:zk=local> ALTER SYSTEM SET `store.json.read_numbers_as_double` = true;
+-------+---------------------------------------------+
|  ok   |                   summary                   |
+-------+---------------------------------------------+
| true  | store.json.read_numbers_as_double updated.  |
+-------+---------------------------------------------+
1 row selected (0.086 seconds)
0: jdbc:drill:zk=local> select * from cp.`sales_fact_1997_collapsed.json` limit 3;
Error: DATA_READ ERROR: Error parsing JSON - You tried to write a VarChar type when you are using a ValueWriter of type NullableFloat8WriterImpl.

File  /sales_fact_1997_collapsed.json
Record  47
Fragment 0:0

Re: known issue? Problem reading JSON

Posted by Jinfeng Ni <jn...@apache.org>.
The first error implies that at least one column of your JSON data has
mixed integer and float values. The second error seems to imply that the
column has mixed string and float values.

You may want to try this option:

ALTER session set `store.json.all_text_mode` = true;
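
As a rough illustration of what all-text mode buys you: every scalar is read
as a string, so a column can no longer mix types, and the numeric
interpretation moves into an explicit cast. The Python sketch below mimics
that idea outside Drill. The `price` field and the sample records are made
up, and the helper names are hypothetical; this is not how Drill implements
the option.

```python
import json

def read_all_text(line):
    """Parse one JSON record, keeping every scalar as a string (None stays None)."""
    def as_text(value):
        if isinstance(value, dict):
            return {k: as_text(v) for k, v in value.items()}
        if isinstance(value, list):
            return [as_text(v) for v in value]
        if value is None:
            return None
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)
    # parse_int/parse_float=str keep numeric text exactly as written in the input
    return as_text(json.loads(line, parse_int=str, parse_float=str))

def to_double(text):
    """Lenient explicit cast for this sketch: returns None if the text is not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None

# Hypothetical records where one field is an int, then a float, then a string.
lines = ['{"price": 3}', '{"price": 2.5}', '{"price": "n/a"}']
records = [read_all_text(l) for l in lines]
prices = [to_double(r["price"]) for r in records]
```

With everything read as text, the read itself cannot hit a type conflict;
any value that is not really numeric only shows up when you cast it.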




Re: known issue? Problem reading JSON

Posted by Hanifi Gunes <hg...@maprtech.com>.
* The former query (select) reads and vectorizes every single
field/column, so field type matters, whereas the latter (count) does not
really read at the field level but simply counts individual JSON records,
which makes it very efficient in time (~90x in a single very wide record)
and memory.

On Mon, Jun 1, 2015 at 12:38 PM, Hanifi Gunes <hg...@maprtech.com> wrote:

> The fact that count does not fail but select fails is known and will be
> there at least until we support heterogenous types. Also we handle these
> queries differently at JSON processor. The former query does read and
> vectorize every single field/column, thus field type matters whereas the
> latter does not really read at field level but simply counts individual
> JSON records thereby very efficient in time (~90x in a single very wide
> record) and memory. That's the reason why your count(*) query succeeds
> while select(*) fails.
>
> I agree that error messages need a touch. Filed DRILL-3231 to track this.

Re: known issue? Problem reading JSON

Posted by Hanifi Gunes <hg...@maprtech.com>.
The fact that count does not fail but select fails is known and will be
there at least until we support heterogeneous types. We also handle these
queries differently in the JSON processor. The select query reads and
vectorizes every single field/column, so field type matters, whereas the
count query does not really read at the field level but simply counts
individual JSON records, which makes it very efficient in time (~90x in a
single very wide record) and memory. That's why your count(*) query
succeeds while select * fails.
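
The count versus select difference can be pictured with a toy model. The
Python sketch below is only an illustration under simplifying assumptions
(one JSON object per line, a column's type fixed by its first non-null
value); the class and function names are hypothetical and do not correspond
to Drill's actual classes.

```python
import json

class TypedColumnWriter:
    """Toy column writer: the first non-null value fixes the column's type."""
    def __init__(self, name):
        self.name = name
        self.type = None
        self.values = []

    def write(self, value):
        if value is not None:
            if self.type is None:
                self.type = type(value)
            elif type(value) is not self.type:
                raise TypeError(
                    f"column {self.name!r}: tried to write {type(value).__name__} "
                    f"into a {self.type.__name__} column"
                )
        self.values.append(value)

def count_records(lines):
    """count(*)-style path: non-empty records are counted, fields are never typed."""
    return sum(1 for line in lines if line.strip())

def select_all(lines):
    """select *-style path: every field value goes through a typed column writer."""
    columns = {}
    for line in lines:
        if not line.strip():
            continue
        for name, value in json.loads(line).items():
            columns.setdefault(name, TypedColumnWriter(name)).write(value)
    return {name: col.values for name, col in columns.items()}

# Hypothetical data: "amount" is a float in the first two records, an int in the third.
lines = ['{"id": 1, "amount": 1.5}', '{"id": 2, "amount": 2.25}', '{"id": 3, "amount": 4}']

total = count_records(lines)      # succeeds: no field is ever typed
try:
    select_all(lines)
    select_error = None
except TypeError as exc:
    select_error = str(exc)       # the mixed-type column is reported here
```

In this model the count path never touches a column writer, so it cannot
hit the type conflict that stops the select path at the third record.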

I agree that the error messages need work. Filed DRILL-3231 to track this.


On Sat, May 30, 2015 at 10:51 PM, Ted Dunning <te...@gmail.com> wrote:

> OK.
>
> But this *is* in a data file that we distribute as part of Drill.
>
> Perhaps a better error message is warranted?
>
> Also, this seems to be a serious limitation that appears only to be fixable
> using a sledge-hammer.

Re: known issue? Problem reading JSON

Posted by Ted Dunning <te...@gmail.com>.
OK.

But this *is* in a data file that we distribute as part of Drill.

Perhaps a better error message is warranted?

Also, this seems to be a serious limitation that appears only to be fixable
using a sledge-hammer.



On Sun, May 31, 2015 at 3:31 AM, Jacques Nadeau <ja...@apache.org> wrote:

> The second error is stating that you have a column that is a string in one
> row and a double in another.

Re: known issue? Problem reading JSON

Posted by Jacques Nadeau <ja...@apache.org>.
The second error is stating that you have a column that is a string in one
row and a double in another.
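
To find which column that is, one option is to scan the file outside Drill
and list the fields that appear with more than one JSON type. The Python
sketch below assumes one JSON object per line; the `amount` field and the
sample records are made up for illustration, and the function name is
hypothetical.

```python
import json
from collections import defaultdict

def find_mixed_type_fields(lines):
    """Return {field: {type_name: first_record_number}} for fields seen with >1 type."""
    seen = defaultdict(dict)
    record_no = 0
    for line in lines:
        if not line.strip():
            continue
        record_no += 1
        for name, value in json.loads(line).items():
            if value is None:
                continue  # treat nulls as compatible with any column type
            seen[name].setdefault(type(value).__name__, record_no)
    return {name: types for name, types in seen.items() if len(types) > 1}

# Hypothetical sample; with a real file, pass an open file handle instead of this list.
sample = [
    '{"id": 1, "amount": 2.0}',
    '{"id": 2, "amount": 3}',
    '{"id": 3, "amount": "4"}',
]
mixed = find_mixed_type_fields(sample)
```

The record numbers in the result show where each type first appears, which
can be compared against the record number in the error message.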
