You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@drill.apache.org by Dan Blondowski <da...@dhigroupinc.com> on 2016/10/12 14:08:26 UTC

Error parsing JSON

Hello.

I have a json file with 2 records (see attached).

When I run a query against it I get this error:


Error: DATA_READ ERROR: Error parsing JSON - You tried to write a BigInt type when you are using a ValueWriter of type NullableVarCharWriterImpl.


File  /test.json

Record  2

Fragment 0:0


[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on ip-10-3-48-183:31010] (state=,code=0)

If I break it up into individual files, I can run the query against both records just fine. Any clues?

Thanks!

Daniel Blondowski

Big Data Architecture/Engineering
DHI Group, Inc.

dan.blondowski@dice.com
515-313-2137
---------------------------------------------------------------------------------------
 This email has been scanned for email related threats and delivered safely by Mimecast.
 For more information please visit http://www.mimecast.com
---------------------------------------------------------------------------------------

Re: Error parsing JSON

Posted by Khurram Faraaz <kf...@maprtech.com>.
Can you please try to set this config parameter from sqlline and then run
your query.

alter session set `store.json.all_text_mode` = true;



On Wed, Oct 12, 2016 at 8:38 PM, Dan Blondowski <
dan.blondowski@dhigroupinc.com> wrote:

> Sorry!
> Select * from s3.`test.json`;
>
> Daniel Blondowski
>
> Big Data Architecture/Engineering
> DHI Group, Inc.
>
> dan.blondowski@dice.com
> 515-313-2137
>
>
>
>
>
> On 10/12/16, 9:52 AM, "Khurram Faraaz" <kf...@maprtech.com> wrote:
>
> >Can you please share your SQL query ?
> >
> >On Wed, Oct 12, 2016 at 7:38 PM, Dan Blondowski <
> >dan.blondowski@dhigroupinc.com> wrote:
> >
> >> Hello.
> >>
> >> I have a json file with 2 records (see attached).
> >>
> >> When I run a query against it I get this error:
> >>
> >> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
> >>BigInt
> >> type when you are using a ValueWriter of type
> >>NullableVarCharWriterImpl.*
> >>
> >>
> >> *File  /test.json*
> >>
> >> *Record  2*
> >>
> >> *Fragment 0:0*
> >>
> >>
> >> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
> >>ip-10-3-48-183:31010]
> >> (state=,code=0)*
> >>
> >> If I break it up into individual files, I can run the query against both
> >> records just fine. Any clues?
> >>
> >> Thanks!
> >>
> >> Daniel Blondowski
> >>
> >> Big Data Architecture/Engineering
> >> DHI Group, Inc.
> >>
> >> dan.blondowski@dice.com
> >> 515-313-2137
> >>
> >>
> >> ------------------------------
> >> This email has been scanned for email related threats by Mimecast
> >> <http://www.mimecast.com>
> >> ------------------------------
> >>
> ------------------------------------------------------------
> ---------------------------
>  This email has been scanned for email related threats and delivered
> safely by Mimecast.
>  For more information please visit http://www.mimecast.com
> ------------------------------------------------------------
> ---------------------------
>

Re: Error parsing JSON

Posted by Dan Blondowski <da...@dhigroupinc.com>.
Sorry!  
Select * from s3.`test.json`;

Daniel Blondowski

Big Data Architecture/Engineering
DHI Group, Inc.

dan.blondowski@dice.com
515-313-2137





On 10/12/16, 9:52 AM, "Khurram Faraaz" <kf...@maprtech.com> wrote:

>Can you please share your SQL query ?
>
>On Wed, Oct 12, 2016 at 7:38 PM, Dan Blondowski <
>dan.blondowski@dhigroupinc.com> wrote:
>
>> Hello.
>>
>> I have a json file with 2 records (see attached).
>>
>> When I run a query against it I get this error:
>>
>> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
>>BigInt
>> type when you are using a ValueWriter of type
>>NullableVarCharWriterImpl.*
>>
>>
>> *File  /test.json*
>>
>> *Record  2*
>>
>> *Fragment 0:0*
>>
>>
>> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
>>ip-10-3-48-183:31010]
>> (state=,code=0)*
>>
>> If I break it up into individual files, I can run the query against both
>> records just fine. Any clues?
>>
>> Thanks!
>>
>> Daniel Blondowski
>>
>> Big Data Architecture/Engineering
>> DHI Group, Inc.
>>
>> dan.blondowski@dice.com
>> 515-313-2137
>>
>>
>> ------------------------------
>> This email has been scanned for email related threats by Mimecast
>> <http://www.mimecast.com>
>> ------------------------------
>>
---------------------------------------------------------------------------------------
 This email has been scanned for email related threats and delivered safely by Mimecast.
 For more information please visit http://www.mimecast.com
---------------------------------------------------------------------------------------

Re: Error parsing JSON

Posted by Khurram Faraaz <kf...@maprtech.com>.
Can you please share your SQL query ?

On Wed, Oct 12, 2016 at 7:38 PM, Dan Blondowski <
dan.blondowski@dhigroupinc.com> wrote:

> Hello.
>
> I have a json file with 2 records (see attached).
>
> When I run a query against it I get this error:
>
> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a BigInt
> type when you are using a ValueWriter of type NullableVarCharWriterImpl.*
>
>
> *File  /test.json*
>
> *Record  2*
>
> *Fragment 0:0*
>
>
> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on ip-10-3-48-183:31010]
> (state=,code=0)*
>
> If I break it up into individual files, I can run the query against both
> records just fine. Any clues?
>
> Thanks!
>
> Daniel Blondowski
>
> Big Data Architecture/Engineering
> DHI Group, Inc.
>
> dan.blondowski@dice.com
> 515-313-2137
>
>
> ------------------------------
> This email has been scanned for email related threats by Mimecast
> <http://www.mimecast.com>
> ------------------------------
>

Re: Error parsing JSON

Posted by Abhishek Girish <ag...@mapr.com>.
Unfortunately, the only information shared is the file name and the record
having the issue. In your example, you can see that the issue was with
record 2 in file /test.json. You will need to use trial and error method to
isolate the field in my opinion.

On Wed, Oct 12, 2016 at 9:01 AM, Dan Blondowski <dan.blondowski@dhigroupinc.
com> wrote:

> Hello.  You were correct.
> I was able to set `store.json.all_text_mode` = true; and run the query.
>
> I also found the data that needed changing.  There were actually 3 fields.
>
> Is there a way for Drill to display which field is bad, so I don¹t have to
> do the trial & error?
>
>
>
>
> On 10/12/16, 10:12 AM, "Abhishek Girish" <ab...@gmail.com>
> wrote:
>
> >Hey Dan,
> >
> >This usually happens when there is a schema change across records. It
> >could
> >be intentional (records do need different types for some fields) or bad
> >formatting (1 vs "1").
> >
> >Can you try setting this session option and retry your query? Let us know
> >if it helps. You could attempt to use explicit casts in the query to get
> >types you want.
> >
> >          set `store.json.all_text_mode` = true;
> >
> >Regards,
> >Abhishek
> >
> >On Wednesday, October 12, 2016, Dan Blondowski <
> >dan.blondowski@dhigroupinc.com> wrote:
> >
> >> Hello.
> >>
> >> I have a json file with 2 records (see attached).
> >>
> >> When I run a query against it I get this error:
> >>
> >> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
> >>BigInt
> >> type when you are using a ValueWriter of type
> >>NullableVarCharWriterImpl.*
> >>
> >>
> >> *File  /test.json*
> >>
> >> *Record  2*
> >>
> >> *Fragment 0:0*
> >>
> >>
> >> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
> >>ip-10-3-48-183:31010]
> >> (state=,code=0)*
> >>
> >> If I break it up into individual files, I can run the query against both
> >> records just fine. Any clues?
> >>
> >> Thanks!
> >>
> >> Daniel Blondowski
> >>
> >> Big Data Architecture/Engineering
> >> DHI Group, Inc.
> >>
> >> dan.blondowski@dice.com
> >> <javascript:_e(%7B%7D,'cvml','dan.blondowski@dice.com');>
> >> 515-313-2137
> >>
> >>
> >> ------------------------------
> >> This email has been scanned for email related threats by Mimecast
> >> <http://www.mimecast.com>
> >> ------------------------------
> >>
> ------------------------------------------------------------
> ---------------------------
>  This email has been scanned for email related threats and delivered
> safely by Mimecast.
>  For more information please visit http://www.mimecast.com
> ------------------------------------------------------------
> ---------------------------
>

Re: Error parsing JSON

Posted by Andries Engelbrecht <ae...@maprtech.com>.
Look into the JSON data model
http://drill.apache.org/docs/json-data-model/ <http://drill.apache.org/docs/json-data-model/>

You can set the union type to true for the session and the use the typeof function to determine data types of the fields. Perhaps do a group by typeof.
exec.enable_union_type

Do note that setting to all text mode need to be turned off before working on this.


--Andries


> On Oct 12, 2016, at 11:17 AM, Dan Blondowski <da...@dhigroupinc.com> wrote:
> 
> Yes, I understand that part.
> I wasn’t clear enough.  By trial and error, I meant - me trying to find
> which field was bad.   I’m going to ask the guys that provide the source
> file, to make sure data types for each field are consistent across records.
> 
> 
> 
> 
> 
> On 10/12/16, 12:53 PM, "Khurram Faraaz" <kf...@maprtech.com> wrote:
> 
>> Since your JSON file has different types of data (line int, boolean,
>> strings etc) we need to tell Drill to consider all data in that JSON file
>> as text. This is by design, and this is not trial and error.
>> Such occurrence of different types of data in a JSON file is called
>> SchemaChange and to avoid that we set `store.json.all_text_mode` = true
>> 
>> On Wed, Oct 12, 2016 at 9:31 PM, Dan Blondowski <
>> dan.blondowski@dhigroupinc.com> wrote:
>> 
>>> Hello.  You were correct.
>>> I was able to set `store.json.all_text_mode` = true; and run the query.
>>> 
>>> I also found the data that needed changing.  There were actually 3
>>> fields.
>>> 
>>> Is there a way for Drill to display which field is bad, so I don¹t have
>>> to
>>> do the trial & error?
>>> 
>>> 
>>> 
>>> 
>>> On 10/12/16, 10:12 AM, "Abhishek Girish" <ab...@gmail.com>
>>> wrote:
>>> 
>>>> Hey Dan,
>>>> 
>>>> This usually happens when there is a schema change across records. It
>>>> could
>>>> be intentional (records do need different types for some fields) or bad
>>>> formatting (1 vs "1").
>>>> 
>>>> Can you try setting this session option and retry your query? Let us
>>> know
>>>> if it helps. You could attempt to use explicit casts in the query to
>>> get
>>>> types you want.
>>>> 
>>>>         set `store.json.all_text_mode` = true;
>>>> 
>>>> Regards,
>>>> Abhishek
>>>> 
>>>> On Wednesday, October 12, 2016, Dan Blondowski <
>>>> dan.blondowski@dhigroupinc.com> wrote:
>>>> 
>>>>> Hello.
>>>>> 
>>>>> I have a json file with 2 records (see attached).
>>>>> 
>>>>> When I run a query against it I get this error:
>>>>> 
>>>>> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
>>>>> BigInt
>>>>> type when you are using a ValueWriter of type
>>>>> NullableVarCharWriterImpl.*
>>>>> 
>>>>> 
>>>>> *File  /test.json*
>>>>> 
>>>>> *Record  2*
>>>>> 
>>>>> *Fragment 0:0*
>>>>> 
>>>>> 
>>>>> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
>>>>> ip-10-3-48-183:31010]
>>>>> (state=,code=0)*
>>>>> 
>>>>> If I break it up into individual files, I can run the query against
>>> both
>>>>> records just fine. Any clues?
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> Daniel Blondowski
>>>>> 
>>>>> Big Data Architecture/Engineering
>>>>> DHI Group, Inc.
>>>>> 
>>>>> dan.blondowski@dice.com
>>>>> <javascript:_e(%7B%7D,'cvml','dan.blondowski@dice.com');>
>>>>> 515-313-2137
>>>>> 
>>>>> 
>>>>> ------------------------------
>>>>> This email has been scanned for email related threats by Mimecast
>>>>> <http://www.mimecast.com>
>>>>> ------------------------------
>>>>> 
>>> ------------------------------------------------------------
>>> ---------------------------
>>> This email has been scanned for email related threats and delivered
>>> safely by Mimecast.
>>> For more information please visit http://www.mimecast.com
>>> ------------------------------------------------------------
>>> ---------------------------
>>> 
> ---------------------------------------------------------------------------------------
> This email has been scanned for email related threats and delivered safely by Mimecast.
> For more information please visit http://www.mimecast.com
> ---------------------------------------------------------------------------------------


Re: Error parsing JSON

Posted by Dan Blondowski <da...@dhigroupinc.com>.
Yes, I understand that part.
I wasn’t clear enough.  By trial and error, I meant - me trying to find
which field was bad.   I’m going to ask the guys that provide the source
file, to make sure data types for each field are consistent across records.





On 10/12/16, 12:53 PM, "Khurram Faraaz" <kf...@maprtech.com> wrote:

>Since your JSON file has different types of data (line int, boolean,
>strings etc) we need to tell Drill to consider all data in that JSON file
>as text. This is by design, and this is not trial and error.
>Such occurrence of different types of data in a JSON file is called
>SchemaChange and to avoid that we set `store.json.all_text_mode` = true
>
>On Wed, Oct 12, 2016 at 9:31 PM, Dan Blondowski <
>dan.blondowski@dhigroupinc.com> wrote:
>
>> Hello.  You were correct.
>> I was able to set `store.json.all_text_mode` = true; and run the query.
>>
>> I also found the data that needed changing.  There were actually 3
>>fields.
>>
>> Is there a way for Drill to display which field is bad, so I don¹t have
>>to
>> do the trial & error?
>>
>>
>>
>>
>> On 10/12/16, 10:12 AM, "Abhishek Girish" <ab...@gmail.com>
>> wrote:
>>
>> >Hey Dan,
>> >
>> >This usually happens when there is a schema change across records. It
>> >could
>> >be intentional (records do need different types for some fields) or bad
>> >formatting (1 vs "1").
>> >
>> >Can you try setting this session option and retry your query? Let us
>>know
>> >if it helps. You could attempt to use explicit casts in the query to
>>get
>> >types you want.
>> >
>> >          set `store.json.all_text_mode` = true;
>> >
>> >Regards,
>> >Abhishek
>> >
>> >On Wednesday, October 12, 2016, Dan Blondowski <
>> >dan.blondowski@dhigroupinc.com> wrote:
>> >
>> >> Hello.
>> >>
>> >> I have a json file with 2 records (see attached).
>> >>
>> >> When I run a query against it I get this error:
>> >>
>> >> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
>> >>BigInt
>> >> type when you are using a ValueWriter of type
>> >>NullableVarCharWriterImpl.*
>> >>
>> >>
>> >> *File  /test.json*
>> >>
>> >> *Record  2*
>> >>
>> >> *Fragment 0:0*
>> >>
>> >>
>> >> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
>> >>ip-10-3-48-183:31010]
>> >> (state=,code=0)*
>> >>
>> >> If I break it up into individual files, I can run the query against
>>both
>> >> records just fine. Any clues?
>> >>
>> >> Thanks!
>> >>
>> >> Daniel Blondowski
>> >>
>> >> Big Data Architecture/Engineering
>> >> DHI Group, Inc.
>> >>
>> >> dan.blondowski@dice.com
>> >> <javascript:_e(%7B%7D,'cvml','dan.blondowski@dice.com');>
>> >> 515-313-2137
>> >>
>> >>
>> >> ------------------------------
>> >> This email has been scanned for email related threats by Mimecast
>> >> <http://www.mimecast.com>
>> >> ------------------------------
>> >>
>> ------------------------------------------------------------
>> ---------------------------
>>  This email has been scanned for email related threats and delivered
>> safely by Mimecast.
>>  For more information please visit http://www.mimecast.com
>> ------------------------------------------------------------
>> ---------------------------
>>
---------------------------------------------------------------------------------------
 This email has been scanned for email related threats and delivered safely by Mimecast.
 For more information please visit http://www.mimecast.com
---------------------------------------------------------------------------------------

Re: Error parsing JSON

Posted by Khurram Faraaz <kf...@maprtech.com>.
Since your JSON file has different types of data (line int, boolean,
strings etc) we need to tell Drill to consider all data in that JSON file
as text. This is by design, and this is not trial and error.
Such occurrence of different types of data in a JSON file is called
SchemaChange and to avoid that we set `store.json.all_text_mode` = true

On Wed, Oct 12, 2016 at 9:31 PM, Dan Blondowski <
dan.blondowski@dhigroupinc.com> wrote:

> Hello.  You were correct.
> I was able to set `store.json.all_text_mode` = true; and run the query.
>
> I also found the data that needed changing.  There were actually 3 fields.
>
> Is there a way for Drill to display which field is bad, so I don¹t have to
> do the trial & error?
>
>
>
>
> On 10/12/16, 10:12 AM, "Abhishek Girish" <ab...@gmail.com>
> wrote:
>
> >Hey Dan,
> >
> >This usually happens when there is a schema change across records. It
> >could
> >be intentional (records do need different types for some fields) or bad
> >formatting (1 vs "1").
> >
> >Can you try setting this session option and retry your query? Let us know
> >if it helps. You could attempt to use explicit casts in the query to get
> >types you want.
> >
> >          set `store.json.all_text_mode` = true;
> >
> >Regards,
> >Abhishek
> >
> >On Wednesday, October 12, 2016, Dan Blondowski <
> >dan.blondowski@dhigroupinc.com> wrote:
> >
> >> Hello.
> >>
> >> I have a json file with 2 records (see attached).
> >>
> >> When I run a query against it I get this error:
> >>
> >> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
> >>BigInt
> >> type when you are using a ValueWriter of type
> >>NullableVarCharWriterImpl.*
> >>
> >>
> >> *File  /test.json*
> >>
> >> *Record  2*
> >>
> >> *Fragment 0:0*
> >>
> >>
> >> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
> >>ip-10-3-48-183:31010]
> >> (state=,code=0)*
> >>
> >> If I break it up into individual files, I can run the query against both
> >> records just fine. Any clues?
> >>
> >> Thanks!
> >>
> >> Daniel Blondowski
> >>
> >> Big Data Architecture/Engineering
> >> DHI Group, Inc.
> >>
> >> dan.blondowski@dice.com
> >> <javascript:_e(%7B%7D,'cvml','dan.blondowski@dice.com');>
> >> 515-313-2137
> >>
> >>
> >> ------------------------------
> >> This email has been scanned for email related threats by Mimecast
> >> <http://www.mimecast.com>
> >> ------------------------------
> >>
> ------------------------------------------------------------
> ---------------------------
>  This email has been scanned for email related threats and delivered
> safely by Mimecast.
>  For more information please visit http://www.mimecast.com
> ------------------------------------------------------------
> ---------------------------
>

Re: Error parsing JSON

Posted by Dan Blondowski <da...@dhigroupinc.com>.
Hello.  You were correct.
I was able to set `store.json.all_text_mode` = true; and run the query.

I also found the data that needed changing.  There were actually 3 fields.

Is there a way for Drill to display which field is bad, so I don¹t have to
do the trial & error?




On 10/12/16, 10:12 AM, "Abhishek Girish" <ab...@gmail.com> wrote:

>Hey Dan,
>
>This usually happens when there is a schema change across records. It
>could
>be intentional (records do need different types for some fields) or bad
>formatting (1 vs "1").
>
>Can you try setting this session option and retry your query? Let us know
>if it helps. You could attempt to use explicit casts in the query to get
>types you want.
>
>          set `store.json.all_text_mode` = true;
>
>Regards,
>Abhishek
>
>On Wednesday, October 12, 2016, Dan Blondowski <
>dan.blondowski@dhigroupinc.com> wrote:
>
>> Hello.
>>
>> I have a json file with 2 records (see attached).
>>
>> When I run a query against it I get this error:
>>
>> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a
>>BigInt
>> type when you are using a ValueWriter of type
>>NullableVarCharWriterImpl.*
>>
>>
>> *File  /test.json*
>>
>> *Record  2*
>>
>> *Fragment 0:0*
>>
>>
>> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on
>>ip-10-3-48-183:31010]
>> (state=,code=0)*
>>
>> If I break it up into individual files, I can run the query against both
>> records just fine. Any clues?
>>
>> Thanks!
>>
>> Daniel Blondowski
>>
>> Big Data Architecture/Engineering
>> DHI Group, Inc.
>>
>> dan.blondowski@dice.com
>> <javascript:_e(%7B%7D,'cvml','dan.blondowski@dice.com');>
>> 515-313-2137
>>
>>
>> ------------------------------
>> This email has been scanned for email related threats by Mimecast
>> <http://www.mimecast.com>
>> ------------------------------
>>
---------------------------------------------------------------------------------------
 This email has been scanned for email related threats and delivered safely by Mimecast.
 For more information please visit http://www.mimecast.com
---------------------------------------------------------------------------------------

Re: Error parsing JSON

Posted by Abhishek Girish <ab...@gmail.com>.
Hey Dan,

This usually happens when there is a schema change across records. It could
be intentional (records do need different types for some fields) or bad
formatting (1 vs "1").

Can you try setting this session option and retry your query? Let us know
if it helps. You could attempt to use explicit casts in the query to get
types you want.

          set `store.json.all_text_mode` = true;

Regards,
Abhishek

On Wednesday, October 12, 2016, Dan Blondowski <
dan.blondowski@dhigroupinc.com> wrote:

> Hello.
>
> I have a json file with 2 records (see attached).
>
> When I run a query against it I get this error:
>
> *Error: DATA_READ ERROR: Error parsing JSON - You tried to write a BigInt
> type when you are using a ValueWriter of type NullableVarCharWriterImpl.*
>
>
> *File  /test.json*
>
> *Record  2*
>
> *Fragment 0:0*
>
>
> *[Error Id: 8b5166cc-28b6-488c-893f-f0265d483e13 on ip-10-3-48-183:31010]
> (state=,code=0)*
>
> If I break it up into individual files, I can run the query against both
> records just fine. Any clues?
>
> Thanks!
>
> Daniel Blondowski
>
> Big Data Architecture/Engineering
> DHI Group, Inc.
>
> dan.blondowski@dice.com
> <javascript:_e(%7B%7D,'cvml','dan.blondowski@dice.com');>
> 515-313-2137
>
>
> ------------------------------
> This email has been scanned for email related threats by Mimecast
> <http://www.mimecast.com>
> ------------------------------
>