Posted to user@drill.apache.org by John Omernik <jo...@omernik.com> on 2017/09/22 18:32:13 UTC

Error Messages and Finding the Issue in JSON

Hello all, I am struggling both with defining what the issue is with my JSON
file and with how to zero in on it so I can define it.  I figured I'd post my
thought process now, as verbose and scary as that is, to help folks see
some issues with the error messages and (hopefully while helping me solve my
problem) determine ways we can make the error messages and troubleshooting
better.

First, the description. I have a JSON file with one object per line. It's a
nasty structured/nested thing without many human-readable fields, which
makes this process harder.  The original file (let's call it orig.json) has
66240 lines in it. This is Drill 1.10.0.

When I run the following query as is:

select * from `/path/to/orig.json` limit 10

I get:

Error Returned - Code: 500
Error Text:
SYSTEM ERROR: IllegalStateException: You tried to start when you are
using a ValueWriter of type NullableBigIntWriterImpl.

Fragment 0:0

[Error Id: de1d4c54-8a6d-4765-9f5f-2b75fdfd4b49 on zeta8.brewingintel.com:20005]


I look at this error and already I am lost (as a user who hasn't
lived in this user group for the past X years).  It's just not helpful
for me to troubleshoot the issue. I don't know where to look in the
offending file, and if this were a directory of files, I wouldn't know
which file had the issue.

Even if I knew the file (as I do) I am lost in what I am actually looking for.


*Point 1: Where are we looking for the problem? Can that be in the
error message? Filename and line number at a minimum, char location of
the error if possible. Field names if it knows them!*

*Point 2: If a dev-based error message is used (that's what I am
calling this error message), can there be some human-readable
explanation too? Like "A value you are reading may have changed types
from X to Y, and that's causing this issue."*

*Point 3: If there are options to try to fix, can those be included as well?*


Ok, so going on to my own troubleshooting, I decided to guess and
check to narrow the problem down to a line number:


> head -5000 orig.json > test.json # Run query, still errors

> head -4000 orig.json > test.json # Run query, still errors

> head -3000 orig.json > test.json # Run query, no error!

> head -3500 orig.json > test.json # Error

...

...

> head -3425 orig.json > test.json # Error

> head -3424 orig.json > test.json # No error!
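
The guess-and-check above is effectively a binary search over prefix
lengths, so it could be scripted. Here is a minimal sketch in Python, not
anything Drill provides. The `fails` callback is hypothetical: in practice it
would have to write the first n lines to test.json, run the query, and report
whether it errored. The sketch also assumes that once a prefix errors, every
longer prefix errors too.

```python
from typing import Callable


def first_failing_prefix(total_lines: int, fails: Callable[[int], bool]) -> int:
    """Return the smallest n for which the first n lines trigger the error.

    Assumptions (not verified here): fails(total_lines) is True, an empty
    prefix does not fail, and failure is monotonic in the prefix length.
    """
    lo, hi = 0, total_lines  # invariant: prefix `lo` passes, prefix `hi` fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # error still reproduces with fewer lines
        else:
            lo = mid  # need more lines to reproduce the error
    return hi


# Stand-in for "head -n N orig.json > test.json, then run the query":
# here we simply pretend the error appears once line 3425 is included.
print(first_failing_prefix(66240, lambda n: n >= 3425))  # prints 3425
```

With 66240 lines this needs about 17 query runs instead of an open-ended
series of manual guesses.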

Ok, so I think the error is in line 3425 of the file, so now I try:

> tail -1 test.json > test1.json

And run the query. No error. That makes sense, one record there...  so then


> tail -10 test.json > test1.json # No error Huh?

> tail -500 test.json > test1.json # no error!!!

> tail -1000 test.json > test1.json # no error!!! (This is getting frustrating)

> tail -3348 test.json > test1.json # No error

> tail -3349 test.json > test1.json # Error
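
A note on the numbers: tail -3349 of a 3425-line file starts at original
line 3425 - 3349 + 1 = 77, so these results would be consistent with the
error needing two lines that disagree with each other (77 and 3425) rather
than one bad line. That is only a guess. One way to check the "a value
changed types" idea from Point 2 outside of Drill is to scan the file and
report any field whose JSON type differs from the first time it was seen.
A minimal sketch in Python follows; the function names are made up for
illustration, nulls are ignored as an assumption, and this does not
reproduce how Drill itself infers a schema.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def json_kind(value) -> str:
    """Name the JSON type of a decoded value (bool is checked before int)."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def walk(value, path: str) -> Iterator[Tuple[str, str]]:
    """Yield (field path, type name) for a value and everything nested in it."""
    kind = json_kind(value)
    yield path, kind
    if kind == "object":
        for key, child in value.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    elif kind == "array":
        for item in value:
            yield from walk(item, path + "[]")


def find_type_changes(lines: Iterable[str]) -> List[Tuple[str, str, int, str, int]]:
    """Return (path, first type, first line, new type, new line) for each mismatch."""
    first_seen = {}  # path -> (type name, line number where first seen)
    changes = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        for path, kind in walk(json.loads(line), ""):
            if kind == "null":
                continue  # assumption: a null does not pin down a type
            if path not in first_seen:
                first_seen[path] = (kind, line_no)
            elif first_seen[path][0] != kind:
                old_kind, old_line = first_seen[path]
                changes.append((path, old_kind, old_line, kind, line_no))
    return changes
```

To try it on the trimmed file, open test.json and pass the file object to
find_type_changes, then print each tuple it returns. A field reported as
"integer" on one line and "object" or "array" on a later line would be a
candidate for the NullableBigIntWriterImpl message above.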


So I guess I am stumped. Why does something change, and why does the
error seem to depend on where I drop the needle, yet inconsistently? And
through this, how can we help improve this process for all users?


Thanks!


John