Posted to user@drill.apache.org by Stefán Baxter <st...@activitystream.com> on 2015/09/02 03:19:27 UTC

data type differences and 2 incompatible parquet files

Hi,

I'm battling minor inconsistencies in two Parquet files generated from the
same(ish) JSON structure. (They are the product of two separate CTAS
processes, but the JSON was compatible before conversion.)

I cannot create a query that reads from both of them, and this is the error
I get:

[Error Id: 4ee4c131-31fc-4252-a664-5a2e855349fb on localhost:31010]
  (java.lang.IllegalStateException) Failure while reading vector.  Expected
vector class of org.apache.drill.exec.vector.NullableVarCharVector but was
holding vector class org.apache.drill.exec.vector.NullableIntVector.

Turning on verbose logging produces a stack trace that gives me no usable
information for tracking down the field or the value involved.

I'm assuming, because this has happened to me too many times before when
using Drill, that this is a null value that is interpreted as a numeric
value, which then clashes with a string value.

Is there anyone here who can assist me in working around this?

(There are no data type changes in these files; the only difference may
be fields that are missing in one and present in the other.)
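
To illustrate the suspicion above, here is a rough sketch (plain Python, not
Drill functionality) of how the clashing field might be narrowed down. It
assumes the two schemas have already been extracted by some other means into
dictionaries of field name to type name, and it assumes that a field absent
from one file is treated as a nullable INT. The function name, field names
and types below are all made up for the example.

```python
def find_clash_candidates(schema_a, schema_b):
    """Return fields present in only one schema whose type is not INT.

    Assumption: a field that is absent from a file is treated as a
    nullable INT, so it can only clash when the other file stores that
    field with some other type (for example VARCHAR).
    """
    candidates = {}
    # Symmetric difference: names that appear in exactly one schema.
    for name in sorted(set(schema_a) ^ set(schema_b)):
        present_type = schema_a.get(name, schema_b.get(name))
        if present_type != "INT":
            candidates[name] = present_type
    return candidates


# Hypothetical schemas standing in for the two Parquet files.
file_1 = {"id": "INT", "title": "VARCHAR", "tags": "VARCHAR"}
file_2 = {"id": "INT", "title": "VARCHAR"}

print(find_clash_candidates(file_1, file_2))
```

Any field this reports would be a candidate for the VarChar/Int mismatch in
the error above, provided the assumption about missing fields holds.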

Regards,
 -Stefán