Posted to user@drill.apache.org by "David F. Severski" <da...@severski.net> on 2020/01/07 19:43:22 UTC

Parquet INT64 Nullable Type Support

Hello, fellow drill-ers!

Reposting from the Drill Slack community: under the apache/drill:1.17
Docker container, I am having problems querying a Parquet file. This file
(possibly generated via PySpark) has an INT64 field that, whenever it is
included in a query, generates an immediate error in the complex Parquet reader.

The logs mention: "Unsupported nullable converted type INT_64 for primitive
type INT64" and there are some JIRA references to nullable support being
added not too long ago for INT16. Attempts to CAST() this field as INT and
various permutations on CONVERT_FROM() are unsuccessful.
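
For anyone wanting to confirm which columns carry this annotation before filing a report, here is a minimal sketch. It assumes pyarrow is installed and reads only the file footer. The names ColumnInfo, columns_from_file and int64_annotated_columns are hypothetical helpers made up for illustration, not part of Drill or pyarrow.

```python
from typing import Iterable, List, NamedTuple


class ColumnInfo(NamedTuple):
    """Hypothetical container for the three schema fields of interest."""
    path: str
    physical_type: str
    converted_type: str


def columns_from_file(parquet_path: str) -> List[ColumnInfo]:
    """Collect per-column type metadata from a Parquet footer.

    Assumption: pyarrow is available. It is imported lazily so the
    filter below can be used (and tested) without it.
    """
    import pyarrow.parquet as pq

    parquet_file = pq.ParquetFile(parquet_path)
    schema = parquet_file.schema
    return [
        ColumnInfo(col.path, str(col.physical_type), str(col.converted_type))
        for col in (
            schema.column(i)
            for i in range(parquet_file.metadata.num_columns)
        )
    ]


def int64_annotated_columns(columns: Iterable[ColumnInfo]) -> List[str]:
    """Return paths of INT64 columns annotated with converted type INT_64,
    the combination named in the error message above."""
    return [
        col.path
        for col in columns
        if col.physical_type == "INT64" and col.converted_type == "INT_64"
    ]
```

Calling int64_annotated_columns(columns_from_file("data.parquet")), where "data.parquet" is a placeholder path, would list the candidate columns. Whether every column it reports actually triggers the error is not something this sketch can tell you.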

Any thoughts on how to proceed? I don't have easy access to a sanitized
sample for sharing at the moment. I'm already having to do explicit casting
for an INT32 field in the same file, and I'm hoping there's a similar trick
I can use for this INT64 field to keep me moving.

David

Re: Parquet INT64 Nullable Type Support

Posted by Arina Yelchiyeva <ar...@gmail.com>.
Hi David,

Looks like this is a bug. Could you please file a Jira?
It would be nice if you could provide an example file so the fix can be checked.
But looking at the code, it seems we just need to handle int64 here:
https://github.com/apache/drill/blob/9993fa3547b029db5fe33a2210fa6f07e8ac1990/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java#L303

Not sure if there are any workarounds other than the fix.

Kind regards,
Arina
