Posted to jira@arrow.apache.org by "Max Burke (Jira)" <ji...@apache.org> on 2021/01/20 00:03:00 UTC
[jira] [Created] (ARROW-11324) [Rust] Querying datetime data in DataFusion with an embedded timezone always fails
Max Burke created ARROW-11324:
---------------------------------
Summary: [Rust] Querying datetime data in DataFusion with an embedded timezone always fails
Key: ARROW-11324
URL: https://issues.apache.org/jira/browse/ARROW-11324
Project: Apache Arrow
Issue Type: Bug
Components: Rust - DataFusion
Reporter: Max Burke
We have a large number (hundreds of thousands) of Parquet files with embedded Arrow schemas whose time-valued columns have the type Timestamp(TimeUnit::Nanosecond, Some("UTC")).
One of the changes between Arrow 2 and Arrow 3 made the Parquet loader prefer the embedded Arrow schema over the one generated from the Parquet columns.
But because DataFusion hardcodes the timezone field of the Timestamp variant as None, we can't query any of our data after this upgrade. For example, the following query fails with the error shown after it:
{{SELECT * FROM parquet_table WHERE ("timestamp" >= to_timestamp('2010-03-24T13:00:00.000000Z') AND "timestamp" <= to_timestamp('2010-03-25T00:00:00.000000Z')) ORDER BY timestamp ASC NULLS LAST;}}
{{Plan("\'Timestamp(Nanosecond, Some(\"UTC\")) >= Timestamp(Nanosecond, None)\' can\'t be evaluated because there isn\'t a common type to coerce the types to")}}
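To illustrate the mismatch, here is a minimal sketch in plain Rust. The DataType and TimeUnit enums are simplified stand-ins written for this example, not the real arrow crate definitions, and both function names are hypothetical. The first function models a coercion rule that only recognizes timestamps without a timezone, which is our reading of the error above. The second shows one possible relaxed rule, assuming it is acceptable to treat a timezone-less timestamp as compatible with a zoned one; this is not DataFusion's actual code, and whether that rule is the right fix is an open question.

```rust
// Simplified stand-ins for Arrow's types (assumption: illustration only,
// not the definitions from the arrow crate).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeUnit {
    Microsecond,
    Nanosecond,
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Timestamp(TimeUnit, Option<String>),
}

// Hypothetical strict rule: a common type exists only when both sides are
// timestamps of the same unit with no timezone. This models a planner that
// hardcodes the timezone as None.
fn coerce_tz_none_only(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (lhs, rhs) {
        (DataType::Timestamp(lu, None), DataType::Timestamp(ru, None)) if lu == ru => {
            Some(DataType::Timestamp(*lu, None))
        }
        _ => None,
    }
}

// Hypothetical relaxed rule: same unit is still required, but a missing
// timezone on one side adopts the timezone of the other side. Two different
// explicit timezones still have no common type.
fn coerce_tz_aware(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (lhs, rhs) {
        (DataType::Timestamp(lu, ltz), DataType::Timestamp(ru, rtz)) if lu == ru => {
            let tz = match (ltz, rtz) {
                (Some(l), Some(r)) if l == r => Some(l.clone()),
                (Some(tz), None) | (None, Some(tz)) => Some(tz.clone()),
                (None, None) => None,
                _ => return None, // conflicting explicit timezones
            };
            Some(DataType::Timestamp(*lu, tz))
        }
        _ => None,
    }
}
```

With the column typed as Timestamp(Nanosecond, Some("UTC")) and the to_timestamp literal typed as Timestamp(Nanosecond, None), the strict rule returns None, which corresponds to the "no common type" plan error, while the relaxed rule returns the UTC-zoned type.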
Any ideas/thoughts?