Posted to dev@drill.apache.org by "Paul Rogers (Jira)" <ji...@apache.org> on 2022/01/01 06:39:00 UTC
[jira] [Created] (DRILL-8099) Parquet record writer does not convert Drill local timestamp to UTC
Paul Rogers created DRILL-8099:
----------------------------------
Summary: Parquet record writer does not convert Drill local timestamp to UTC
Key: DRILL-8099
URL: https://issues.apache.org/jira/browse/DRILL-8099
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.19.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Drill follows the old SQL engine convention of storing the `TIMESTAMP` type in the local time zone. This is, of course, highly awkward today, when most products use UTC as the standard timestamp. However, it is how Drill works. (It would be great to add a `UTC_TIMESTAMP` type, but that is another topic.)
Each reader or writer that works with files that hold UTC timestamps must convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise, Drill works correctly only when the server time zone is set to UTC.
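The two conversions described above can be sketched with `java.time`. This is only an illustration, not Drill's actual code: it assumes a Drill timestamp is held as milliseconds that encode the local wall-clock time as if it were UTC, and the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class LocalTimestampSketch {

  // Writer direction: local wall-clock millis (assumed Drill form) -> true UTC epoch millis.
  public static long localToUtcMillis(long localMillis, ZoneId zone) {
    // Recover the wall-clock fields by reading the value as if it were UTC.
    LocalDateTime wallClock =
        LocalDateTime.ofInstant(Instant.ofEpochMilli(localMillis), ZoneOffset.UTC);
    // Re-interpret those fields in the server zone to get the real instant.
    return wallClock.atZone(zone).toInstant().toEpochMilli();
  }

  // Reader direction: true UTC epoch millis -> local wall-clock millis (assumed Drill form).
  public static long utcToLocalMillis(long utcMillis, ZoneId zone) {
    // Find the wall-clock fields of the instant in the server zone.
    LocalDateTime wallClock =
        LocalDateTime.ofInstant(Instant.ofEpochMilli(utcMillis), zone);
    // Encode those fields as if they were UTC.
    return wallClock.toInstant(ZoneOffset.UTC).toEpochMilli();
  }
}
```

When the zone is UTC, both methods return their input unchanged, which is consistent with the behavior being correct only on UTC servers.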
Now, perhaps we can convince most shops to run their Drill server in UTC, or at least set the JVM timezone to UTC. However, this still leaves developers in the lurch: if the development machine timezone is not UTC, then some tests fail. In particular:
{{TestNestedDateTimeTimestamp.testNestedDateTimeCTASParquet}}
The test above fails because the generated Parquet writer code assumes (incorrectly) that the Drill timestamp is in UTC and so no conversion is needed to write that data into Parquet. In particular, in {{ParquetOutputRecordWriter.getNewTimeStampConverter()}}:
{noformat}
reader.read(holder);
consumer.addLong(holder.value);
{noformat}
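For illustration only, a converting version of the snippet above might look like the following stand-alone sketch. It does not use Drill or Parquet classes: `Holder` is a hypothetical stand-in for the timestamp holder, a `LongConsumer` stands in for the Parquet record consumer, and the server zone is passed in explicitly. It again assumes the holder value encodes local wall-clock time as if it were UTC.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.function.LongConsumer;

public class TimestampWriterSketch {

  // Hypothetical stand-in for the holder filled by reader.read(holder).
  public static final class Holder {
    public long value;
  }

  // Converts the local-time value to UTC before handing it to the consumer,
  // instead of passing holder.value through unchanged.
  public static void writeTimestamp(Holder holder, ZoneId serverZone, LongConsumer consumer) {
    LocalDateTime wallClock =
        LocalDateTime.ofInstant(Instant.ofEpochMilli(holder.value), ZoneOffset.UTC);
    long utcMillis = wallClock.atZone(serverZone).toInstant().toEpochMilli();
    consumer.accept(utcMillis);
  }
}
```

In a real fix the zone would presumably come from the server or JVM default rather than a parameter; it is a parameter here so the behavior can be checked independently of the machine's timezone.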