Posted to dev@drill.apache.org by "Paul Rogers (Jira)" <ji...@apache.org> on 2022/01/01 23:03:00 UTC

[jira] [Created] (DRILL-8100) JSON record writer does not convert Drill local timestamp to UTC

Paul Rogers created DRILL-8100:
----------------------------------

             Summary: JSON record writer does not convert Drill local timestamp to UTC
                 Key: DRILL-8100
                 URL: https://issues.apache.org/jira/browse/DRILL-8100
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.19.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers


Drill follows the old SQL engine convention to store the `TIMESTAMP` type in the local time zone. This is, of course, highly awkward in today's age when UTC is used as the standard timestamp in most products. However, it is how Drill works. (It would be great to add a `UTC_TIMESTAMP` type, but that is another topic.)

Each reader or writer that works with files that hold UTC timestamps must convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise, Drill works correctly only when the server time zone is set to UTC.
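
As a rough illustration of the conversion described above, here is a minimal sketch using only {{java.time}}. The class and method names are hypothetical and are not part of Drill; the server time zone is passed in explicitly as an assumption so the example does not depend on the JVM default.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

/** Hypothetical helper for illustration only; not Drill code. */
final class LocalTimestampConversion {

  /** Writer direction: interpret a local-time timestamp in the server zone and return the UTC instant. */
  static Instant localToUtc(LocalDateTime local, ZoneId serverZone) {
    return local.atZone(serverZone).toInstant();
  }

  /** Reader direction: convert a UTC instant from a file into a local-time timestamp in the server zone. */
  static LocalDateTime utcToLocal(Instant utc, ZoneId serverZone) {
    return LocalDateTime.ofInstant(utc, serverZone);
  }
}
```

With a server zone of {{America/Los_Angeles}}, a local value of 2022-01-01T15:03:00 maps to the instant 2022-01-01T23:03:00Z, and the reader direction maps it back to the same local value.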

The JSON writer does not do the proper conversion, causing tests to fail when run in a time zone other than UTC.

{noformat}
  @Override
  public void writeTimestamp(FieldReader reader) throws IOException {
    if (reader.isSet()) {
      writeTimestamp(reader.readLocalDateTime());
    } else {
      writeTimeNull();
    }
  }
{noformat}

Basically, it takes a {{LocalDateTime}} and formats it as if it were a UTC time (appending the "Z" suffix) without applying the local time zone offset. The output is valid only if the machine is in the UTC time zone, which is why the test for this class attempts to force the local time zone to UTC, something that most users will not do.
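
The difference between the two behaviors can be sketched as follows. This is not the Drill writer: the class, the method names, and the format pattern are assumptions chosen for illustration, and the server zone is passed in explicitly.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch for illustration only; not Drill code. */
final class TimestampFormatSketch {

  // Assumed output pattern; the quoted 'Z' is a literal, so no zone conversion happens here.
  private static final DateTimeFormatter LITERAL_Z =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

  /** Mirrors the reported problem: the local fields are printed as-is and merely labelled "Z". */
  static String formatWithoutConversion(LocalDateTime local) {
    return LITERAL_Z.format(local);
  }

  /** One possible fix: shift the local value to UTC before printing it with the "Z" suffix. */
  static String formatAsUtc(LocalDateTime local, ZoneId serverZone) {
    LocalDateTime utc = local.atZone(serverZone)
        .withZoneSameInstant(ZoneOffset.UTC)
        .toLocalDateTime();
    return LITERAL_Z.format(utc);
  }
}
```

For a local value of 2022-01-01T15:03:00 on a server in {{America/Los_Angeles}}, the first method yields 2022-01-01T15:03:00.000Z, while the second yields 2022-01-01T23:03:00.000Z. The two agree only when the server zone is UTC.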

A consequence of this bug is that "round trip" CTAS will change dates by the UTC offset of the machine running the CTAS. In the Pacific time zone, each "round trip" subtracts 8 hours from the time. After three round trips, the "UTC" date in the Parquet file or JSON will be a day earlier than the original data. One might argue that this "feature" is not always helpful.
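
The drift can be modelled with a small sketch. It assumes a simplified round trip in which the writer labels the local fields as UTC (the bug described above) and the reader then converts that UTC value back to local time. The class and method names are hypothetical, and the actual CTAS path in Drill may differ in detail.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Hypothetical model of the round trip for illustration only; not Drill code. */
final class RoundTripDrift {

  /** One write-then-read cycle under the assumptions stated above. */
  static LocalDateTime roundTrip(LocalDateTime local, ZoneId serverZone) {
    // Writer without conversion: the local fields are treated as if they were already UTC.
    Instant written = local.toInstant(ZoneOffset.UTC);
    // Reader with conversion: the UTC instant is shifted into the server's local time.
    return LocalDateTime.ofInstant(written, serverZone);
  }

  /** Applies the cycle repeatedly to show the accumulated shift. */
  static LocalDateTime roundTrips(LocalDateTime start, ZoneId serverZone, int count) {
    LocalDateTime value = start;
    for (int i = 0; i < count; i++) {
      value = roundTrip(value, serverZone);
    }
    return value;
  }
}
```

Starting from 2022-01-01T15:03:00 in {{America/Los_Angeles}} (UTC-8 in January), one cycle gives 2022-01-01T07:03:00 and three cycles give 2021-12-31T15:03:00, exactly one day earlier. With a server zone of UTC the cycle leaves the value unchanged.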


