Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2014/02/04 21:50:10 UTC

[jira] [Created] (HIVE-6370) LazySimpleSerDe doesn't handle Date and Timestamp properly

Eugene Koifman created HIVE-6370:
------------------------------------

             Summary: LazySimpleSerDe doesn't handle Date and Timestamp properly
                 Key: HIVE-6370
                 URL: https://issues.apache.org/jira/browse/HIVE-6370
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
    Affects Versions: 0.12.0
            Reporter: Eugene Koifman


LazySimpleSerDe#serialize() calls LazyUtils.writePrimitiveUTF8() to handle primitive types.
When writing out a java.sql.Date, this in turn calls LazyDate.writeUTF8(), which calls DateWritable.toString(), which is effectively Date.toString().
Date.toString() makes an implicit adjustment for the local timezone in its output.  Thus, if Date.getTime() is on a day boundary (midnight UTC) and the local timezone is behind UTC, toString() will write out the previous day.  Date.valueOf(), which this SerDe uses to read data, makes a similar adjustment for the current timezone.
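
The boundary behavior described above can be reproduced outside Hive with a small standalone program. This is only an illustrative sketch: the class DateBoundaryDemo and its render() helper are hypothetical names, not Hive code, and the program temporarily changes the JVM default timezone to stand in for machines configured with different zones.

```java
import java.sql.Date;
import java.util.TimeZone;

public class DateBoundaryDemo {
    // Hypothetical helper: formats the given epoch millis with
    // java.sql.Date.toString() while the JVM default timezone is zoneId,
    // then restores the previous default.
    public static String render(long epochMillis, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Date(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        long midnightUtc = 86_400_000L; // 1970-01-02T00:00:00Z, a day boundary

        // Same instant, different text depending on the default timezone.
        System.out.println(render(midnightUtc, "UTC"));                 // 1970-01-02
        System.out.println(render(midnightUtc, "America/Los_Angeles")); // 1970-01-01
    }
}
```

The same getTime() value produces two different date strings, so data written on a host in one timezone and read on a host in another can shift by a day.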

This is wrong: the SerDe should write out Date.getTime() (possibly normalized to a day boundary).  That would make reads and writes independent of the current timezone.
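
One way to picture the proposal is the sketch below, which writes the epoch milliseconds rounded down to a UTC day boundary and reads them back without consulting the default timezone. It is not the Hive fix: the class EpochDateCodec, its method names, and the choice of a decimal string as the on-disk form are all assumptions made for illustration.

```java
import java.sql.Date;

public class EpochDateCodec {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Rounds epoch millis down to the preceding UTC midnight.
    // floorDiv keeps the rounding direction correct for pre-1970 values.
    public static long normalizeToDay(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    // Hypothetical writer: emits the normalized getTime() value as text.
    public static String write(Date d) {
        return Long.toString(normalizeToDay(d.getTime()));
    }

    // Hypothetical reader: rebuilds the Date from the stored millis.
    public static Date read(String s) {
        return new Date(Long.parseLong(s));
    }

    public static void main(String[] args) {
        Date d = new Date(86_400_000L + 5L); // just after 1970-01-02T00:00:00Z
        String stored = write(d);
        System.out.println(stored);                  // 86400000
        System.out.println(read(stored).getTime());  // 86400000
    }
}
```

Because neither method calls toString() or valueOf(), the stored value is the same regardless of the timezone of the writing or reading host.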

I think java.sql.Timestamp has a similar issue.  When this is fixed, the work in HIVE-5814 should be adjusted to use getTime() rather than the deprecated day/month/year API it uses now.




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)