Posted to issues@orc.apache.org by "Owen O'Malley (Jira)" <ji...@apache.org> on 2019/10/22 23:29:00 UTC

[jira] [Commented] (ORC-546) The timestamps are getting duplicated millis after ORC-306.

    [ https://issues.apache.org/jira/browse/ORC-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957423#comment-16957423 ] 

Owen O'Malley commented on ORC-546:
-----------------------------------

Ok, we decided to revert this. From the dev list:

bq.    TimestampColumnVector is awkwardly defined with both the millis since 1970 and the nanos within a second. Hive's use is such that it doesn't matter whether the millis have the last three digits set. Spark, however, does care.
bq. 
bq. In ORC-306, we inadvertently changed the behavior of the ORC reader to set the lower 3 digits of the millis. Previously it always had zeros.
bq. 
bq. In SPARK-24322, they added compensating code such that they now depend on the last three digits being non-zero.
bq. 
bq. In ORC-546, we changed the semantics to the previous behavior (pre ORC-306) to always have zeros. ORC-546 was released in 1.6.0 and was scheduled for 1.5.7.
bq. 
bq. Since ORC-546 requires additional changes in Spark and given that ORC 1.6.0 isn't widely used yet, I'd like to roll back ORC-546.

Therefore, we will revert this.
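
The two conventions discussed above differ only in the low three digits of the millis field, so converting between them is plain arithmetic. The following is a minimal sketch, not code from ORC, Hive, or Spark: the class name, method names, and sample values are hypothetical. It also checks the result against java.sql.Timestamp, whose getTime() includes the millisecond part.

```java
import java.sql.Timestamp;

public class MillisConventions {
    // Pre-ORC-306 convention: millis carries whole seconds only,
    // so its last three digits are zero.
    public static long zeroedMillis(long millis) {
        // floorDiv keeps the result correct for times before 1970.
        return Math.floorDiv(millis, 1000L) * 1000L;
    }

    // ORC-306 convention: the last three digits of millis repeat
    // the millisecond part that is already contained in nanos.
    public static long filledMillis(long millis, int nanos) {
        return zeroedMillis(millis) + nanos / 1_000_000;
    }

    public static void main(String[] args) {
        long millis = 1_571_786_940_000L; // made-up sample value
        int nanos = 123_456_789;          // nanos within the second

        Timestamp ts = new Timestamp(millis);
        ts.setNanos(nanos);

        // java.sql.Timestamp follows the "filled" convention.
        System.out.println(filledMillis(millis, nanos) == ts.getTime());
        // Zeroing the low digits recovers the whole-second value.
        System.out.println(zeroedMillis(ts.getTime()) == millis);
    }
}
```

A reader that normalizes with something like zeroedMillis before combining the fields would not depend on which convention the producer used.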

> The timestamps are getting duplicated millis after ORC-306.
> -----------------------------------------------------------
>
>                 Key: ORC-546
>                 URL: https://issues.apache.org/jira/browse/ORC-546
>             Project: ORC
>          Issue Type: Bug
>          Components: Java
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>            Priority: Major
>             Fix For: 1.4.5, 1.5.7, 1.6.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive's TimestampColumnVector has a bad design, with millis from 1970 and nanos within the second. This is consistent with java.sql.Timestamp, but it causes the millis to overlap with the nanos.
> ORC-306 changed the behavior from:
> millis:  xxxx000, nanos: 123456789
> to:
> millis:  xxxx123, nanos: 123456789
> That means that adding the millis and the nanos counts the millisecond contribution twice.
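
The double counting described above can be reproduced with plain arithmetic. This is an illustrative sketch only: the class name, the method names, and the sample millis values are assumptions, and it does not use the actual TimestampColumnVector class.

```java
public class MillisNanosOverlap {
    // Naive combination: treats millis and nanos as independent parts.
    // This is only correct when the last three digits of millis are zero.
    public static long naiveNanos(long millis, int nanos) {
        return millis * 1_000_000L + nanos;
    }

    // Safe combination: discards the sub-second part of millis first,
    // so it gives the same answer under either convention.
    public static long safeNanos(long millis, int nanos) {
        return Math.floorDiv(millis, 1000L) * 1_000_000_000L + nanos;
    }

    public static void main(String[] args) {
        int nanos = 123_456_789;
        long zeroed = 1_571_786_940_000L; // millis: xxxx000
        long filled = 1_571_786_940_123L; // millis: xxxx123

        // The naive sum differs by 123 ms (123,000,000 ns) between the two.
        System.out.println(naiveNanos(filled, nanos) - naiveNanos(zeroed, nanos));
        // The safe sum is identical for both inputs.
        System.out.println(safeNanos(filled, nanos) == safeNanos(zeroed, nanos));
    }
}
```

With the xxxx123 form, the naive sum ends in 246456789 nanoseconds rather than 123456789, which is the duplicated millisecond contribution.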


