Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/29 00:52:52 UTC

[GitHub] [iceberg] rdblue opened a new issue #1680: Spec: document partition transforms for timestamps before 1970

rdblue opened a new issue #1680:
URL: https://github.com/apache/iceberg/issues/1680


   The Java implementation uses `ChronoUnit.between` to calculate the years, months, days, and hours transforms for timestamp and date types. For dates and timestamps before 1970, the values are off by 1 because `between` counts complete units, so negative intervals are rounded toward zero instead of being rounded down.
   
   For example, 1969-12-31 23:59:59.000000 is -1,000,000 in microseconds. When that value is converted to hours by dividing by 3,600,000,000 microseconds per hour, the result is 0 instead of -1. But 1 second after the start of 1970 produces the same value, 0, so the last hour of 1969 and the first hour of 1970 map to the same partition value. This affects all timestamp transforms and the year and month date transforms (the day transform on date values is correct because the interval is always a whole number of days).
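
   The arithmetic above can be reproduced with a minimal sketch. The class and method names (`EpochHours`, `truncatedHours`, `flooredHours`) are hypothetical and are not taken from the Iceberg code base; the sketch only assumes that the transform measures the interval from the epoch with `ChronoUnit.HOURS.between`, and contrasts that with floor division.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration, not the Iceberg implementation.
public class EpochHours {
  static final OffsetDateTime EPOCH = Instant.EPOCH.atOffset(ZoneOffset.UTC);
  static final long MICROS_PER_HOUR = 3_600_000_000L;

  // Counts complete hours between the epoch and the timestamp,
  // which rounds toward zero for timestamps before 1970.
  static long truncatedHours(long micros) {
    OffsetDateTime ts = EPOCH.plus(micros, ChronoUnit.MICROS);
    return ChronoUnit.HOURS.between(EPOCH, ts);
  }

  // Floor division always rounds down, so pre-1970 hours are negative.
  static long flooredHours(long micros) {
    return Math.floorDiv(micros, MICROS_PER_HOUR);
  }

  public static void main(String[] args) {
    long oneSecondBefore = -1_000_000L; // 1969-12-31 23:59:59.000000
    long oneSecondAfter = 1_000_000L;   // 1970-01-01 00:00:01.000000

    System.out.println(truncatedHours(oneSecondBefore)); // 0
    System.out.println(truncatedHours(oneSecondAfter));  // 0
    System.out.println(flooredHours(oneSecondBefore));   // -1
    System.out.println(flooredHours(oneSecondAfter));    // 0
  }
}
```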
   
   This was discussed in the community and the consensus was to document the current behavior of the `year`, `month`, `day`, and `hour` transforms in the spec so that any existing data can be read correctly. Then, new transforms without this issue should be introduced and used by default for v2 tables.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue closed issue #1680: Spec: document partition transforms for timestamps before 1970

Posted by GitBox <gi...@apache.org>.
rdblue closed issue #1680:
URL: https://github.com/apache/iceberg/issues/1680


   




[GitHub] [iceberg] rdblue commented on issue #1680: Spec: document partition transforms for timestamps before 1970

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #1680:
URL: https://github.com/apache/iceberg/issues/1680#issuecomment-752223855


   I started working on this and realized that it is simpler to account for the bad values produced by version 0.10.0 and older.
   
   Testing showed that the date transform didn't have the bug, so documenting the behavior in the spec would be more complicated. In addition, adding new transforms would be a much larger change to the spec than adding example test cases and noting which old versions were affected, because it would double the number of supported date/time transforms.
   
   Instead, I chose to adjust predicates when projecting data predicates to partition predicates. The advantage is that the update to the spec only needs to be a few test values and a note about the behavior of old versions. Getting this in by v2 also means that the adjustment isn't needed for tables that are created as v2 tables, because v1 writers with the problem cannot read or write v2 tables.
   
   #1981 has more extensive tests for date/time transforms and adjusts predicates while projecting to account for the transform bug. Once that is in, I'll update the spec with test values, a description of the problem, and affected versions.
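
   One way to picture the predicate adjustment is the sketch below. It is an assumption about the general idea, not the code in #1981: the class `HourProjection` and its methods are hypothetical. It assumes an affected writer may have stored a pre-1970 row under the hour value one greater than the correct one, so a projected predicate on a negative hour is widened by 1 to also match those rows.

```java
// Hypothetical illustration, not the Iceberg implementation.
public class HourProjection {
  static final long MICROS_PER_HOUR = 3_600_000_000L;

  // Correct hour ordinal for a timestamp in microseconds from the epoch.
  static long hour(long tsMicros) {
    return Math.floorDiv(tsMicros, MICROS_PER_HOUR);
  }

  // Projects `ts = tsMicros` to an inclusive range {lower, upper} of
  // hour partition values that may contain matching rows.
  static long[] hoursForEquals(long tsMicros) {
    long h = hour(tsMicros);
    // Assumption: a truncating writer stored pre-1970 rows under h + 1,
    // so that value must be scanned as well when h is negative.
    long upper = h < 0 ? h + 1 : h;
    return new long[] {h, upper};
  }

  // Projects `ts <= tsMicros` to an inclusive upper bound on the hour
  // partition value, widened by 1 for pre-1970 literals.
  static long upperBoundForLessThanOrEqual(long tsMicros) {
    long h = hour(tsMicros);
    return h < 0 ? h + 1 : h;
  }

  public static void main(String[] args) {
    long[] range = hoursForEquals(-1_000_000L);
    System.out.println(range[0] + ".." + range[1]);                // -1..0
    System.out.println(upperBoundForLessThanOrEqual(-1_000_000L)); // 0
    System.out.println(upperBoundForLessThanOrEqual(1_000_000L));  // 0
  }
}
```

   Widening the range can only select extra partitions, never fewer, so under this sketch it is safe as long as the original predicate is still applied to the rows that are read.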




