You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "Belton Zhong (Jira)" <ji...@apache.org> on 2021/05/13 15:04:00 UTC

[jira] [Comment Edited] (AVRO-2904) timestamp-millis doesn't truncate when in a union with null

    [ https://issues.apache.org/jira/browse/AVRO-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343918#comment-17343918 ] 

Belton Zhong edited comment on AVRO-2904 at 5/13/21, 3:03 PM:
--------------------------------------------------------------

Hi,

Would it be possible to add a null check on the value paramater in the setter prior to the call to truncatedTo()? 


was (Author: beltonzhong):
Hi,

Would it be possible to add a null check to the setter prior to the call to truncatedTo()? 

> timestamp-millis doesn't truncate when in a union with null
> -----------------------------------------------------------
>
>                 Key: AVRO-2904
>                 URL: https://issues.apache.org/jira/browse/AVRO-2904
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java, logical types
>    Affects Versions: 1.10.0
>            Reporter: Dan Lipofsky
>            Priority: Major
>
> In Avro 1.10.0 the setter for logical-types {{timestamp-millis}} and {{time-millis}} will truncate the timestamp to millis in the simplest condition but not when it is in a union with null.
> It looks like something similar was addressed in AVRO-2360 but that was supposedly fixed in 1.9.0 and this is still broken in 1.10.0.
> Note: it would appear that in Java8 the {{Instant.now()}} is already truncated to millis so it usually isn't a problem, but in Java11 you get microsecond precision (at least on macOS 10.14.6 with recent hardware). However you can see the bug even in Java8 if you force the microseconds the way I did in my test.
> I uploaded by code to [https://github.com/dlipofsky/AVRO-2904]
> Here are my schemas
> {noformat}
> {
>   "type" : "record",
>   "name" : "TimestampTest1",
>   "fields" : [ {
>     "name" : "time",
>     "type" : [ "null", { "type" : "long", "logicalType" : "timestamp-millis" } ]
>   } ]
> }
> {noformat}
> {noformat}
> {
>   "type" : "record",
>   "name" : "TimestampTest2",
>   "fields" : [ {
>     "name" : "time",
>     "type" : { "type" : "long", "logicalType" : "timestamp-millis" }
>   } ]
> }
> {noformat}
> Here is my test, in which {{test2}} passes but {{test1}} fails.
> {noformat}
> public class Avro2904Test {
>     // If you just used Instant.now() test would randomly pass 1 times out of a 1000.
>     // If you use Instant.now().truncatedTo(ChronoUnit.MILLIS) it would always pass.
>     // By making sure micros is non-zero we ensure it always fails and demonstrates AVRO-2904.
>     private Instant now = Instant.now().truncatedTo(ChronoUnit.MILLIS).plus(999, ChronoUnit.MICROS);
>     @Test
>     public void test1() throws IOException {
>         final TimestampTest1 t = TimestampTest1.newBuilder().setTime(now).build();
>         byte[] bytes = serialize(t);
>         TimestampTest1 other = deserialize(t.getSchema(), bytes);
>         // fails because of AVRO-2904
>         assertEquals(t, other);
>     }
>     @Test
>     public void test2() throws IOException {
>         final TimestampTest2 t = TimestampTest2.newBuilder().setTime(Instant.now()).build();
>         byte[] bytes = serialize(t);
>         TimestampTest2 other = deserialize(t.getSchema(), bytes);
>         // if logicalType timestamp-millis in not in union with null it works fine
>         assertEquals(t, other);
>     }
>     private byte[] serialize(SpecificRecord record) throws IOException {
>         try (final ByteArrayOutputStream out = new ByteArrayOutputStream()) {
>             SpecificDatumWriter<SpecificRecord> writer = new SpecificDatumWriter<>(
>                 record.getSchema());
>             Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
>             writer.write(record, encoder);
>             encoder.flush();
>             final byte[] bytes = out.toByteArray();
>             return bytes;
>         }
>     }
>     private <T extends SpecificRecord> T deserialize(Schema schema, byte[] bytes)
>             throws IOException {
>         Decoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
>         SpecificDatumReader<T> reader = new SpecificDatumReader<>(schema);
>         return reader.read(null, decoder);
>     }
> }
> {noformat}
> {{test1}} fails with
> {noformat}
> java.lang.AssertionError: 
> Expected :{"time": 2020-07-30T23:14:08.294999Z}
> Actual   :{"time": 2020-07-30T23:14:08.294Z}
> {noformat}
> If I change my tests to use {{Instant.now().truncatedTo(ChronoUnit.MILLIS)}} they'll both pass, but it seems like that should not be necessary, particularly given the code below.
> Looking at the generated Java DTO, I see a few differences, but the most important would seem to be that the working DTO has
> {noformat}
> public void setTime(java.time.Instant value) {
>   this.time = value.truncatedTo(java.time.temporal.ChronoUnit.MILLIS);
> }
> {noformat}
> while the broken one has
> {noformat}
> public void setTime(java.time.Instant value) {
>   this.time = value;
> }
> {noformat}
> The {{Builder}} is similar (only the non-union version has {{truncatedTo}} in the setter).
> The working one also has this code, which is missing from the broken one:
> {noformat}
> private static final org.apache.avro.Conversion<?>[] conversions =
>     new org.apache.avro.Conversion<?>[] {
>     new org.apache.avro.data.TimeConversions.TimestampMillisConversion(),
>     null
> };
> @Override
> public org.apache.avro.Conversion<?> getConversion(int field) {
>   return conversions[field];
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)