Posted to issues@avro.apache.org by "Christophe Le Saec (Jira)" <ji...@apache.org> on 2022/09/23 05:52:00 UTC

[jira] [Assigned] (AVRO-3611) org.apache.avro.util.RandomData generates invalid test data

     [ https://issues.apache.org/jira/browse/AVRO-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christophe Le Saec reassigned AVRO-3611:
----------------------------------------

    Assignee: Christophe Le Saec

> org.apache.avro.util.RandomData generates invalid test data
> -----------------------------------------------------------
>
>                 Key: AVRO-3611
>                 URL: https://issues.apache.org/jira/browse/AVRO-3611
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.11.1
>            Reporter: Simon Klakegg
>            Assignee: Christophe Le Saec
>            Priority: Minor
>              Labels: features, pull-request-available
>             Fix For: 1.11.2
>
>         Attachments: image-2022-08-18-19-05-37-323.png
>
>   Original Estimate: 48h
>          Time Spent: 0.5h
>  Remaining Estimate: 47.5h
>
> When RandomData.java generates data it does not check for Logical Types, which are described here: [Specification | Apache Avro|https://avro.apache.org/docs/1.11.1/specification/_print/]
> *For instance the following generate method would return this for INT fields:*
> {code:java}
>     case INT:      return random.nextInt(); {code}
>  
> {*}However, an int field could be of logical type date:{*}!image-2022-08-18-19-05-37-323.png|width=1052,height=266!
>  
> In many cases this can produce an int that is out of range for the logical type date, which then breaks record creation in, for instance, Kafka.
> My suggestion is to generate data that is valid for logical types. Here is an example I made for int and long:
> {code:java}
> case INT:
>     switch (logicalTypeName) {
>       case "date":
>         // Random number of days between the Unix epoch (day 0) and the day a signed 32-bit seconds counter overflows (day 24855, exclusive)
>         int maxDaysInEpoch = (int) Duration.ofSeconds(Integer.MAX_VALUE).toDays();
>         return ThreadLocalRandom.current().nextInt(0, maxDaysInEpoch);
>       case "time-millis":
>         // Random number of milliseconds between midnight 00:00:00.000 (0) and 23:59:59.999 (86399999)
>         int millisecondsInDay = (int) Duration.ofDays(1).toMillis();
>         // nextInt(bound) excludes the bound, so the largest value returned is 86399999
>         return random.nextInt(millisecondsInDay);
>       default: return random.nextInt();
>     }
> case LONG:
>   switch (logicalTypeName) {
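>     case "time-micros":
>       // Random number of microseconds between midnight 00:00:00.000000 (0) and 23:59:59.999999 (86399999999)
>       long microsecondsInDay = TimeUnit.DAYS.toMicros(1);
>       // The upper bound is exclusive, so the largest value returned is 86399999999.
>       // ThreadLocalRandom is used because java.util.Random has no nextLong(origin, bound) before Java 17.
>       return ThreadLocalRandom.current().nextLong(0, microsecondsInDay);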
>     case "timestamp-millis":
>       // Random milliseconds between the Unix epoch (0) and Integer.MAX_VALUE seconds after it (2147483647000, exclusive)
>       long maxMillisecondsInEpoch = TimeUnit.SECONDS.toMillis(Integer.MAX_VALUE);
>       return ThreadLocalRandom.current().nextLong(0, maxMillisecondsInEpoch);
>     case "timestamp-micros":
>       // Random microseconds between the Unix epoch (0) and Integer.MAX_VALUE seconds after it (2147483647000000, exclusive)
>       long maxMicrosecondsInEpoch = TimeUnit.SECONDS.toMicros(Integer.MAX_VALUE);
>       return ThreadLocalRandom.current().nextLong(0, maxMicrosecondsInEpoch);
>     case "local-timestamp-millis": {
>       // Random number of milliseconds between the Unix epoch (0) and 100 years from now
>       ZonedDateTime hundredYearsFromNow = ZonedDateTime.now().plusYears(100);
>       long hundredYearsEpochMillis = ChronoUnit.MILLIS.between(Instant.EPOCH, hundredYearsFromNow);
>       return ThreadLocalRandom.current().nextLong(0, hundredYearsEpochMillis);
>     }
>     case "local-timestamp-micros": {
>       // Random number of microseconds between the Unix epoch (0) and 100 years from now.
>       // The end point is computed again here: a local declared under another case label
>       // is not definitely assigned when the switch jumps straight to this label.
>       ZonedDateTime hundredYearsFromNow = ZonedDateTime.now().plusYears(100);
>       long hundredYearsEpochMicros = ChronoUnit.MICROS.between(Instant.EPOCH, hundredYearsFromNow);
>       return ThreadLocalRandom.current().nextLong(0, hundredYearsEpochMicros);
>     }
>     default: return random.nextLong();
>   } {code}
>  
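
The range problem described in the issue can be checked with the JDK alone, without Avro on the classpath. The sketch below draws unrestricted ints, as the quoted INT branch does, and counts how many are not valid time-millis values (0 to 86399999). The class and method names are hypothetical and exist only for this illustration.

```java
import java.time.DateTimeException;
import java.time.LocalTime;
import java.util.Random;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Avro.
final class TimeMillisRangeCheck {

  /** Returns true if millis is a valid time of day, i.e. within 0..86399999. */
  static boolean isValidTimeMillis(int millis) {
    try {
      // LocalTime.ofNanoOfDay rejects values outside a single day
      LocalTime.ofNanoOfDay(TimeUnit.MILLISECONDS.toNanos(millis));
      return true;
    } catch (DateTimeException e) {
      return false;
    }
  }

  /** Counts how many of the sampled unrestricted ints are not valid time-millis values. */
  static int countInvalid(long seed, int samples) {
    Random random = new Random(seed);
    int invalid = 0;
    for (int i = 0; i < samples; i++) {
      // Same call as the INT branch quoted in the issue
      if (!isValidTimeMillis(random.nextInt())) {
        invalid++;
      }
    }
    return invalid;
  }

  public static void main(String[] args) {
    System.out.println(countInvalid(42L, 1000) + " of 1000 values are out of range");
  }
}
```

Only about 2% of all int values fall inside a single day of milliseconds, so nearly every sample is expected to be rejected.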

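One possible way to keep the proposed bounds while drawing every value from a single seeded java.util.Random is sketched below. It assumes that reproducibility from a seed matters to callers and that the code must compile on Java 8, where java.util.Random has no bounded nextLong. The class, its methods, and the nextLong helper are hypothetical; the upper bounds are the ones suggested in the issue, and this is not the change that was merged into Avro.

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and structure are assumptions, not Avro API.
final class BoundedLogicalTypeValues {
  private final Random random;

  BoundedLogicalTypeValues(long seed) {
    this.random = new Random(seed);
  }

  /** Approximately uniform long in [0, bound), using only Java 8 Random methods. */
  long nextLong(long bound) {
    // nextDouble() is in [0.0, 1.0); min() guards against rounding up to bound
    return Math.min(bound - 1, (long) (random.nextDouble() * bound));
  }

  /** date: days since the epoch, in [0, 24855). */
  int date() {
    return random.nextInt((int) TimeUnit.SECONDS.toDays(Integer.MAX_VALUE));
  }

  /** time-millis: milliseconds after midnight, in [0, 86400000). */
  int timeMillis() {
    return random.nextInt((int) TimeUnit.DAYS.toMillis(1));
  }

  /** time-micros: microseconds after midnight, in [0, 86400000000). */
  long timeMicros() {
    return nextLong(TimeUnit.DAYS.toMicros(1));
  }

  /** timestamp-millis: milliseconds since the epoch, in [0, 2147483647000). */
  long timestampMillis() {
    return nextLong(TimeUnit.SECONDS.toMillis(Integer.MAX_VALUE));
  }

  /** timestamp-micros: microseconds since the epoch, in [0, 2147483647000000). */
  long timestampMicros() {
    return nextLong(TimeUnit.SECONDS.toMicros(Integer.MAX_VALUE));
  }
}
```

Two instances built with the same seed return the same sequence of values, which the ThreadLocalRandom calls in the proposal cannot guarantee.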


--
This message was sent by Atlassian Jira
(v8.20.10#820010)