Posted to commits@hudi.apache.org by "Alexey Kudinkin (Jira)" <ji...@apache.org> on 2022/02/21 21:39:00 UTC

[jira] [Assigned] (HUDI-3469) Refactor HoodieTestDataGenerator to enable reproducible builds

     [ https://issues.apache.org/jira/browse/HUDI-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Kudinkin reassigned HUDI-3469:
-------------------------------------

    Assignee: Alexey Kudinkin

> Refactor HoodieTestDataGenerator to enable reproducible builds
> --------------------------------------------------------------
>
>                 Key: HUDI-3469
>                 URL: https://issues.apache.org/jira/browse/HUDI-3469
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Major
>
> Currently, `HoodieTestDataGenerator` relies on static state that is shared across all of the tests, which makes data generation dependent on the order of test execution.
>  
> Instead, we should properly abstract `HoodieTestDataGenerator` to hold all of its state within individual instances, so that:
>  # Each test can create its own isolated instance (which won't be affected by other tests)
>  # Each instance accepts a "seed" value for its PRNG, so that it always produces the same random sequence (for a given seed)
>  # All of the operations within it rely only on that internal PRNG and not on any external sources (such as `UUID.randomUUID()`, `System.currentTimeMillis()`, etc.)
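
The three points above could be sketched roughly as follows. This is only an illustration of the idea, not the actual `HoodieTestDataGenerator` API: the class name `SeededRecordGenerator` and its methods `nextRecordKey()` and `nextTimestamp()` are hypothetical, and the real refactoring may look quite different.

```java
import java.util.Random;
import java.util.UUID;

// Hypothetical sketch; names and methods are assumptions, not Hudi's API.
final class SeededRecordGenerator {
    // All state lives in the instance; there are no static fields.
    private final Random random;
    // Deterministic stand-in for System.currentTimeMillis().
    private long logicalTimestamp = 0L;

    SeededRecordGenerator(long seed) {
        // The same seed always yields the same pseudo-random sequence.
        this.random = new Random(seed);
    }

    // Builds a UUID-formatted key from the internal PRNG
    // instead of calling UUID.randomUUID().
    String nextRecordKey() {
        return new UUID(random.nextLong(), random.nextLong()).toString();
    }

    // Advances a logical clock by a PRNG-derived step of 1 to 1000,
    // so timestamps are strictly increasing and reproducible.
    long nextTimestamp() {
        logicalTimestamp += 1 + random.nextInt(1000);
        return logicalTimestamp;
    }
}
```

With something like this, two tests that each construct `new SeededRecordGenerator(42L)` would see identical keys and timestamps regardless of the order in which the tests run, since neither instance shares state with the other.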



--
This message was sent by Atlassian Jira
(v8.20.1#820001)