Posted to users@hudi.apache.org by leesf <le...@gmail.com> on 2021/02/28 15:34:00 UTC

[ANNOUNCE] Hudi Community Update(2021-01-31 ~ 2021-02-28)

Dear community,

I am pleased to share the Hudi community updates for 2021-01-31 ~ 2021-02-28,
covering features, bug fixes, and tests.

=======================================
Features

[Core] Improve minKey/maxKey computation in HoodieHFileWriter [1]
[Flink] Introduce FlinkHoodieSimpleIndex to hudi-flink-client [2]
[Flink Integration] InstantGenerateOperator supports multiple parallelism [3]
[Flink Integration] Introduce FlinkHoodieBloomIndex to hudi-flink-client [4]
[CLI] Adding commit_show_records_info to display record sizes for commit [5]
[Flink Integration] Make Flink write pipeline write task scalable [6]
[Spark Integration] Translate the api partitionBy in spark datasource to
hoodie.datasource.write.partitionpath.field [7]
[Flink Integration] Write as minor batches during one checkpoint interval
for the new writer [8]
[Spark Integration] Support Spark Structured Streaming read from Hudi table
[9]
[Flink Integration] Get the parallelism from context when initializing
StreamWriteOperatorCoordinator [10]
[Core] Schedule compaction based on time elapsed [11]
[Metaclient] Adding builder for HoodieTableMetaClient initialization [12]
[Core] Remove inline inflight rollback in hoodie writer [13]
[Flink Integration] Reduce the coupling of hadoop [14]
[Flink Integration] The state based index should bootstrap from existing
base files [15]
[Java Client] Support copyOnWriteTable in java client [16]
[Flink Integration] Avoid renaming for bucket update when there is only
one flush action during a checkpoint [17]
[Flink Integration] Some improvements to BucketAssignFunction [18]
[DeltaStreamer] Make deltaStreamer transition from dfsSource to kafkaSource
[19]
[Hive Integration] Make it configurable whether a Hive connection failure
affects the Hudi ingest process [20]
[Metadata Table] Added a configuration to allow specific directories to be
filtered out during Metadata Table bootstrap [21]


[1] https://issues.apache.org/jira/browse/HUDI-1519
[2] https://issues.apache.org/jira/browse/HUDI-1335
[3] https://issues.apache.org/jira/browse/HUDI-1511
[4] https://issues.apache.org/jira/browse/HUDI-1332
[5] https://issues.apache.org/jira/browse/HUDI-1571
[6] https://issues.apache.org/jira/browse/HUDI-1557
[7] https://issues.apache.org/jira/browse/HUDI-1526
[8] https://issues.apache.org/jira/browse/HUDI-1598
[9] https://issues.apache.org/jira/browse/HUDI-1109
[10] https://issues.apache.org/jira/browse/HUDI-1621
[11] https://issues.apache.org/jira/browse/HUDI-1381
[12] https://issues.apache.org/jira/browse/HUDI-1315
[13] https://issues.apache.org/jira/browse/HUDI-1486
[14] https://issues.apache.org/jira/browse/HUDI-1586
[15] https://issues.apache.org/jira/browse/HUDI-1624
[16] https://issues.apache.org/jira/browse/HUDI-1477
[17] https://issues.apache.org/jira/browse/HUDI-1637
[18] https://issues.apache.org/jira/browse/HUDI-1638
[19] https://issues.apache.org/jira/browse/HUDI-1367
[20] https://issues.apache.org/jira/browse/HUDI-1269
[21] https://issues.apache.org/jira/browse/HUDI-1611

=======================================
Bugs

[Core] Honor ordering field for MOR Spark datasource reader [1]
[Core] Call mkdir(partition) only if not exists [2]
[Core] Try to init class trying different signatures instead of checking
its name [3]
[Core] HoodieTableMetaClient.getMarkerFolderPath works incorrectly on
Windows client with HDFS server due to wrong file separator [4]
[Core] Fix Rollback Metadata AVRO backwards incompatibility [5]
[Core] Fix DefaultHoodieRecordPayload serialization failure [6]
[Hive Integration] Throw an exception when syncHoodieTable() fails, with
RuntimeException [7]
[Core] Fix bug in HoodieCombineRealtimeRecordReader with reading empty
iterators [8]
[HBase Index] Fix HBase index to make rollback synchronous (via config) [9]


[1] https://issues.apache.org/jira/browse/HUDI-1550
[2] https://issues.apache.org/jira/browse/HUDI-1523
[3] https://issues.apache.org/jira/browse/HUDI-1538
[4] https://issues.apache.org/jira/browse/HUDI-1420
[5] https://issues.apache.org/jira/browse/HUDI-1589
[6] https://issues.apache.org/jira/browse/HUDI-1603
[7] https://issues.apache.org/jira/browse/HUDI-1582
[8] https://issues.apache.org/jira/browse/HUDI-1539
[9] https://issues.apache.org/jira/browse/HUDI-1347


=======================================
Tests

[Tests] CI intermittent failure: TestJsonStringToHoodieRecordMapFunction [1]
[Tests] Add test cases for INSERT_OVERWRITE Operation [2]
[Tests] Fix write test flakiness in StreamWriteITCase [3]
[CI] Add azure pipelines configs [4]


[1] https://issues.apache.org/jira/browse/HUDI-1547
[2] https://issues.apache.org/jira/browse/HUDI-1545
[3] https://issues.apache.org/jira/browse/HUDI-1612
[4] https://issues.apache.org/jira/browse/HUDI-1620

Best,
Leesf