Posted to users@hudi.apache.org by leesf <le...@gmail.com> on 2021/05/23 15:27:00 UTC

[ANNOUNCE] Hudi Community Update(2021-05-09 ~ 2021-05-23)

Dear community,

Happy to share the Hudi community bi-weekly update for 2021-05-09 ~ 2021-05-22,
covering features, bug fixes, and tests.


=======================================
Features

[Flink Integration] Avoid generating corrupted files for the Flink sink [1]
[Core] Support reading older snapshots [2]
[Flink Integration] Global index for the Flink writer [3]
[Flink Integration] Reuse the partition path and file group id for the Flink
write data buffer [4]


[1] https://issues.apache.org/jira/browse/HUDI-1886
[2] https://issues.apache.org/jira/browse/HUDI-1789
[3] https://issues.apache.org/jira/browse/HUDI-1902
[4] https://issues.apache.org/jira/browse/HUDI-1911


=======================================
Bugs

[Core] Reduce the log level of overly verbose messages from INFO to DEBUG
[1]
[Flink Integration] FlinkCreateHandle and FlinkAppendHandle canWrite should
always return true [2]
[Flink Integration] Validate required fields for Flink HoodieTable [3]
[Flink Integration] Close file handles gracefully in the Flink write
function to avoid corrupted files [4]
[Spark Integration] Fix NPE when querying a specified field on a MOR table
via Hive Beeline/Spark SQL [5]
[Flink Integration] Always close the file handle for a Flink mini-batch
write [6]
[Flink Integration] Support skipping the bootstrap index's init in the
abstract fs view init [7]
[Flink Integration] Clean the corrupted files generated by
FlinkMergeAndReplaceHandle [8]
[Hive Integration] Honor skipROSuffix in the Spark datasource [9]
[Core] Use streams instead of loops for input/output [10]
[Flink Integration] Fix the file id for the write data buffer before
flushing [11]
[Flink Integration] Fix the Hive conf for the Flink writer's Hive meta sync [12]
[Hive Integration] Fix incorrect partition field in incremental queries of
a MOR table under Hive on Spark/MR [13]
[Flink Integration] Remove the metadata sync logic in
HoodieFlinkWriteClient#preWrite because it is not thread-safe [14]
[Core] Fix NPE when the nested partition path field has a null value [15]
[Flink Integration] Fix an incorrect keyBy field that caused serious data
skew, preventing multiple subtasks from writing to the same partition at
the same time [16]
[Core] Fix insert-overwrite API archival [17]


[1] https://issues.apache.org/jira/browse/HUDI-1707
[2] https://issues.apache.org/jira/browse/HUDI-1890
[3] https://issues.apache.org/jira/browse/HUDI-1818
[4] https://issues.apache.org/jira/browse/HUDI-1895
[5] https://issues.apache.org/jira/browse/HUDI-1722
[6] https://issues.apache.org/jira/browse/HUDI-1900
[7] https://issues.apache.org/jira/browse/HUDI-1446
[8] https://issues.apache.org/jira/browse/HUDI-1876
[9] https://issues.apache.org/jira/browse/HUDI-1806
[10] https://issues.apache.org/jira/browse/HUDI-1913
[11] https://issues.apache.org/jira/browse/HUDI-1915
[12] https://issues.apache.org/jira/browse/HUDI-1871
[13] https://issues.apache.org/jira/browse/HUDI-1719
[14] https://issues.apache.org/jira/browse/HUDI-1917
[15] https://issues.apache.org/jira/browse/HUDI-1888
[16] https://issues.apache.org/jira/browse/HUDI-1918
[17] https://issues.apache.org/jira/browse/HUDI-1740

=======================================
Tests

[Tests] Add long-running automated test suite scripts for Docker [1]
[Tests] Remove hardcoded parquet in tests [2]
[Tests] add spark datasource unit test for schema validate add column [3]

[1] https://issues.apache.org/jira/browse/HUDI-1851
[2] https://issues.apache.org/jira/browse/HUDI-1055
[3] https://issues.apache.org/jira/browse/HUDI-1768


Best,
Leesf