Posted to dev@hudi.apache.org by leesf <le...@gmail.com> on 2021/11/21 15:16:00 UTC

[ANNOUNCE] Hudi Community Update (2021-11-07 ~ 2021-11-21)

Dear community,

I am pleased to share the Hudi community bi-weekly update for
2021-11-07 ~ 2021-11-21, covering features, bug fixes and tests.


=======================================
Features

[Spark SQL] Add ORC support in Bootstrap Op [1]
[Core] Support records staying in same fileId after clustering [2]
[Flink Integration] Support scheduling online compaction plan when there
is no commit data [3]
[Core] Add support for DynamoDb based lock provider [4]
[Core] InLineFS support for S3FS logs [5]
[Spark] Add external config file support [6]
[Spark SQL] Virtual keys support for metadata table [7]
[Core] Add S3 object filter to support multiple S3EventsHoodieIncrSources
on a single S3 meta table [8]
[Core] Add mechanism to safely update, delete and recover table properties
[9]


[1] https://issues.apache.org/jira/browse/HUDI-1827
[2] https://issues.apache.org/jira/browse/HUDI-1877
[3] https://issues.apache.org/jira/browse/HUDI-2685
[4] https://issues.apache.org/jira/browse/HUDI-2314
[5] https://issues.apache.org/jira/browse/HUDI-2716
[6] https://issues.apache.org/jira/browse/HUDI-2362
[7] https://issues.apache.org/jira/browse/HUDI-2593
[8] https://issues.apache.org/jira/browse/HUDI-2472
[9] https://issues.apache.org/jira/browse/HUDI-2795

=======================================
Bugs

[Flink] Set up keygen class explicitly in the write config for Flink table
upgrade [1]
[Core] Fix NPE on select count(*) from a realtime table with Tez
[2]
[Flink] Add more options when initializing table [3]
[Flink] Remove the table source options validation [4]
[Flink] Fixing metadata table updates such that only regular writes from
data table can trigger table services in metadata table [5]
[Core] The BitCaskDiskMap iterator may cause memory leak [6]
[Core] Bootstrap metadata table only if upgrade / downgrade is not
required. [7]
[Deltastreamer] Make deltastreamer checkpoint state merging more explicit
[8]
[Core] Estimate available memory size for spillable map accurately [9]
[Hive Integration] Redo the logic of mor_incremental_view for Hive [10]
[Core] Change default values for certain clustering configs [11]
[Core] Move EventTimeAvroPayload into hudi-common module [12]
[Core] Improved the metadata table bootstrap for very large tables [13]
[Core] Resolve inconsistent key generation for timestamp types by
GenericRecord and Row [14]
[Flink Integration] Remove the bucketAssignFunction useless context [15]
[Flink Integration] Do not bootstrap for Flink insert overwrite [16]
[Core] Part1 Setting default parallelism to 200 for some of the write
configs [17]
[Core] ExternalSpillableMap payload size re-estimation throws
ArithmeticException [18]
[Core] Fixing instantiating metadata table config in HoodieFileIndex [19]
[Flink Integration] Fix Flink parquet writer decimal type conversion [20]
[Spark SQL] Refactor spark-sql to make it consistent with the DataFrame
API [21]
[Core] Fix parsing of metadata table compaction timestamp when metrics
are enabled [22]
[Core] Parallelize deleting archived hoodie commits [23]
[Core] Fixing a bug with rollback of partially failed commit which has new
partitions [24]
[Flink Integration] Fix StreamerUtil#medianInstantTime for very near
instant time [25]
[Core] Ensure list based rollback strategy is used for restore [26]
[Core] Part3 Enabling marker based rollback as default rollback strategy
[27]
[Core] Setting default metadata enable as false for Java [28]
[Flink Integration] Flink batch upsert for non partitioned table does not
work [29]
[Flink Integration] Fix the changelog mode of HoodieTableSource [30]
[Core] Avoid deleting all inflight commits heartbeats while rolling back
failed writes [31]
[Core] Allow duplicate files for metadata commit [32]
[Flink Integration] Fix flink query operation fields [33]
[Core] Make clustering work regardless of whether there are base files [34]
[Core] Metadata table support for Restore action to first commit [35]
[Core] Add configuration inference logic for a few options [36]
[Flink Integration] Add option to skip compaction instants for streaming
read [37]
[Flink Integration] Make Flink parquet reader compatible with decimal
BINARY encoding [38]
[Hive Integration] Update Hive sync timestamp when change detected [39]


[1] https://issues.apache.org/jira/browse/HUDI-2702
[2] https://issues.apache.org/jira/browse/HUDI-313
[3] https://issues.apache.org/jira/browse/HUDI-2709
[4] https://issues.apache.org/jira/browse/HUDI-2698
[5] https://issues.apache.org/jira/browse/HUDI-2595
[6] https://issues.apache.org/jira/browse/HUDI-2715
[7] https://issues.apache.org/jira/browse/HUDI-2591
[8] https://issues.apache.org/jira/browse/HUDI-2579
[9] https://issues.apache.org/jira/browse/HUDI-2297
[10] https://issues.apache.org/jira/browse/HUDI-2086
[11] https://issues.apache.org/jira/browse/HUDI-2442
[12] https://issues.apache.org/jira/browse/HUDI-2730
[13] https://issues.apache.org/jira/browse/HUDI-2634
[14] https://issues.apache.org/jira/browse/HUDI-2495
[15] https://issues.apache.org/jira/browse/HUDI-2738
[16] https://issues.apache.org/jira/browse/HUDI-2746
[17] https://issues.apache.org/jira/browse/HUDI-2151
[18] https://issues.apache.org/jira/browse/HUDI-2718
[19] https://issues.apache.org/jira/browse/HUDI-2741
[20] https://issues.apache.org/jira/browse/HUDI-2756
[21] https://issues.apache.org/jira/browse/HUDI-2706
[22] https://issues.apache.org/jira/browse/HUDI-2744
[23] https://issues.apache.org/jira/browse/HUDI-2683
[24] https://issues.apache.org/jira/browse/HUDI-2712
[25] https://issues.apache.org/jira/browse/HUDI-2769
[26] https://issues.apache.org/jira/browse/HUDI-2753
[27] https://issues.apache.org/jira/browse/HUDI-2151
[28] https://issues.apache.org/jira/browse/HUDI-2734
[29] https://issues.apache.org/jira/browse/HUDI-2789
[30] https://issues.apache.org/jira/browse/HUDI-2790
[31] https://issues.apache.org/jira/browse/HUDI-2641
[32] https://issues.apache.org/jira/browse/HUDI-2791
[33] https://issues.apache.org/jira/browse/HUDI-2798
[34] https://issues.apache.org/jira/browse/HUDI-2731
[35] https://issues.apache.org/jira/browse/HUDI-2796
[36] https://issues.apache.org/jira/browse/HUDI-2242
[37] https://issues.apache.org/jira/browse/HUDI-2804
[38] https://issues.apache.org/jira/browse/HUDI-2392
[39] https://issues.apache.org/jira/browse/HUDI-1932


======================================
Tests

[Tests] Enabling metadata table in TestHoodieIndex and
TestMergeOnReadRollbackActionExecutor [1]
[Tests] Enabling metadata table for TestHoodieMergeOnReadTable and
TestHoodieCompactor [2]



[1] https://issues.apache.org/jira/browse/HUDI-2472
[2] https://issues.apache.org/jira/browse/HUDI-2472




Best,
Leesf