Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2022/03/15 20:37:00 UTC

[jira] [Created] (HUDI-3637) Check file listing from FS vs metadata table when compaction in pending and inflight

Ethan Guo created HUDI-3637:
-------------------------------

             Summary: Check file listing from FS vs metadata table when compaction in pending and inflight
                 Key: HUDI-3637
                 URL: https://issues.apache.org/jira/browse/HUDI-3637
             Project: Apache Hudi
          Issue Type: Task
            Reporter: Ethan Guo


HoodieMetadataTableValidator validation of the latest base files and file slices fails with the output below.  The failure may be caused by the inflight compaction.  Need to investigate whether this affects the file listing used by write operations.  After some further instants the validation passes, which suggests the metadata table (MT) itself ends up correct, but the file listing view may have a bug.
{code:java}
file slices from metadata: [FileSlice {fileGroupId=HoodieFileGroupId{partitionPath='2022/1/28', fileId='769bf7ac-d6d0-452c-bf54-bbe7e8381766-0'}, baseCommitTime=20220314001058266, baseFile='HoodieBaseFile{fullPath=file:/Users/ethan/Work/scripts/mt_rollout_testing/deploy_c_multi_writer/c2_mor_010nomt_011mt/test_table/2022/1/28/769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_2-47-485_20220314001058266.parquet, fileLen=106839698, BootstrapBaseFile=null}', logFiles='[]'}]
file slices from file system and base files: [FileSlice {fileGroupId=HoodieFileGroupId{partitionPath='2022/1/28', fileId='769bf7ac-d6d0-452c-bf54-bbe7e8381766-0'}, baseCommitTime=20220314001058266, baseFile='HoodieBaseFile{fullPath=file:/Users/ethan/Work/scripts/mt_rollout_testing/deploy_c_multi_writer/c2_mor_010nomt_011mt/test_table/2022/1/28/769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_2-47-485_20220314001058266.parquet, fileLen=106839698, BootstrapBaseFile=null}', logFiles='[HoodieLogFile{pathStr='file:/Users/ethan/Work/scripts/mt_rollout_testing/deploy_c_multi_writer/c2_mor_010nomt_011mt/test_table/2022/1/28/.769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_20220314001058266.log.1_2-111-954', fileLen=51607682}]'}]
22/03/14 00:33:03 ERROR HoodieMetadataTableValidator: Metadata table validation failed for 2022/1/28 due to HoodieValidationException {code}
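
In the output above, the two file slices agree on file group and base instant and differ only in their log files: the file-system listing contains one log file that the metadata listing does not. A minimal sketch of that comparison follows. The `Slice` class and both helper methods are hypothetical stand-ins for illustration, not Hudi's `FileSlice` API and not the validator's actual logic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: these types are hypothetical stand-ins,
// not Hudi's FileSlice or HoodieLogFile classes.
public class FileSliceDiff {

    /** Minimal stand-in for a file slice: a base instant plus zero or more log files. */
    static final class Slice {
        final String fileId;
        final String baseInstant;
        final List<String> logFiles;

        Slice(String fileId, String baseInstant, List<String> logFiles) {
            this.fileId = fileId;
            this.baseInstant = baseInstant;
            this.logFiles = logFiles;
        }
    }

    /** Returns log files present in the file-system slice but absent from the metadata slice. */
    static List<String> logFilesMissingFromMetadata(Slice fromMetadata, Slice fromFileSystem) {
        Set<String> known = new HashSet<>(fromMetadata.logFiles);
        List<String> missing = new ArrayList<>();
        for (String logFile : fromFileSystem.logFiles) {
            if (!known.contains(logFile)) {
                missing.add(logFile);
            }
        }
        return missing;
    }

    /** True when the slices agree on file group and base instant but list different log files. */
    static boolean differsOnlyInLogFiles(Slice a, Slice b) {
        return a.fileId.equals(b.fileId)
            && a.baseInstant.equals(b.baseInstant)
            && !new HashSet<>(a.logFiles).equals(new HashSet<>(b.logFiles));
    }

    public static void main(String[] args) {
        // Values copied from the validator output in this issue.
        String fileId = "769bf7ac-d6d0-452c-bf54-bbe7e8381766-0";
        String instant = "20220314001058266";
        String logFile = ".769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_20220314001058266.log.1_2-111-954";

        Slice fromMetadata = new Slice(fileId, instant, List.of());
        Slice fromFileSystem = new Slice(fileId, instant, List.of(logFile));

        System.out.println("Differs only in log files: "
            + differsOnlyInLogFiles(fromMetadata, fromFileSystem));
        System.out.println("Missing from metadata listing: "
            + logFilesMissingFromMetadata(fromMetadata, fromFileSystem));
    }
}
```

A check of this shape only isolates which log files account for the mismatch; whether the extra log file is expected while a compaction is pending or inflight is the open question of this issue.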
Compaction:
{code:java}
Partition Path │ FileId                                 │ Base-Instant      │ Data File Path                                                            │ Total Delta Files │ getMetrics                                                                                                                  ║
╠══
 2022/1/28      │ 769bf7ac-d6d0-452c-bf54-bbe7e8381766-0 │ 20220314001058266 │ 769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_2-47-485_20220314001058266.parquet │ 1                 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=151.0, TOTAL_LOG_FILES_SIZE=5.1607682E7, TOTAL_IO_WRITE_MB=101.0, TOTAL_IO_MB=252.0} ║ {code}


