Posted to commits@hudi.apache.org by "wqwl611 (via GitHub)" <gi...@apache.org> on 2023/02/15 15:41:35 UTC

[GitHub] [hudi] wqwl611 opened a new issue, #7969: [SUPPORT] data loss in new base file.

wqwl611 opened a new issue, #7969:
URL: https://github.com/apache/hudi/issues/7969

   **Describe the problem you faced**
   I found some data loss in the new base file: [00000000-9e95-4471-bba0-5604a282aa34-0_0-12-4_20230208003459996.parquet].
   I suspect that the compaction plan missed some delta log files.
   How can I inspect the archived compaction plan?
   <img width="1273" alt="image" src="https://user-images.githubusercontent.com/67826098/219074761-6150bcf1-89f5-4333-8eea-960105c07f94.png">
   
   **Environment Description**
   
   * Hudi version :
   
   * Spark version : 3.2.0
   
   * Hive version :
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) : hdfs
   
   * Running on Docker? (yes/no) :no
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] yihua commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1435410982

   @wqwl611 if you cannot find `.00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.6_0-5545-224979` in any commit metadata, it's likely that the log block(s) in this log file were written by a failed commit, so the file is not included in the compaction plan, which is expected.
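
When searching commit metadata for a log file, it can help to split the file name into its parts first. The following is a minimal sketch, assuming the layout `.<fileId>_<baseInstantTime>.log.<version>_<writeToken>`; the regular expression and the helper name `parse_log_file_name` are illustrative and not part of Hudi.

```python
import re
from typing import NamedTuple

# Assumed layout: .<fileId>_<baseInstantTime>.log.<version>_<writeToken>
LOG_FILE_RE = re.compile(
    r"^\.(?P<file_id>.+)_(?P<base_instant>\d+)\.log\."
    r"(?P<version>\d+)_(?P<write_token>.+)$"
)


class LogFileName(NamedTuple):
    file_id: str
    base_instant: str
    version: int
    write_token: str


def parse_log_file_name(name: str) -> LogFileName:
    """Split a log file name into its components (hypothetical helper)."""
    match = LOG_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised log file name: {name!r}")
    return LogFileName(
        file_id=match["file_id"],
        base_instant=match["base_instant"],
        version=int(match["version"]),
        write_token=match["write_token"],
    )


# The log file discussed in this thread.
parsed = parse_log_file_name(
    ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.6_0-5545-224979"
)
print(parsed)
```

The `file_id` and `base_instant` fields identify the file slice, so they are the values to look for in the compaction plan; `version` and `write_token` distinguish individual log files within that slice.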




[GitHub] [hudi] wqwl611 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "wqwl611 (via GitHub)" <gi...@apache.org>.
wqwl611 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1432426502

   @yihua I can find all the other delta files in the archive file except this log: .00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.6_0-5545-224979
   
   




[GitHub] [hudi] yihua commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1432517979

   Hi @wqwl611 could you check the Hudi timeline and see which commits wrote the log files with the prefix `.00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703`?  and how's the sequence of the commits in the Hudi timeline?  That'll help us understand the sequence of events and see if there's any bug in MOR snapshot read or compaction.




[GitHub] [hudi] danny0405 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1436223752

   Hi @wqwl611, I guess you encountered the bug introduced in release 0.11.x: https://github.com/apache/hudi/pull/6179




[GitHub] [hudi] wqwl611 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "wqwl611 (via GitHub)" <gi...@apache.org>.
wqwl611 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1432566979

   
   
   > Hi @wqwl611 could you check the Hudi timeline and see which commits wrote the log files with the prefix `.00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703`? and how's the sequence of the commits in the Hudi timeline? That'll help us understand the sequence of events and see if there's any bug in MOR snapshot read or compaction.
   
   @yihua I checked the active timeline and could not find the commit. It should have been archived, but I can't find this delta log in the archived file either.
   
   




[GitHub] [hudi] wqwl611 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "wqwl611 (via GitHub)" <gi...@apache.org>.
wqwl611 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1432521810

   @yihua I am in China, please use WeChat.
   <img width="294" alt="image" src="https://user-images.githubusercontent.com/67826098/219274163-77ecbe2a-ce70-4683-9f39-1773f3558439.png">
   




[GitHub] [hudi] yihua commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1432519140

   @wqwl611 do you use Slack or Wechat?  Feel free to DM me for faster iteration.




[GitHub] [hudi] lokeshj1703 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "lokeshj1703 (via GitHub)" <gi...@apache.org>.
lokeshj1703 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1564265015

   @wqwl611 Can we use the hudi-cli command `show archived commit stats` to understand how many log files were actually written in the delta commits? We can then compare it with the number of log files compacted to understand if a log file was missed during compaction.
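
The comparison suggested above can be sketched as a simple set difference. This is an illustration only: `planned` is copied from the compaction plan quoted elsewhere in this thread, while `written` is a hypothetical list that would have to be assembled by hand from the CLI output or a storage listing. The function name is made up for this sketch.

```python
from typing import Iterable, List


def log_files_missing_from_plan(written: Iterable[str],
                                planned: Iterable[str]) -> List[str]:
    """Return log files that were written but are absent from the plan."""
    return sorted(set(written) - set(planned))


PREFIX = ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log."

# Delta files listed in the compaction plan for this file group.
planned = [PREFIX + suffix for suffix in (
    "1_0-26568-763035",
    "1_0-26748-768915",
    "2_0-26568-763035",
    "3_0-26748-768915",
    "4_0-9952-310687",
    "5_1-0-1",
)]

# Hypothetical input: everything found for the file group, including the
# log file that the reporter could not find in any metadata.
written = planned + [PREFIX + "6_0-5545-224979"]

missing = log_files_missing_from_plan(written, planned)
print(f"written={len(written)} planned={len(planned)} missing={missing}")
```

A non-empty result only says that a file was not scheduled for compaction. Whether that is a bug still depends on whether the file belongs to a successful delta commit.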




[GitHub] [hudi] yihua commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1431873843

   Hi @wqwl611 thanks for raising this.  Could you clarify what kind of data loss you observe (missing records, updates not applied, missing columns, etc.)?  Also, were there any failures or other table services running before the compaction happened, based on the Hudi timeline?
   
   To inspect the compaction plan, you may use [Hudi CLI compaction commands](https://hudi.apache.org/docs/cli#compactions) or directly check the requested compaction instant under `.hoodie/` (`avrocat <instant_time>.compaction.requested`).  If the compaction commit is archived, for now you can only look at the archived file to understand the compaction plan.




[GitHub] [hudi] wqwl611 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "wqwl611 (via GitHub)" <gi...@apache.org>.
wqwl611 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1434122121

   The following is the related compaction metadata in the archive file. It doesn't contain the delta log [.00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.6_0-5545-224979], and I can't find that log in any other metadata either.
   @yihua 
   
   {"hoodieCommitMetadata": null, "hoodieCleanMetadata": null, "hoodieCompactionMetadata": null, "hoodieRollbackMetadata": null, "hoodieSavePointMetadata": null, "commitTime": "20230208003459996", "actionType": "compaction", "version": null, "hoodieCompactionPlan": {"operations": [{"baseInstantTime": "20230208001523703", "deltaFilePaths": [".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.1_0-26568-763035", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.1_0-26748-768915", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.2_0-26568-763035", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.3_0-26748-768915", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.4_0-9952-310687", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.5_1-0-1"], "dataFilePath": "00000000-9e95-4471-bba0-5604a282aa34-0_1-8918-275143_20230208001523703.parquet", "fileId": "00000000-9e95-4471-bba0-5604a282aa34-0", "partitionPath": "_db=cfdmscoredb/_tbl=debt_detail_snap_202302/_id_mod=12", "metrics": {"TOTAL_LOG_FILES": 6.0, "TOTAL_IO_READ_MB": 5930.0, "TOTAL_LOG_FILES_SIZE": 6.149091603E9, "TOTAL_IO_WRITE_MB": 66.0, "TOTAL_IO_MB": 5996.0}, "bootstrapFilePath": null}, {"baseInstantTime": "20230208000815127", "deltaFilePaths": [".00000000-9e25-4c39-b05c-657a82bbcf63-0_20230208000815127.log.1_0-8899-274704", ".00000000-9e25-4c39-b05c-657a82bbcf63-0_20230208000815127.log.1_0-8918-275142"], "dataFilePath": "00000000-9e25-4c39-b05c-657a82bbcf63-0_0-3-3_20230208000815127.parquet", "fileId": "00000000-9e25-4c39-b05c-657a82bbcf63-0", "partitionPath": "_db=cfdmscoredb/_tbl=debt_detail_snap_202302/_id_mod=11", "metrics": {"TOTAL_LOG_FILES": 2.0, "TOTAL_IO_READ_MB": 2371.0, "TOTAL_LOG_FILES_SIZE": 2.045651049E9, "TOTAL_IO_WRITE_MB": 420.0, "TOTAL_IO_MB": 2791.0}, "bootstrapFilePath": null}], "extraMetadata": {}, "version": 2}, "hoodieCleanerPlan": null, "actionState": "REQUESTED", "hoodieReplaceCommitMetadata": null, "hoodieRequestedReplaceMetadata": null, "HoodieInflightReplaceMetadata": null, "hoodieIndexCommitMetadata": null}
   {"hoodieCommitMetadata": null, "hoodieCleanMetadata": null, "hoodieCompactionMetadata": null, "hoodieRollbackMetadata": null, "hoodieSavePointMetadata": null, "commitTime": "20230208003459996", "actionType": "compaction", "version": null, "hoodieCompactionPlan": {"operations": [{"baseInstantTime": "20230208001523703", "deltaFilePaths": [".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.1_0-26568-763035", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.1_0-26748-768915", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.2_0-26568-763035", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.3_0-26748-768915", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.4_0-9952-310687", ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.5_1-0-1"], "dataFilePath": "00000000-9e95-4471-bba0-5604a282aa34-0_1-8918-275143_20230208001523703.parquet", "fileId": "00000000-9e95-4471-bba0-5604a282aa34-0", "partitionPath": "_db=cfdmscoredb/_tbl=debt_detail_snap_202302/_id_mod=12", "metrics": {"TOTAL_LOG_FILES": 6.0, "TOTAL_IO_READ_MB": 5930.0, "TOTAL_LOG_FILES_SIZE": 6.149091603E9, "TOTAL_IO_WRITE_MB": 66.0, "TOTAL_IO_MB": 5996.0}, "bootstrapFilePath": null}, {"baseInstantTime": "20230208000815127", "deltaFilePaths": [".00000000-9e25-4c39-b05c-657a82bbcf63-0_20230208000815127.log.1_0-8899-274704", ".00000000-9e25-4c39-b05c-657a82bbcf63-0_20230208000815127.log.1_0-8918-275142"], "dataFilePath": "00000000-9e25-4c39-b05c-657a82bbcf63-0_0-3-3_20230208000815127.parquet", "fileId": "00000000-9e25-4c39-b05c-657a82bbcf63-0", "partitionPath": "_db=cfdmscoredb/_tbl=debt_detail_snap_202302/_id_mod=11", "metrics": {"TOTAL_LOG_FILES": 2.0, "TOTAL_IO_READ_MB": 2371.0, "TOTAL_LOG_FILES_SIZE": 2.045651049E9, "TOTAL_IO_WRITE_MB": 420.0, "TOTAL_IO_MB": 2791.0}, "bootstrapFilePath": null}], "extraMetadata": {}, "version": 2}, "hoodieCleanerPlan": null, "actionState": "INFLIGHT", "hoodieReplaceCommitMetadata": null, "hoodieRequestedReplaceMetadata": null, "HoodieInflightReplaceMetadata": null, "hoodieIndexCommitMetadata": null}
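
A record dumped like the ones above can be checked programmatically rather than by eye. The sketch below assumes the record is available as JSON text with the field names shown in the dump; `ARCHIVED_RECORD` is a trimmed copy of the first operation only, and the helper names are illustrative.

```python
import json

# Trimmed copy of the REQUESTED record quoted above: first operation only,
# with fields that are irrelevant to this check omitted.
ARCHIVED_RECORD = json.loads("""
{
  "commitTime": "20230208003459996",
  "actionType": "compaction",
  "hoodieCompactionPlan": {
    "operations": [
      {
        "baseInstantTime": "20230208001523703",
        "fileId": "00000000-9e95-4471-bba0-5604a282aa34-0",
        "deltaFilePaths": [
          ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.1_0-26568-763035",
          ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.1_0-26748-768915",
          ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.2_0-26568-763035",
          ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.3_0-26748-768915",
          ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.4_0-9952-310687",
          ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703.log.5_1-0-1"
        ],
        "metrics": {"TOTAL_LOG_FILES": 6.0}
      }
    ]
  }
}
""")


def delta_files_by_file_id(record: dict) -> dict:
    """Map each fileId in the compaction plan to its delta log files."""
    plan = record.get("hoodieCompactionPlan") or {}
    return {
        op["fileId"]: list(op["deltaFilePaths"])
        for op in plan.get("operations", [])
    }


def plan_contains(record: dict, log_file: str) -> bool:
    """True if any operation in the plan lists the given log file."""
    return any(
        log_file in files
        for files in delta_files_by_file_id(record).values()
    )


def metrics_match(record: dict) -> bool:
    """Check that TOTAL_LOG_FILES equals the number of listed delta files."""
    plan = record.get("hoodieCompactionPlan") or {}
    return all(
        int(op["metrics"]["TOTAL_LOG_FILES"]) == len(op["deltaFilePaths"])
        for op in plan.get("operations", [])
    )


SUSPECT = (
    ".00000000-9e95-4471-bba0-5604a282aa34-0_20230208001523703"
    ".log.6_0-5545-224979"
)
print(plan_contains(ARCHIVED_RECORD, SUSPECT))  # False
print(metrics_match(ARCHIVED_RECORD))           # True
```

The result agrees with the observation above: the plan lists log versions 1 through 5 for this file group, and the `log.6` file is not among them.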




[GitHub] [hudi] wqwl611 commented on issue #7969: [SUPPORT] data loss in new base file.

Posted by "wqwl611 (via GitHub)" <gi...@apache.org>.
wqwl611 commented on issue #7969:
URL: https://github.com/apache/hudi/issues/7969#issuecomment-1432386537

   @yihua Thanks for the reply. The loss is missing records. I found no service failures. I use Hudi 0.11.1.
   I can't read the lost data with a snapshot read, but I can read it with a time travel read at a specified instant time.
   Is there any tool to help inspect the archived file?
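
To pin down which records are affected, the two reads described above can be compared by record key. This is a rough sketch, assuming a PySpark session with the Hudi bundle on the classpath; the function name, `base_path`, and `as_of_instant` are placeholders, and the instant to use is whichever one still returns the missing records.

```python
def missing_record_keys(spark, base_path, as_of_instant):
    """Return keys visible in a time travel read but not in a snapshot read.

    `spark` is expected to be a SparkSession; it is passed in so that this
    sketch does not import pyspark itself.
    """
    key_col = "_hoodie_record_key"  # Hudi's record key meta column

    # Default query type: snapshot read of the latest table state.
    snapshot = spark.read.format("hudi").load(base_path)

    # Time travel read as of the given instant.
    time_travel = (
        spark.read.format("hudi")
        .option("as.of.instant", as_of_instant)
        .load(base_path)
    )

    # Keys present at the earlier instant but absent from the latest snapshot.
    return time_travel.select(key_col).subtract(snapshot.select(key_col))
```

Note that keys removed by legitimate deletes after the chosen instant would also appear in the result, so the output is a starting point for inspection rather than proof of loss.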

