Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/10 03:12:58 UTC

[GitHub] [hudi] yihua opened a new pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

yihua opened a new pull request #4544:
URL: https://github.com/apache/hudi/pull/4544


   ## What is the purpose of the pull request
   
   This PR makes the Kafka Connect Sink for Hudi write empty commits when there are no new messages from the Kafka topic.  This avoids constant rollbacks when the topic has no new messages.  Because the write commit logic, including archival, now always runs whether or not there are new messages, this also resolves the problem of rollbacks not being archived when there are no new messages.
   
   ## Brief change log
   
     - Removes the check on the size of the write status list collected from all participants in `ConnectTransactionCoordinator`.
     - Adds a new test for an empty write status list.
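   
   To illustrate the behavior change described above, here is a minimal, self-contained sketch. It is not the actual `ConnectTransactionCoordinator` code: the class `EmptyCommitSketch`, the `Outcome` enum, and both method names are hypothetical, and write statuses are modeled as plain strings. It assumes only what this description states, namely that the coordinator used to check the size of the collected write status list before committing and no longer does.
   
   ```java
   import java.util.Arrays;
   import java.util.Collections;
   import java.util.List;
   
   // Toy model only; all names here are hypothetical and not part of Hudi.
   public class EmptyCommitSketch {
   
       // Outcome of one commit round in this toy model.
       public enum Outcome { COMMITTED, NOT_COMMITTED }
   
       // Models the old behavior: an empty write status list skips the commit,
       // so the commit logic (including archival) does not run for that round.
       public static Outcome endCommitWithSizeCheck(List<String> writeStatuses) {
           if (writeStatuses.isEmpty()) {
               return Outcome.NOT_COMMITTED;
           }
           return Outcome.COMMITTED;
       }
   
       // Models the new behavior: the commit logic always runs, so a round
       // with no new messages produces an empty commit.
       public static Outcome endCommitWithoutSizeCheck(List<String> writeStatuses) {
           return Outcome.COMMITTED;
       }
   
       public static void main(String[] args) {
           List<String> noStatuses = Collections.emptyList();
           List<String> someStatuses = Arrays.asList("status-1", "status-2");
   
           System.out.println("with check, empty:        " + endCommitWithSizeCheck(noStatuses));
           System.out.println("with check, non-empty:    " + endCommitWithSizeCheck(someStatuses));
           System.out.println("without check, empty:     " + endCommitWithoutSizeCheck(noStatuses));
           System.out.println("without check, non-empty: " + endCommitWithoutSizeCheck(someStatuses));
       }
   }
   ```
   
   In this model, only the empty-list case differs between the two variants, which corresponds to the `0.0 B` rows in the timeline under "Verify this pull request".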
   
   ## Verify this pull request
   
   This change added tests and can be verified as follows:
   
   - Run the Kafka Connect Sink for Hudi using the Quick Start Guide
   - Publish some messages to the Kafka topic: `bash setupKafka.sh -n 100 -b 6`
   - Wait for some time so the Sink ingests all messages and writes empty commits
   - Publish more messages to the topic: `bash setupKafka.sh -n 100 -b 6 -o 600 -t`
   - Verify the table timeline using hudi-cli:
   ```
   hudi:hudi-test-topic->commits show
   ╔═══════════════════╤═════════════════════╤═══════════════════╤═════════════════════╤══════════════════════════╤═══════════════════════╤══════════════════════════════╤══════════════╗
   ║ CommitTime        │ Total Bytes Written │ Total Files Added │ Total Files Updated │ Total Partitions Written │ Total Records Written │ Total Update Records Written │ Total Errors ║
   ╠═══════════════════╪═════════════════════╪═══════════════════╪═════════════════════╪══════════════════════════╪═══════════════════════╪══════════════════════════════╪══════════════╣
   ║ 20220109184255282 │ 76.1 KB             │ 0                 │ 20                  │ 5                        │ 300                   │ 300                          │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109184129070 │ 75.7 KB             │ 0                 │ 20                  │ 5                        │ 300                   │ 300                          │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109183955630 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109183755160 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109183554995 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109183354904 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109183225656 │ 75.7 KB             │ 0                 │ 20                  │ 5                        │ 300                   │ 300                          │ 0            ║
   ╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
   ║ 20220109183055068 │ 71.8 KB             │ 0                 │ 16                  │ 5                        │ 300                   │ 300                          │ 0            ║
   ╚═══════════════════╧═════════════════════╧═══════════════════╧═════════════════════╧══════════════════════════╧═══════════════════════╧══════════════════════════════╧══════════════╝
   ```
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4544:
URL: https://github.com/apache/hudi/pull/4544#issuecomment-1008543620


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5043",
       "triggerID" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ca9f2823977584fb07efc737ccc175a6e33f115 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5043) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan merged pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #4544:
URL: https://github.com/apache/hudi/pull/4544


   





[GitHub] [hudi] hudi-bot removed a comment on pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4544:
URL: https://github.com/apache/hudi/pull/4544#issuecomment-1008510590


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ca9f2823977584fb07efc737ccc175a6e33f115 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4544:
URL: https://github.com/apache/hudi/pull/4544#issuecomment-1008510590


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ca9f2823977584fb07efc737ccc175a6e33f115 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4544:
URL: https://github.com/apache/hudi/pull/4544#issuecomment-1008512050


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5043",
       "triggerID" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ca9f2823977584fb07efc737ccc175a6e33f115 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5043) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4544: [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4544:
URL: https://github.com/apache/hudi/pull/4544#issuecomment-1008512050


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5043",
       "triggerID" : "8ca9f2823977584fb07efc737ccc175a6e33f115",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ca9f2823977584fb07efc737ccc175a6e33f115 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5043) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

