Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/09/14 04:46:48 UTC

[GitHub] [hudi] rmahindra123 opened a new pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

rmahindra123 opened a new pull request #3656:
URL: https://github.com/apache/hudi/pull/3656


   ## Verify this pull request
   
   After stress testing with Kafka (MSK), the Confluent Schema Registry, and scripts that generate the Kafka records, this PR contains the following fixes:
   
   1. When the Hudi writer fails, the participant used to send an empty list; it now throws a retryable exception instead.
   2. In the coordinator, shut down the scheduler to ensure the old coordinator is properly cleaned up during re-assignment.
   3. Connect calls two APIs: put() and preCommit(). We ran the complete state machine for both APIs, which may reset the Kafka offset if it is not in sync with the coordinator. Resetting the offset when preCommit() is called caused issues in the following case: run the Connect sink with data in Kafka and wait for a Hudi commit, then kill the worker and restart it. After a START_COMMIT, start another worker and wait for some time. The tasks in the first worker are then killed and all tasks are assigned to the second worker. The issue was that when preCommit() is called, we reset the offset (by calling context.offset), and that causes the task to crash. The fix is to not process the state machine when preCommit() is called. The Connect platform calls put() even when the Kafka consumer is paused. Hence, we can rely on the put() API alone to execute the state machine, and in preCommit() just return the latest Kafka offsets.
   4. Fix logging (log4j was getting imported twice).
   5. Fix the script that generates the Kafka records, reusing the docker demo JSON payloads.
   6. Fix the README accordingly.
   7. Fix the toString conversion of ControlEvent.
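
   The split described in item 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchSinkTask`, `runStateMachine`, and the "commit at offset + 1" rule are hypothetical stand-ins, not the classes or logic in this PR, and no Kafka Connect types are used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Connect sink task; not the Hudi implementation.
class SketchSinkTask {
  // Latest offsets known to be safe to commit, keyed by partition.
  private final Map<Integer, Long> latestOffsets = new HashMap<>();
  // Counts state machine executions so the behaviour can be observed.
  int stateMachineRuns = 0;

  // put() is called by the platform even while the consumer is paused,
  // so it is the only place where the state machine is advanced.
  void put(int partition, long recordOffset) {
    runStateMachine(partition, recordOffset);
  }

  // preCommit() only reports the latest offsets. It does not run the
  // state machine and never resets the consumer offset.
  Map<Integer, Long> preCommit() {
    return new HashMap<>(latestOffsets);
  }

  private void runStateMachine(int partition, long recordOffset) {
    stateMachineRuns++;
    // Assumption for this sketch: the committable offset is the next
    // offset to read, i.e. the last processed offset plus one.
    latestOffsets.put(partition, recordOffset + 1);
  }
}
```

   Calling `preCommit()` repeatedly leaves `stateMachineRuns` unchanged, which is the property the fix relies on.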
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on a change in pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#discussion_r708249977



##########
File path: hudi-kafka-connect/src/main/java/org/apache/hudi/connect/transaction/ControlEvent.java
##########
@@ -108,7 +109,9 @@ public int getVersion() {
   @Override
   public String toString() {
     return String.format("%s %s %s %s %s %s", version, msgType.name(), commitTime,
-        Arrays.toString(senderPartition), coordinatorInfo.toString(), participantInfo.toString());
+        Arrays.toString(senderPartition),
+        (coordinatorInfo == null) ? "" : coordinatorInfo.toString(),

Review comment:
       If you simply pass `coordinatorInfo` instead of `coordinatorInfo.toString()`, you can avoid the NPE.
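
       A minimal sketch of this suggestion, using a hypothetical `SketchControlEvent` class rather than the real `ControlEvent`: `String.format` converts a `null` argument for `%s` to the text "null" instead of throwing.

```java
// Hypothetical class for illustration; field names mirror the diff only loosely.
class SketchControlEvent {
  private final int version;
  private final Object coordinatorInfo; // may be null

  SketchControlEvent(int version, Object coordinatorInfo) {
    this.version = version;
    this.coordinatorInfo = coordinatorInfo;
  }

  @Override
  public String toString() {
    // Passing the reference itself lets the formatter handle null;
    // calling coordinatorInfo.toString() here would throw an NPE.
    return String.format("%s %s", version, coordinatorInfo);
  }
}
```

       Note the difference in output: the formatter prints "null" for a missing value, whereas the ternary in the diff prints an empty string.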

##########
File path: hudi-kafka-connect/src/main/java/org/apache/hudi/connect/transaction/ControlEvent.java
##########
@@ -163,6 +166,13 @@ public CoordinatorInfo(Map<Integer, Long> globalKafkaCommitOffsets) {
     public Map<Integer, Long> getGlobalKafkaCommitOffsets() {
       return (globalKafkaCommitOffsets == null) ? new HashMap<>() : globalKafkaCommitOffsets;
     }
+
+    @Override
+    public String toString() {
+      return String.format("%s", globalKafkaCommitOffsets.keySet().stream()

Review comment:
       Doesn't this already return a string? Why format it? For `null` handling?
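
       For reference, wrapping an existing string in `String.format("%s", ...)` does not guard against a `null` map, because the dereference happens before the formatter runs. A null-safe variant might look like the sketch below; `SketchCoordinatorInfo` and the `key=value` output format are assumptions for illustration, not the code in this PR.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class for illustration only.
class SketchCoordinatorInfo {
  private final Map<Integer, Long> globalKafkaCommitOffsets; // may be null

  SketchCoordinatorInfo(Map<Integer, Long> globalKafkaCommitOffsets) {
    this.globalKafkaCommitOffsets = globalKafkaCommitOffsets;
  }

  @Override
  public String toString() {
    // Guard the null case explicitly; no String.format wrapper is needed
    // because Collectors.joining already returns a String.
    if (globalKafkaCommitOffsets == null) {
      return "";
    }
    return globalKafkaCommitOffsets.entrySet().stream()
        .map(e -> e.getKey() + "=" + e.getValue())
        .collect(Collectors.joining(","));
  }
}
```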

##########
File path: hudi-kafka-connect/src/main/java/org/apache/hudi/connect/transaction/TransactionParticipant.java
##########
@@ -35,7 +37,7 @@
 
   void buffer(SinkRecord record);
 
-  void processRecords();
+  void processRecords() throws IOException;

Review comment:
       e.g. `HoodieIOException`

##########
File path: hudi-kafka-connect/src/main/java/org/apache/hudi/connect/writers/BufferedConnectWriter.java
##########
@@ -94,7 +94,7 @@ public void writeHudiRecord(HoodieRecord<HoodieAvroPayload> record) {
   }
 
   @Override
-  public List<WriteStatus> flushHudiRecords() {
+  public List<WriteStatus> flushHudiRecords() throws IOException {

Review comment:
       Rename: just `flushRecords`.

##########
File path: hudi-kafka-connect/src/main/java/org/apache/hudi/connect/transaction/TransactionParticipant.java
##########
@@ -35,7 +37,7 @@
 
   void buffer(SinkRecord record);
 
-  void processRecords();
+  void processRecords() throws IOException;

Review comment:
       It's better if all interfaces throw an unchecked Hudi exception.
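
       The pattern being asked for can be sketched as below. To keep the example self-contained it uses the JDK's `java.io.UncheckedIOException` as a stand-in for an unchecked Hudi exception such as `HoodieIOException`; the `SketchParticipant` and `SketchWriterParticipant` names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical interface: no checked exception in the signature.
interface SketchParticipant {
  void processRecords();
}

// Hypothetical implementation that wraps the checked IOException.
class SketchWriterParticipant implements SketchParticipant {
  private final boolean failWrite;

  SketchWriterParticipant(boolean failWrite) {
    this.failWrite = failWrite;
  }

  @Override
  public void processRecords() {
    try {
      flush();
    } catch (IOException e) {
      // Stand-in for an unchecked project exception; the cause is preserved.
      throw new UncheckedIOException("Failed to flush records", e);
    }
  }

  // Simulates a writer that can fail with a checked exception.
  private void flush() throws IOException {
    if (failWrite) {
      throw new IOException("simulated write failure");
    }
  }
}
```

       Callers of the interface then need no `throws` clause, and can still inspect the original `IOException` through `getCause()`.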







[GitHub] [hudi] vinothchandar merged pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
vinothchandar merged pull request #3656:
URL: https://github.com/apache/hudi/pull/3656


   





[GitHub] [hudi] hudi-bot commented on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918799472


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 78819b8c42c702d56e3c9efa07426504f2beef81 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918799472


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200",
       "triggerID" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2207",
       "triggerID" : "67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2207) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918799472


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200",
       "triggerID" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 78819b8c42c702d56e3c9efa07426504f2beef81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200) 
   * 67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918799472


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200",
       "triggerID" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 78819b8c42c702d56e3c9efa07426504f2beef81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918808681


   Seems like the build is failing. 





[GitHub] [hudi] hudi-bot edited a comment on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918799472


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200",
       "triggerID" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 78819b8c42c702d56e3c9efa07426504f2beef81 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3656: [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3656:
URL: https://github.com/apache/hudi/pull/3656#issuecomment-918799472


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200",
       "triggerID" : "78819b8c42c702d56e3c9efa07426504f2beef81",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2207",
       "triggerID" : "67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 78819b8c42c702d56e3c9efa07426504f2beef81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2200) 
   * 67ff0ed3c61a9239f0b648a13f2477c2b33c0fe4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2207) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

