Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/28 10:53:40 UTC

[GitHub] [hudi] YannByron opened a new pull request, #6818: Improve CDC Write

YannByron opened a new pull request, #6818:
URL: https://github.com/apache/hudi/pull/6818

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   **Risk level: none | low | medium | high**
   
   _Choose one. If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1276247999

   ## CI report:
   
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920) 
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1260890678

   ## CI report:
   
   * fa5571c3b239b8edd501999a914b6c91105dc200 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839) 
   * d430448672665ebb89b20687a0ee6d629bd0483f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841) 
   


[GitHub] [hudi] danny0405 commented on a diff in pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6818:
URL: https://github.com/apache/hudi/pull/6818#discussion_r990585049


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieCDCLogRecordIterator.java:
##########
@@ -27,50 +27,94 @@
 import org.apache.avro.generic.IndexedRecord;
 
 import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
 
 import java.io.IOException;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.concurrent.atomic.AtomicInteger;
 
 public class HoodieCDCLogRecordIterator implements ClosableIterator<IndexedRecord> {
 
-  private final HoodieLogFile cdcLogFile;
+  private final FileSystem fs;
 
-  private final HoodieLogFormat.Reader reader;
+  private final Schema cdcSchema;
+
+  private final Iterator<HoodieLogFile> cdcLogFileIter;
+
+  private HoodieLogFormat.Reader reader;
+
+  /**
+   * Due to the hasNext of {@link HoodieLogFormat.Reader} is not idempotent,
+   * Here guarantee idempotent by `hasNextCall` and `nextCall`.
+   */
+  private final AtomicInteger hasNextCall = new AtomicInteger(0);
+  private final AtomicInteger nextCall = new AtomicInteger(0);
 
   private ClosableIterator<IndexedRecord> itr;
 
-  public HoodieCDCLogRecordIterator(
-      FileSystem fs,
-      Path cdcLogPath,
-      Schema cdcSchema) throws IOException {
-    this.cdcLogFile = new HoodieLogFile(fs.getFileStatus(cdcLogPath));
-    this.reader = new HoodieLogFileReader(fs, cdcLogFile, cdcSchema,
-        HoodieLogFileReader.DEFAULT_BUFFER_SIZE, false);
+  public HoodieCDCLogRecordIterator(FileSystem fs, HoodieLogFile[] cdcLogFiles, Schema cdcSchema) {
+    this.fs = fs;
+    this.cdcSchema = cdcSchema;
+    this.cdcLogFileIter = Arrays.stream(cdcLogFiles).iterator();
   }

Review Comment:
   Is there a defined sort order for these files?
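
To make the ordering question concrete, here is a minimal sketch of an iterator that reads several files in an explicit, caller-supplied order. It is not the PR's implementation: `ChainedFileIterator`, its `opener` function, and the comparator are hypothetical names, and it swaps the diff's `hasNextCall`/`nextCall` counters for a one-element lookahead buffer as a different way to keep `hasNext()` idempotent. It assumes records are never `null`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/**
 * Hypothetical stand-in for a multi-file record iterator: it sorts the
 * "files" once, opens each one lazily, and yields their records in order.
 */
final class ChainedFileIterator<F, R> implements Iterator<R> {
  private final Iterator<F> files;
  private final Function<F, Iterator<R>> opener;
  private Iterator<R> current; // reader for the file being consumed, or null
  private R lookahead;         // buffered record; null means nothing buffered

  ChainedFileIterator(List<F> files, Comparator<? super F> order,
                      Function<F, Iterator<R>> opener) {
    List<F> sorted = new ArrayList<>(files); // copy, so the caller's list is untouched
    sorted.sort(order);                      // make the read order explicit
    this.files = sorted.iterator();
    this.opener = opener;
  }

  @Override
  public boolean hasNext() {
    if (lookahead != null) {
      return true; // repeated calls do not touch the underlying reader
    }
    while (true) {
      if (current != null) {
        // Each underlying hasNext() is followed by at most one next(),
        // so a non-idempotent reader is never probed twice in a row.
        if (current.hasNext()) {
          lookahead = current.next();
          return true;
        }
        current = null; // exhausted; never probe this reader again
      }
      if (!files.hasNext()) {
        return false;
      }
      current = opener.apply(files.next());
    }
  }

  @Override
  public R next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    R record = lookahead;
    lookahead = null;
    return record;
  }
}
```

With a comparator on, for example, a log file version, records from earlier files are always emitted before records from later ones, regardless of the order in which the array was built.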





[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1276264799

   ## CI report:
   
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920) 
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 3099df08c92beca28b203a5a3c37e044bf2b6da2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160) 
   


[GitHub] [hudi] danny0405 merged pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 merged PR #6818:
URL: https://github.com/apache/hudi/pull/6818




[GitHub] [hudi] danny0405 commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1272218777

   > This PR adds support for flushing CDC data blocks and rolling over CDC log files. These features require changing the CDC write stat, which is the key point to discuss.
   > 
   > There are two possible solutions:
   > 
   > 1. As in this PR: both `cdcPaths` and `cdcWriteBytes` are of the `list` data type.
   > 2. Use a map, like:
   > 
   > ```
   > cdcWriteStats: {
   >   "cdclogfile1": cdclogFile1Size,
   >   "cdclogfile2": cdclogFile2Size
   > }
   > ```
   > 
   > cc @xushiyan @alexeykudinkin @danny0405 WDYT?
   
   What is the file size used for?




[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1261297879

   ## CI report:
   
   * f14363a4be66f8a05ddbbe14600176da151d04ff Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843) 
   


[GitHub] [hudi] danny0405 commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1278370891

   There are test failures; not sure whether they are related to this change:
   
   ![image](https://user-images.githubusercontent.com/7644508/195744371-ba4299e6-2b80-4d07-a6c9-0639adf6fe20.png)
   




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1278563771

   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 390c6c486d979bb7fa703ba9ecd40be01e660d49 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12197) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1261005415

   ## CI report:
   
   * d430448672665ebb89b20687a0ee6d629bd0483f Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841) 
   * f14363a4be66f8a05ddbbe14600176da151d04ff Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843) 
   


[GitHub] [hudi] YannByron commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
YannByron commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1260890524

   This PR adds support for flushing CDC data blocks and rolling over CDC log files. These features require changing the CDC write stat, which is the key point to discuss.
   
   There are two possible solutions:
   1. As in this PR: both `cdcPaths` and `cdcWriteBytes` are of the `list` data type.
   2. Use a map, like:
   ```
   cdcWriteStats: {
     "cdclogfile1": cdclogFile1Size,
     "cdclogfile2": cdclogFile2Size
   }
   ```
   
   cc @xushiyan @alexeykudinkin @danny0405 WDYT?
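
The two options can be compared side by side in a small sketch. The class names `ListCdcWriteStat` and `MapCdcWriteStat` and their methods are hypothetical and are not Hudi's actual write stat classes; only the field names `cdcPaths`, `cdcWriteBytes`, and `cdcWriteStats` come from the comment above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Option 1 (hypothetical shape): two parallel lists that must stay index-aligned. */
final class ListCdcWriteStat {
  final List<String> cdcPaths = new ArrayList<>();
  final List<Long> cdcWriteBytes = new ArrayList<>();

  void add(String path, long bytes) {
    // Both lists are appended together so index i describes the same file.
    cdcPaths.add(path);
    cdcWriteBytes.add(bytes);
  }

  long totalBytes() {
    long total = 0;
    for (long bytes : cdcWriteBytes) {
      total += bytes;
    }
    return total;
  }
}

/** Option 2 (hypothetical shape): one map from CDC log file path to bytes written. */
final class MapCdcWriteStat {
  // LinkedHashMap keeps insertion order, so rollover order is still visible.
  final Map<String, Long> cdcWriteStats = new LinkedHashMap<>();

  void add(String path, long bytes) {
    // Writing to the same path again accumulates instead of duplicating the key.
    cdcWriteStats.merge(path, bytes, Long::sum);
  }

  long totalBytes() {
    long total = 0;
    for (long bytes : cdcWriteStats.values()) {
      total += bytes;
    }
    return total;
  }
}
```

The map keeps each path and its size in a single entry, so the two cannot drift out of alignment; the list form allows the same path to appear more than once and relies on the writer to keep both lists in step.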




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1276540740

   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 98b0bd9e0f36e58f70c6a901e5a5907987f9656d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1260997200

   ## CI report:
   
   * fa5571c3b239b8edd501999a914b6c91105dc200 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839) 
   * d430448672665ebb89b20687a0ee6d629bd0483f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841) 
   * f14363a4be66f8a05ddbbe14600176da151d04ff UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1277109143

   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 98b0bd9e0f36e58f70c6a901e5a5907987f9656d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161) 
   * 390c6c486d979bb7fa703ba9ecd40be01e660d49 UNKNOWN
   


[GitHub] [hudi] danny0405 commented on a diff in pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6818:
URL: https://github.com/apache/hudi/pull/6818#discussion_r990583580


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCDCLogger.java:
##########
@@ -73,35 +80,56 @@ public class HoodieCDCLogger implements Closeable {
 
   private final Schema cdcSchema;
 
-  private final String cdcSchemaString;
-
   // the cdc data
   private final Map<String, HoodieAvroPayload> cdcData;
 
+  private final Map<HoodieLogBlock.HeaderMetadataType, String> cdcDataBlockHeader;
+
   // the cdc record transformer
   private final CDCTransformer transformer;
 
+  // Max block size to limit to for a log block
+  private final int maxBlockSize;
+
+  // Average cdc record size. This size is updated at the end of every log block flushed to disk
+  private long averageCDCRecordSize = 0;
+
+  // Number of records that must be written to meet the max block size for a log block
+  private AtomicInteger numOfCDCRecordInMemory = new AtomicInteger();
+

Review Comment:
   `numOfCDCRecordInMemory` -> `numOfCDCRecordsInMemory`
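
For context, the fields added in the hunk above (a max block size, a running average record size, and an in-memory record count) suggest size-based flushing. Below is a minimal sketch of that idea, assuming a flush is triggered once the estimated buffered size reaches the limit. The class `SizeBoundedBuffer` and its methods are hypothetical and are not the PR's implementation; only the field names echo the diff.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffer that flushes once the estimated buffered size reaches a limit. */
final class SizeBoundedBuffer {
  // Upper bound (in bytes) for one block; assumed to be positive.
  private final long maxBlockSize;
  // Running estimate of the record size; refreshed on every flush.
  private long averageCDCRecordSize = 0;
  // Records buffered since the last flush.
  private final List<byte[]> buffer = new ArrayList<>();
  private int numOfCDCRecordsInMemory = 0;
  private int flushedBlocks = 0;

  SizeBoundedBuffer(long maxBlockSize) {
    this.maxBlockSize = maxBlockSize;
  }

  void add(byte[] record) {
    buffer.add(record);
    numOfCDCRecordsInMemory++;
    if (averageCDCRecordSize == 0) {
      // No estimate yet: seed it from the first record seen.
      averageCDCRecordSize = Math.max(1, record.length);
    }
    // Flush when the estimated size of the buffered records reaches the limit.
    if ((long) numOfCDCRecordsInMemory * averageCDCRecordSize >= maxBlockSize) {
      flush();
    }
  }

  void flush() {
    if (buffer.isEmpty()) {
      return;
    }
    long totalBytes = 0;
    for (byte[] r : buffer) {
      totalBytes += r.length;
    }
    // Update the estimate at the end of every flushed block.
    averageCDCRecordSize = Math.max(1, totalBytes / buffer.size());
    buffer.clear();
    numOfCDCRecordsInMemory = 0;
    flushedBlocks++; // a real writer would append a log block here
  }

  int flushedBlocks() {
    return flushedBlocks;
  }

  int buffered() {
    return numOfCDCRecordsInMemory;
  }
}
```

Estimating from an average avoids serializing every record just to measure it, at the cost of blocks that only approximately respect the limit.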





[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1263105772

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f14363a4be66f8a05ddbbe14600176da151d04ff Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843) 
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1263103404

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f14363a4be66f8a05ddbbe14600176da151d04ff Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843) 
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6818:
URL: https://github.com/apache/hudi/pull/6818#discussion_r990584901


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieCDCLogRecordIterator.java:
##########
@@ -27,50 +27,94 @@
 import org.apache.avro.generic.IndexedRecord;
 
 import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
 
 import java.io.IOException;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.concurrent.atomic.AtomicInteger;
 
 public class HoodieCDCLogRecordIterator implements ClosableIterator<IndexedRecord> {
 
-  private final HoodieLogFile cdcLogFile;
+  private final FileSystem fs;
 
-  private final HoodieLogFormat.Reader reader;
+  private final Schema cdcSchema;
+
+  private final Iterator<HoodieLogFile> cdcLogFileIter;
+
+  private HoodieLogFormat.Reader reader;
+
+  /**
+   * Due to the hasNext of {@link HoodieLogFormat.Reader} is not idempotent,
+   * Here guarantee idempotent by `hasNextCall` and `nextCall`.
+   */
+  private final AtomicInteger hasNextCall = new AtomicInteger(0);
+  private final AtomicInteger nextCall = new AtomicInteger(0);

Review Comment:
   We can avoid these two variables by keeping a `currentRecord` reference to the record fetched from the current iterator.
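
A minimal sketch of the `currentRecord` idea, assuming the wrapped source yields non-null records: the wrapper fetches one record ahead and holds it, so repeated `hasNext()` calls never touch the source again. The class name `LookaheadIterator` is hypothetical and this is not the code from the PR.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical wrapper that makes hasNext() idempotent via a buffered record. */
final class LookaheadIterator<T> implements Iterator<T> {
  // Source whose hasNext() may not be safe to call repeatedly.
  private final Iterator<T> source;
  // Record fetched ahead of time; null means nothing is buffered.
  // Assumption: the source never yields null records.
  private T currentRecord;

  LookaheadIterator(Iterator<T> source) {
    this.source = source;
  }

  @Override
  public boolean hasNext() {
    if (currentRecord != null) {
      return true; // already buffered, do not touch the source again
    }
    if (source.hasNext()) {
      currentRecord = source.next();
      return true;
    }
    return false;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    T record = currentRecord;
    currentRecord = null; // hand the record out exactly once
    return record;
  }
}
```

With this shape no call counters are needed: the buffered reference alone records whether a lookahead has already happened.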





[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1276376067

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160",
       "triggerID" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161",
       "triggerID" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 3099df08c92beca28b203a5a3c37e044bf2b6da2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160) 
   * 98b0bd9e0f36e58f70c6a901e5a5907987f9656d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1277017004

   [4948.patch.zip](https://github.com/apache/hudi/files/9770814/4948.patch.zip)
   Thanks for the contribution. I have reviewed it and attached a patch here.




[GitHub] [hudi] YannByron commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
YannByron commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1278386797

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1278409832

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160",
       "triggerID" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161",
       "triggerID" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179",
       "triggerID" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12197",
       "triggerID" : "1278386797",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 390c6c486d979bb7fa703ba9ecd40be01e660d49 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12197) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1277291768

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160",
       "triggerID" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161",
       "triggerID" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179",
       "triggerID" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 390c6c486d979bb7fa703ba9ecd40be01e660d49 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1277113506

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160",
       "triggerID" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161",
       "triggerID" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179",
       "triggerID" : "390c6c486d979bb7fa703ba9ecd40be01e660d49",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 98b0bd9e0f36e58f70c6a901e5a5907987f9656d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12161) 
   * 390c6c486d979bb7fa703ba9ecd40be01e660d49 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12179) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1276368111

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160",
       "triggerID" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "98b0bd9e0f36e58f70c6a901e5a5907987f9656d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920) 
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 3099df08c92beca28b203a5a3c37e044bf2b6da2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12160) 
   * 98b0bd9e0f36e58f70c6a901e5a5907987f9656d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1276256246

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d4330e9960569a8aee6e5094244746c3f68883c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3099df08c92beca28b203a5a3c37e044bf2b6da2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920) 
   * 2d4330e9960569a8aee6e5094244746c3f68883c UNKNOWN
   * 3099df08c92beca28b203a5a3c37e044bf2b6da2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] YannByron commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
YannByron commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1278386590

   > There are test failures; not sure whether they are related to this change:
   > 
   > ![image](https://user-images.githubusercontent.com/7644508/195744371-ba4299e6-2b80-4d07-a6c9-0639adf6fe20.png)
   
   Let's restart CI to check this.




[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1263280762

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11841",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11843",
       "triggerID" : "f14363a4be66f8a05ddbbe14600176da151d04ff",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920",
       "triggerID" : "e0ccacd8d030984ed30f19b17b0dafb02d8685ee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e0ccacd8d030984ed30f19b17b0dafb02d8685ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11920) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1260796554

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa5571c3b239b8edd501999a914b6c91105dc200 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839) 
   




[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1260791120

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa5571c3b239b8edd501999a914b6c91105dc200 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #6818: Improve CDC Write

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6818:
URL: https://github.com/apache/hudi/pull/6818#issuecomment-1260882092

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839",
       "triggerID" : "fa5571c3b239b8edd501999a914b6c91105dc200",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d430448672665ebb89b20687a0ee6d629bd0483f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa5571c3b239b8edd501999a914b6c91105dc200 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11839) 
   * d430448672665ebb89b20687a0ee6d629bd0483f UNKNOWN
   




[GitHub] [hudi] danny0405 commented on a diff in pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6818:
URL: https://github.com/apache/hudi/pull/6818#discussion_r990584548


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandleWithChangeLog.java:
##########
@@ -89,9 +94,19 @@ protected void writeInsertRecord(HoodieRecord<T> hoodieRecord, Option<IndexedRec
   public List<WriteStatus> close() {
     List<WriteStatus> writeStatuses = super.close();
     // if there are cdc data written, set the CDC-related information.
-    Option<AppendResult> cdcResult =
-        HoodieCDCLogger.writeCDCDataIfNeeded(cdcLogger, recordsWritten, insertRecordsWritten);
-    HoodieCDCLogger.setCDCStatIfNeeded(writeStatuses.get(0).getStat(), cdcResult, partitionPath, fs);
+
+    if (cdcLogger == null || recordsWritten == 0L || (recordsWritten == insertRecordsWritten)) {
+      // the following cases where we do not need to write out the cdc file:

Review Comment:
   This `if` condition is not suitable for Flink; we may need some changes for the Flink CDC handles.
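
For reference, a minimal standalone sketch of the guard in this hunk. Only the three conditions come from the diff; the class name `CdcSkipSketch`, the method `shouldSkipCdcWrite`, and the `hasCdcLogger` flag (standing in for `cdcLogger != null`) are hypothetical and are not Hudi APIs. The comment does not say why the guard is unsuitable for Flink, so the sketch makes no attempt to model that.

```java
// Hypothetical illustration only; not part of Hudi.
final class CdcSkipSketch {

  // Mirrors the condition in the diff:
  //   cdcLogger == null || recordsWritten == 0L || recordsWritten == insertRecordsWritten
  static boolean shouldSkipCdcWrite(boolean hasCdcLogger, long recordsWritten, long insertRecordsWritten) {
    return !hasCdcLogger || recordsWritten == 0L || recordsWritten == insertRecordsWritten;
  }

  public static void main(String[] args) {
    System.out.println(shouldSkipCdcWrite(false, 10L, 3L)); // true: no CDC logger
    System.out.println(shouldSkipCdcWrite(true, 0L, 0L));   // true: nothing was written
    System.out.println(shouldSkipCdcWrite(true, 10L, 10L)); // true: both counters are equal
    System.out.println(shouldSkipCdcWrite(true, 10L, 3L));  // false: the CDC file would be written
  }
}
```

Going by the variable names, the third case reads as "every record written was an insert"; that reading is an assumption, not something stated in the thread.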





[GitHub] [hudi] danny0405 commented on a diff in pull request #6818: [HUDI-4948] Improve CDC Write

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6818:
URL: https://github.com/apache/hudi/pull/6818#discussion_r990584508


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieWriteStat.java:
##########
@@ -254,12 +256,12 @@ public String getPath() {
   }
 
   @Nullable
-  public String getCdcPath() {
-    return cdcPath;
+  public List<String> getCdcPaths() {
+    return cdcPaths;
   }
 
-  public void setCdcPath(String cdcPath) {
-    this.cdcPath = cdcPath;
+  public void setCdcPath(List<String> cdcPaths) {
+    this.cdcPaths = cdcPaths;

Review Comment:
   setCdcPath -> setCdcPaths
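
A minimal sketch of what the suggested rename could look like, so the setter matches the plural getter and the `List<String>` type. `WriteStatSketch` is a hypothetical stand-in for the CDC-path part of `HoodieWriteStat`, and the file names in `main` are placeholders, not real Hudi paths.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; the real class is HoodieWriteStat.
final class WriteStatSketch {

  // The getter is annotated @Nullable in the diff, so this field may be null.
  private List<String> cdcPaths;

  List<String> getCdcPaths() {
    return cdcPaths;
  }

  // Named setCdcPaths (not setCdcPath) to match getCdcPaths.
  void setCdcPaths(List<String> cdcPaths) {
    this.cdcPaths = cdcPaths;
  }

  public static void main(String[] args) {
    WriteStatSketch stat = new WriteStatSketch();
    // Placeholder file names, for illustration only.
    stat.setCdcPaths(Arrays.asList("placeholder-cdc-1", "placeholder-cdc-2"));
    System.out.println(stat.getCdcPaths());
  }
}
```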


