You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by "scxwhite (via GitHub)" <gi...@apache.org> on 2023/02/03 09:52:50 UTC

[GitHub] [hudi] scxwhite opened a new pull request, #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

scxwhite opened a new pull request, #7842:
URL: https://github.com/apache/hudi/pull/7842

   ### Change Logs
   
   In the merge phase, when we call the combineAndGetUpdateValue method, the sorting field used is: house.payload.ordering.field instead of house.datasource.write.recombine.field.
   
   
   ### Impact
   
   Fix the wrong use of order field in PartialUpdateAvroPayload class.
   
   ### Risk level (write none, low medium or high below)
   
   high
   
   ### Documentation Update
   
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1419605389

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963",
       "triggerID" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f400a032c996f52b4153ef8cf5ac89bb6677f100 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1418986543

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963",
       "triggerID" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7f29e19230ac8dd1ffff0fac6ceefdbcb862851 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898) 
   * f400a032c996f52b4153ef8cf5ac89bb6677f100 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1420055428

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963",
       "triggerID" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14974",
       "triggerID" : "1420044291",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * f400a032c996f52b4153ef8cf5ac89bb6677f100 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14974) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] scxwhite commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "scxwhite (via GitHub)" <gi...@apache.org>.
scxwhite commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1420044291

   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1418976834

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7f29e19230ac8dd1ffff0fac6ceefdbcb862851 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898) 
   * f400a032c996f52b4153ef8cf5ac89bb6677f100 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] scxwhite commented on a diff in pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "scxwhite (via GitHub)" <gi...@apache.org>.
scxwhite commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1101207004


##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord> mergeDisorderRecordsWithMetadata(
   /**
    * Returns whether the given record is newer than the record of this payload.
    *
-   * @param orderingVal
-   * @param record The record
-   * @param prop   The payload properties
-   *
+   * @param schema      The schema
+   * @param record      The record
+   * @param prop        The payload properties
    * @return true if the given record is newer
    */
-  private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord record, Properties prop) {
+  private boolean isRecordNewer(Schema schema, IndexedRecord record, Properties prop) throws IOException {
     String orderingField = prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+    String preCombineField = prop.getProperty("hoodie.datasource.write.precombine.field");
+
     if (!StringUtils.isNullOrEmpty(orderingField)) {
       boolean consistentLogicalTimestampEnabled = Boolean.parseBoolean(prop.getProperty(
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
 
+      Comparable incomingOrderingVal;
+      if (orderingField.equals(preCombineField)) {
+        incomingOrderingVal = orderingVal;

Review Comment:
   I agree with you. I will modify these codes again.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] scxwhite commented on a diff in pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "scxwhite (via GitHub)" <gi...@apache.org>.
scxwhite commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1098391650


##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord> mergeDisorderRecordsWithMetadata(
   /**
    * Returns whether the given record is newer than the record of this payload.
    *
-   * @param orderingVal
-   * @param record The record
-   * @param prop   The payload properties
-   *
+   * @param schema      The schema
+   * @param record      The record
+   * @param prop        The payload properties
    * @return true if the given record is newer
    */
-  private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord record, Properties prop) {
+  private boolean isRecordNewer(Schema schema, IndexedRecord record, Properties prop) throws IOException {
     String orderingField = prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+    String preCombineField = prop.getProperty("hoodie.datasource.write.precombine.field");
+
     if (!StringUtils.isNullOrEmpty(orderingField)) {
       boolean consistentLogicalTimestampEnabled = Boolean.parseBoolean(prop.getProperty(
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
 
+      Comparable incomingOrderingVal;
+      if (orderingField.equals(preCombineField)) {
+        incomingOrderingVal = orderingVal;

Review Comment:
   @danny0405  Thank you for your review.
   If I understand correctly, the value of orderingVal variable defined in BaseAvroPayload is precombine configuration. You can refer to this part of code.
   https://github.com/apache/hudi/blob/6436ef3ee2cb887cf25643c5c99f1fe34a64ec68/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala#L1095
   
   In the process of using Spark, I found that precombine is configured by hoodie.datasource.write.precombine.field configuration item, and orderfield is configured by hoodie.payload.ordering.field configuration item.There is no restriction that precombine and orderfield must be consistent.
   
   
   https://github.com/apache/hudi/blob/6436ef3ee2cb887cf25643c5c99f1fe34a64ec68/hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java#L126
   
   And in the DefaultHoodieRecordPayload class, we can also find that the orderingVal used is the value of the configuration item hoodie.payload.ordering.field.The orderingVal variable in the BaseAvroPayload class is not used.
   
   As for the reason why I compare the fields of the two to see if they are consistent, as you just mentioned, it is to avoid the time of data serialization.
   
   I look forward to your answer. Thank you again.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1415814306

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7f29e19230ac8dd1ffff0fac6ceefdbcb862851 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1098201307


##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord> mergeDisorderRecordsWithMetadata(
   /**
    * Returns whether the given record is newer than the record of this payload.
    *
-   * @param orderingVal
-   * @param record The record
-   * @param prop   The payload properties
-   *
+   * @param schema      The schema
+   * @param record      The record
+   * @param prop        The payload properties
    * @return true if the given record is newer
    */
-  private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord record, Properties prop) {
+  private boolean isRecordNewer(Schema schema, IndexedRecord record, Properties prop) throws IOException {
     String orderingField = prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+    String preCombineField = prop.getProperty("hoodie.datasource.write.precombine.field");
+
     if (!StringUtils.isNullOrEmpty(orderingField)) {
       boolean consistentLogicalTimestampEnabled = Boolean.parseBoolean(prop.getProperty(
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
 
+      Comparable incomingOrderingVal;
+      if (orderingField.equals(preCombineField)) {
+        incomingOrderingVal = orderingVal;

Review Comment:
   Not sure what are we fixing here, do you mean the record payload ordering field and preCombine field are actually the same field? In Flink, we force the ordering field as the preCombine field, see `SteamerUtil.getPayloadConfig` for details.
   
   And in general, the ordering field is passed as a param for the payload constructor, so how and when in-consistency happens?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] scxwhite commented on a diff in pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "scxwhite (via GitHub)" <gi...@apache.org>.
scxwhite commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1098391650


##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord> mergeDisorderRecordsWithMetadata(
   /**
    * Returns whether the given record is newer than the record of this payload.
    *
-   * @param orderingVal
-   * @param record The record
-   * @param prop   The payload properties
-   *
+   * @param schema      The schema
+   * @param record      The record
+   * @param prop        The payload properties
    * @return true if the given record is newer
    */
-  private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord record, Properties prop) {
+  private boolean isRecordNewer(Schema schema, IndexedRecord record, Properties prop) throws IOException {
     String orderingField = prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+    String preCombineField = prop.getProperty("hoodie.datasource.write.precombine.field");
+
     if (!StringUtils.isNullOrEmpty(orderingField)) {
       boolean consistentLogicalTimestampEnabled = Boolean.parseBoolean(prop.getProperty(
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
 
+      Comparable incomingOrderingVal;
+      if (orderingField.equals(preCombineField)) {
+        incomingOrderingVal = orderingVal;

Review Comment:
   @danny0405  Thank you for your review.
   If I understand correctly, the value of orderingVal variable defined in BaseAvroPayload is precombine configuration. You can refer to this part of code.
   https://github.com/apache/hudi/blob/6436ef3ee2cb887cf25643c5c99f1fe34a64ec68/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala#L1095
   
   In the process of using Spark, I found that precombine is configured by hoodie.datasource.write.precombine.field configuration item, and orderfield is configured by hoodie.datasource.write.precombine.field configuration item.There is no restriction that precombine and orderfield must be consistent.
   
   
   https://github.com/apache/hudi/blob/6436ef3ee2cb887cf25643c5c99f1fe34a64ec68/hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java#L126
   
   And in the DefaultHoodieRecordPayload class, we can also find that the orderingVal used is the value of the configuration item hoodie.payload.ordering.field.The orderingVal variable in the BaseAvroPayload class is not used.
   
   As for the reason why I compare the fields of the two to see if they are consistent, as you just mentioned, it is to avoid the time of data serialization.
   
   I look forward to your answer. Thank you again.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1415552527

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7f29e19230ac8dd1ffff0fac6ceefdbcb862851 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1415721459

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7f29e19230ac8dd1ffff0fac6ceefdbcb862851 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1099641567


##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord> mergeDisorderRecordsWithMetadata(
   /**
    * Returns whether the given record is newer than the record of this payload.
    *
-   * @param orderingVal
-   * @param record The record
-   * @param prop   The payload properties
-   *
+   * @param schema      The schema
+   * @param record      The record
+   * @param prop        The payload properties
    * @return true if the given record is newer
    */
-  private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord record, Properties prop) {
+  private boolean isRecordNewer(Schema schema, IndexedRecord record, Properties prop) throws IOException {
     String orderingField = prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+    String preCombineField = prop.getProperty("hoodie.datasource.write.precombine.field");
+
     if (!StringUtils.isNullOrEmpty(orderingField)) {
       boolean consistentLogicalTimestampEnabled = Boolean.parseBoolean(prop.getProperty(
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
 
+      Comparable incomingOrderingVal;
+      if (orderingField.equals(preCombineField)) {
+        incomingOrderingVal = orderingVal;

Review Comment:
   We should fix the orderingVal to be always as `HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY`, and of course the field can fallback to preCombine field if it is not configured explicitly by user.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7842:
URL: https://github.com/apache/hudi/pull/7842#issuecomment-1420203832

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14898",
       "triggerID" : "f7f29e19230ac8dd1ffff0fac6ceefdbcb862851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963",
       "triggerID" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f400a032c996f52b4153ef8cf5ac89bb6677f100",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14974",
       "triggerID" : "1420044291",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * f400a032c996f52b4153ef8cf5ac89bb6677f100 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14963) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14974) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #7842: [HUDI-5695]Fix the wrong use of order field in PartialUpdateAvroPayload class.

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1098201307


##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord> mergeDisorderRecordsWithMetadata(
   /**
    * Returns whether the given record is newer than the record of this payload.
    *
-   * @param orderingVal
-   * @param record The record
-   * @param prop   The payload properties
-   *
+   * @param schema      The schema
+   * @param record      The record
+   * @param prop        The payload properties
    * @return true if the given record is newer
    */
-  private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord record, Properties prop) {
+  private boolean isRecordNewer(Schema schema, IndexedRecord record, Properties prop) throws IOException {
     String orderingField = prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+    String preCombineField = prop.getProperty("hoodie.datasource.write.precombine.field");
+
     if (!StringUtils.isNullOrEmpty(orderingField)) {
       boolean consistentLogicalTimestampEnabled = Boolean.parseBoolean(prop.getProperty(
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
           KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
 
+      Comparable incomingOrderingVal;
+      if (orderingField.equals(preCombineField)) {
+        incomingOrderingVal = orderingVal;

Review Comment:
   Not sure what are we fixing here, do you mean the record payload ordering field and preCombine field are actually not the same field? In Flink, we force the ordering field as the preCombine field, see `SteamerUtil.getPayloadConfig` for details.
   
   And in general, the ordering field is passed as a param for the payload constructor, so how and when in-consistency happens?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org