Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/03/20 16:00:28 UTC

[GitHub] [hudi] codope opened a new pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

codope opened a new pull request #5077:
URL: https://github.com/apache/hudi/pull/5077


   ## What is the purpose of the pull request
   
   Fixes an issue with the comparison of column range metadata. Instead of comparing values as strings, we get the field's type from the Avro schema and convert the value to the corresponding native Java type before comparing.
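
To illustrate why string comparators give wrong column ranges for numeric fields, here is a minimal JDK-only sketch. The class and method names (`RangeComparisonDemo`, `minAsString`, `minAsInt`) are hypothetical and are not part of Hudi; the sketch only contrasts lexicographic ordering with ordering on the native type.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not Hudi APIs.
final class RangeComparisonDemo {

  // Minimum computed by comparing the string form of each value (lexicographic order).
  static String minAsString(List<Integer> values) {
    String min = null;
    for (Integer v : values) {
      String s = String.valueOf(v);
      if (min == null || s.compareTo(min) < 0) {
        min = s;
      }
    }
    return min;
  }

  // Minimum computed on the native type (numeric order).
  static Integer minAsInt(List<Integer> values) {
    Integer min = null;
    for (Integer v : values) {
      if (min == null || v.compareTo(min) < 0) {
        min = v;
      }
    }
    return min;
  }

  public static void main(String[] args) {
    List<Integer> values = Arrays.asList(9, 10, 100);
    System.out.println(minAsString(values)); // prints "10": '1' sorts before '9'
    System.out.println(minAsInt(values));    // prints "9"
  }
}
```

With string comparison the recorded minimum for the values 9, 10 and 100 is "10", so a range check against such metadata could skip a file that actually contains 9.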
   
   ## Brief change log
   
   *(for example:)*
     - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1074155056


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080414160


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1075399688


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073501604


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1074158121


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073297587


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073757791


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073282373


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080799126


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     }, {
       "hash" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   * 91230b04ffc5c566806ddc7efe0e2f6024fb5948 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080408730


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1074229102


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r831528135



##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {

Review comment:
       Do we really need this one if we already have a method that obtains it from the schema? We could use that one and just pass the record's schema, couldn't we?
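
The idea of resolving a dotted field name against the schema alone, without touching record values, can be sketched as follows. This is a stand-in model and not the Avro API: a "schema" here is assumed to be a map from field name to either a primitive type name (a `String`) or a nested schema (another `Map`), and `NestedSchemaLookupDemo`, `resolve` and `sampleSchema` are hypothetical names. The point is only that the lookup descends one level per path part.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in model, not the Avro API: field name -> type name (String) or nested schema (Map).
final class NestedSchemaLookupDemo {

  // Resolves a dotted path such as "a.b.c" by descending one schema level per part.
  @SuppressWarnings("unchecked")
  static Object resolve(Map<String, Object> schema, String fieldName) {
    String[] parts = fieldName.split("\\.");
    Map<String, Object> current = schema;
    for (int i = 0; i < parts.length; i++) {
      Object child = current.get(parts[i]);
      if (child == null) {
        throw new IllegalArgumentException("Not a valid field name: " + fieldName);
      }
      if (i == parts.length - 1) {
        return child; // schema of the leaf field
      }
      if (!(child instanceof Map)) {
        throw new IllegalArgumentException("Cannot descend into non-record part: " + parts[i]);
      }
      current = (Map<String, Object>) child; // descend into the nested schema
    }
    throw new IllegalArgumentException("Not a valid field name: " + fieldName);
  }

  // Small sample schema: { id: string, a: { b: { c: int } } }
  static Map<String, Object> sampleSchema() {
    Map<String, Object> b = new LinkedHashMap<>();
    b.put("c", "int");
    Map<String, Object> a = new LinkedHashMap<>();
    a.put("b", b);
    Map<String, Object> root = new LinkedHashMap<>();
    root.put("id", "string");
    root.put("a", a);
    return root;
  }

  public static void main(String[] args) {
    System.out.println(resolve(sampleSchema(), "a.b.c")); // prints "int"
  }
}
```

Because only the schema is consulted, a single helper of this shape would serve callers that have a record (by passing the record's schema) as well as callers that only have the write schema.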

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1012,11 +1015,12 @@ public static void accumulateColumnRanges(Schema.Field field, String filePath,
                                             Map<String, HoodieColumnRangeMetadata<Comparable>> columnRangeMap,
                                             Map<String, Map<String, Object>> columnToStats) {
     Map<String, Object> columnStats = columnToStats.get(field.name());
+    field.schema().getType();

Review comment:
       This statement has no effect and can be deleted.

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveUnion(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveUnion(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;
+    }
+    if (schema.getLogicalType() == LogicalTypes.date()) {
+      return java.sql.Date.valueOf((val.toString()));
+    }
+    switch (schema.getType()) {
+      case RECORD:

Review comment:
       There's no sensible way we can compare records -- let's set min/max for composite columns to null

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Schema fieldSchema = getNestedFieldSchemaFromRecord((GenericRecord) record, field.name());
       // update stats
-      final int fieldSize = fieldVal == null ? 0 : fieldVal.length();
+      final int fieldSize = fieldVal == null ? 0 : StringUtils.objToString(fieldVal).length();

Review comment:
       For example, Parquet does provide such metrics, but they are also much more sensible at the Parquet level: the encoding is fixed (Thrift), and it also directly translates into size on disk. In our case I don't think there's a sensible high-level use case for such a metric

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveUnion(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveUnion(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;
+    }
+    if (schema.getLogicalType() == LogicalTypes.date()) {
+      return java.sql.Date.valueOf((val.toString()));
+    }
+    switch (schema.getType()) {
+      case RECORD:

Review comment:
       Actually, a better approach would be to align with whatever Parquet is doing for such columns

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);

Review comment:
       Also, since we rely on Parquet for collecting stats for base files, we need to make sure that this ordering is _compatible_ with it.
   
   As such, we can't use the comparator for Avro records and assume it is identical to Parquet's (Thrift), as it very likely is not: since the binary representations of these encodings differ, we can't really compare composite records like that (arrays and maps are all treated as binary in Parquet; see the ref at the bottom).
   
   As such, my proposal is to limit min/max stats to primitive types only.
   
   Ref: Take a look at the [PrimitiveComparator](https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveComparator.java) implementation
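
The proposal above (collect min/max only for primitive types) could be sketched roughly as follows. This is only an illustration, not the PR's code: `FieldType` is a local stand-in that mirrors the names of Avro's `Schema.Type`, and both the class name and the exact set of types treated as "primitive" are assumptions.

```java
import java.util.EnumSet;
import java.util.Set;

final class PrimitiveStatsSketch {
  // Local stand-in mirroring the names of Avro's Schema.Type; NOT the real Avro enum.
  enum FieldType {
    RECORD, ENUM, ARRAY, MAP, UNION, FIXED, STRING, BYTES, INT, LONG, FLOAT, DOUBLE, BOOLEAN, NULL
  }

  // Assumption: only these types get a min/max range; everything else is skipped.
  private static final Set<FieldType> RANGE_TYPES = EnumSet.of(
      FieldType.STRING, FieldType.INT, FieldType.LONG,
      FieldType.FLOAT, FieldType.DOUBLE, FieldType.BOOLEAN);

  // True when a min/max range would be collected for a column of this type.
  static boolean supportsRange(FieldType type) {
    return RANGE_TYPES.contains(type);
  }

  public static void main(String[] args) {
    for (FieldType type : FieldType.values()) {
      System.out.println(type + " -> " + (supportsRange(type) ? "collect min/max" : "min/max = null"));
    }
  }
}
```

Whether `BYTES` or `FIXED` belong in the set depends on how closely the ordering has to match Parquet's binary comparators, which this sketch does not attempt to model.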

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Schema fieldSchema = getNestedFieldSchemaFromRecord((GenericRecord) record, field.name());
       // update stats
-      final int fieldSize = fieldVal == null ? 0 : fieldVal.length();
+      final int fieldSize = fieldVal == null ? 0 : StringUtils.objToString(fieldVal).length();

Review comment:
       This is an incorrect value-size calculation: string size != binary size
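
To make the point concrete, here is a small standalone sketch (not Hudi code; the class and method names are made up for illustration) showing that the length of a value's string form and the number of bytes it occupies can differ in both directions.

```java
import java.nio.charset.StandardCharsets;

final class ValueSizeSketch {
  // Number of UTF-16 code units, which is what String.length() reports.
  static int charLength(String s) {
    return s.length();
  }

  // Number of bytes the same text occupies once encoded as UTF-8.
  static int utf8Size(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    String ascii = "hudi";
    String accented = "h\u00e9llo"; // the accented 'e' needs two bytes in UTF-8
    long number = 1234567890123L;   // 13 digits as text, 8 bytes as a fixed-width long

    System.out.println(charLength(ascii) + " chars, " + utf8Size(ascii) + " bytes");
    System.out.println(charLength(accented) + " chars, " + utf8Size(accented) + " bytes");
    System.out.println(String.valueOf(number).length() + " chars as text, "
        + Long.BYTES + " bytes as a fixed-width long");
  }
}
```

The fixed-width figure is only one possible binary representation; a variable-length encoding would give yet another size, which is why the metric is hard to interpret without pinning down the encoding.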

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveUnion(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveUnion(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {

Review comment:
       I don't think we need this method. Instead we should make sure at the place where we collect min/max stats that they're proper Java objects, and simply treat min/max as `Comparable`
   

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);

Review comment:
       A few notes: 
   
   1. `getNestedFieldVal` is too heavyweight an operation, and we don't actually need it here (we don't need to parse the dot-path, etc). Instead we can directly access the field value and perform logical type conversion by invoking `convertValueForSpecificDataTypes` directly.
   2. The `fieldVal` we're getting should be of type `Comparable<?>`; otherwise we set it to null (since there's no plausible order we can impose on these)
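
A minimal sketch of the second note, assuming values without a natural order are simply dropped (set to null) and that nulls are skipped when merging. The class and method names are hypothetical and none of this is taken from the PR. It also shows why comparing native values rather than their string forms matters.

```java
import java.util.ArrayList;

final class ColumnRangeSketch {
  // Returns the value as a Comparable when it has a natural order, otherwise null.
  static Comparable<?> asComparableOrNull(Object value) {
    return (value instanceof Comparable) ? (Comparable<?>) value : null;
  }

  // Smaller of the two values; a null on either side is ignored.
  @SuppressWarnings({"rawtypes", "unchecked"})
  static Comparable<?> min(Comparable<?> current, Comparable<?> candidate) {
    if (current == null) {
      return candidate;
    }
    if (candidate == null) {
      return current;
    }
    return ((Comparable) candidate).compareTo(current) < 0 ? candidate : current;
  }

  // Larger of the two values; a null on either side is ignored.
  @SuppressWarnings({"rawtypes", "unchecked"})
  static Comparable<?> max(Comparable<?> current, Comparable<?> candidate) {
    if (current == null) {
      return candidate;
    }
    if (candidate == null) {
      return current;
    }
    return ((Comparable) candidate).compareTo(current) > 0 ? candidate : current;
  }

  public static void main(String[] args) {
    // Native integers order numerically.
    System.out.println(max(max(9, 10), 100));
    // The same values as strings order lexicographically, so "9" is the maximum.
    System.out.println(max(max("9", "10"), "100"));
    // A list has no natural order, so it contributes no min/max.
    System.out.println(asComparableOrNull(new ArrayList<String>()));
  }
}
```

Both arguments to `min` and `max` are assumed to be of the same type; mixing types would fail at runtime with a `ClassCastException`.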

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveUnion(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveUnion(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {

Review comment:
       Please check other comments for context

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveUnion(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveUnion(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;
+    }
+    if (schema.getLogicalType() == LogicalTypes.date()) {
+      return java.sql.Date.valueOf((val.toString()));
+    }
+    switch (schema.getType()) {
+      case RECORD:
+        final List<Schema.Field> schemaFields = schema.getFields();
+        final GenericRecord recordVal = (GenericRecord) val;
+        for (Schema.Field f : schemaFields) {
+          return convertToNativeJavaType(f.schema(), recordVal.get(f.name()));
+        }
+      case ARRAY:
+        Schema elementSchema = schema.getElementType();
+        List<Object> listRes = new ArrayList<>();
+        for (Object v : (List) val) {
+          listRes.add(convertToNativeJavaType(elementSchema, v));
+        }
+        return listRes.toString();
+      case UNION:
+        return convertToNativeJavaType(resolveUnion(schema), val);
+      case STRING:
+        return val.toString();
+      case BYTES:
+        return (ByteBuffer) val;
+      case INT:
+        return (Integer) val;
+      case LONG:
+        return (Long) val;
+      case FLOAT:
+        return (Float) val;
+      case DOUBLE:
+        return (Double) val;
+      case BOOLEAN:
+        return (Boolean) val;
+      case ENUM:
+      case MAP:
+      case FIXED:
+      case NULL:
+        // TODO: implement for above types based on logical types
+        return null;
+      default:
+        throw new IllegalStateException("Unexpected value: " + schema.getType());
+    }
+  }
+
+  /**
+   * Type-aware object comparison. Used to compare two objects for an Avro field.
+   */
+  public static int compare(Object o1, Object o2, Schema schema) {
+    if (Schema.Type.MAP.equals(schema.getType())) {
+      return ((Map) o1).equals(o2) ? 0 : 1;
+    }
+    return GenericData.get().compare(o1, o2, schema);
+  }
+
+  private static Schema resolveUnion(Schema fieldSchema) {

Review comment:
       There's already such a utility (in this file, actually): `resolveNullableSchema`

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Schema fieldSchema = getNestedFieldSchemaFromRecord((GenericRecord) record, field.name());
       // update stats
-      final int fieldSize = fieldVal == null ? 0 : fieldVal.length();
+      final int fieldSize = fieldVal == null ? 0 : StringUtils.objToString(fieldVal).length();

Review comment:
       I actually don't think this metric is useful at all without fixing the encoding we use for the payload. Is the encoding here Avro? If it's Avro, then the question is... why Avro?
   
   I'd suggest that before we try to implement some of these metrics, we start with the use case we have in mind for them and take it from there







[GitHub] [hudi] codope commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r832172618



##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);

Review comment:
       > Instead we can directly access the field value and perform logical type conversion by invoking convertValueForSpecificDataTypes directly.
   
   > As such my proposal is to limit min/max stats only for primitive types.
   
   Will do.
   
   > `fieldVal` we're getting should be of type `Comparable<?>`, otherwise we set it to null (since there's no plausible order we can impose on these
   
   > I don't think we need this method. Instead we should make sure at the place where we collect min/max stats that they're proper Java objects, and simply treat min/max as Comparable
   
   How do you get a `Comparable<?>` given that the record is an Avro `GenericRecord`? I can refactor so that we don't have to convert while merging stats, and hide the conversion. Still, we need to infer the Java type from the Avro type and do the conversion. What I am thinking is to add another method, say `convertValueToNativeJavaType`, in `HoodieAvroUtils`, along these lines:
   ```
   Comparable<?> convertValueToNativeJavaType(Schema fieldSchema, Object val) {
     Object v = convertValueForSpecificDataTypes(fieldSchema, val);
     return convertToNativeJavaType(fieldSchema, v);
   }
   
   // and then in HoodieTableMetadataUtil#aggregateColumnStats
   
   Comparable<?> fieldVal = convertValueToNativeJavaType(field.schema(), genericRecord.get(field.name()));
   ```
   







[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1075466123


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080595178


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080802841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     }, {
       "hash" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7464",
       "triggerID" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   * 91230b04ffc5c566806ddc7efe0e2f6024fb5948 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7464) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r831547314



##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveUnion(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveUnion(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {

Review comment:
       Please check other comments for context

Review comment:
       I don't think we need this method. Instead, we should make sure that the min/max stats are proper Java objects at the place where we collect them, and simply treat min/max as `Comparable`.
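
To illustrate the concern behind this comment: values compared as strings are ordered lexicographically, whereas native Java objects compared through `Comparable` follow their natural ordering. The following is only a minimal sketch; the `ColumnRangeSketch` class and its `min`/`max` helpers are hypothetical and are not part of Hudi.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ColumnRangeSketch {

  // Hypothetical helper: smallest value under the natural ordering of T.
  static <T extends Comparable<T>> T min(List<T> values) {
    return Collections.min(values);
  }

  // Hypothetical helper: largest value under the natural ordering of T.
  static <T extends Comparable<T>> T max(List<T> values) {
    return Collections.max(values);
  }

  public static void main(String[] args) {
    // The same three values, once as native integers and once as strings.
    List<Integer> typed = Arrays.asList(9, 10, 100);
    List<String> asStrings = Arrays.asList("9", "10", "100");

    // Numeric ordering gives the range [9, 100].
    System.out.println(min(typed) + " .. " + max(typed));
    // Lexicographic ordering gives the range ["10", "9"], which is wrong for numbers.
    System.out.println(min(asStrings) + " .. " + max(asStrings));
  }
}
```

If the stats are already collected as typed objects, a generic `Comparable` comparison like the one above is enough, and no schema-driven conversion step is needed at comparison time.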
   







[GitHub] [hudi] codope commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r832284664



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,91 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {
+    setTableName("hoodie_test")
+    initMetaClient()
+    val sourceJSONTablePath = getClass.getClassLoader.getResource("index/zorder/input-table-json").toString
+    val inputDF =
+    // NOTE: Schema here is provided for validation that the input date is in the appropriate format
+      spark.read
+        .schema(sourceTableSchema)
+        .json(sourceJSONTablePath)
+
+    val opts = Map(
+      "hoodie.insert.shuffle.parallelism" -> "4",
+      "hoodie.upsert.shuffle.parallelism" -> "4",
+      HoodieWriteConfig.TBL_NAME.key -> "hoodie_test",
+      RECORDKEY_FIELD.key -> "c1",
+      PRECOMBINE_FIELD.key -> "c1",
+      HoodieMetadataConfig.ENABLE.key -> "true",
+      HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS.key -> "true",
+      HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS_FOR_ALL_COLUMNS.key -> "true",
+      HoodieTableConfig.POPULATE_META_FIELDS.key -> "true"
+    )
+
+    inputDF.repartition(4)
+      .write
+      .format("hudi")
+      .options(opts)
+      .option(DataSourceWriteOptions.OPERATION.key, DataSourceWriteOptions.INSERT_OPERATION_OPT_VAL)
+      .option(HoodieStorageConfig.PARQUET_MAX_FILE_SIZE.key, 100 * 1024)
+      .mode(SaveMode.Overwrite)
+      .save(basePath)
+
+    metaClient = HoodieTableMetaClient.reload(metaClient)
+
+    val metadataTablePath = HoodieTableMetadata.getMetadataTableBasePath(basePath)
+
+    val targetColStatsIndexColumns = Seq(
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_FILE_NAME,
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_MIN_VALUE,
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_MAX_VALUE,
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_NULL_COUNT)
+
+    val requiredMetadataIndexColumns =
+      (targetColStatsIndexColumns :+ HoodieMetadataPayload.COLUMN_STATS_FIELD_COLUMN_NAME).map(colName =>
+        s"${HoodieMetadataPayload.SCHEMA_FIELD_ID_COLUMN_STATS}.${colName}")
+
+    // Read Metadata Table's Column Stats Index into Spark's [[DataFrame]]
+    val metadataTableDF = spark.read.format("org.apache.hudi")
+      .load(s"$metadataTablePath/${MetadataPartitionType.COLUMN_STATS.getPartitionPath}")
+
+    val colStatsDF = metadataTableDF.where(col(HoodieMetadataPayload.SCHEMA_FIELD_ID_COLUMN_STATS).isNotNull)
+      .select(requiredMetadataIndexColumns.map(col): _*)
+
+    // assert min/max for some columns
+    val minC1 = colStatsDF.select(HoodieMetadataPayload.COLUMN_STATS_FIELD_MIN_VALUE)

Review comment:
       Will do.







[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1074229102


   ## CI report:
   
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r832420211



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,91 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {

Review comment:
       Do we assert Column Stats correctness there?







[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1075399688


   ## CI report:
   
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   



[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073297587


   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   



[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073501604


   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073754782


   ## CI report:
   
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 UNKNOWN
   



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r831514560



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,91 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {

Review comment:
       In general, I think we should test against all operations that can modify the Column Stats index (commit, delta_commit, clean, compact, cluster) to make sure they leave it in a consistent state.

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,91 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {
+    setTableName("hoodie_test")
+    initMetaClient()
+    val sourceJSONTablePath = getClass.getClassLoader.getResource("index/zorder/input-table-json").toString
+    val inputDF =
+    // NOTE: Schema here is provided for validation that the input date is in the appropriate format
+      spark.read
+        .schema(sourceTableSchema)
+        .json(sourceJSONTablePath)
+
+    val opts = Map(
+      "hoodie.insert.shuffle.parallelism" -> "4",
+      "hoodie.upsert.shuffle.parallelism" -> "4",
+      HoodieWriteConfig.TBL_NAME.key -> "hoodie_test",
+      RECORDKEY_FIELD.key -> "c1",
+      PRECOMBINE_FIELD.key -> "c1",
+      HoodieMetadataConfig.ENABLE.key -> "true",
+      HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS.key -> "true",
+      HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS_FOR_ALL_COLUMNS.key -> "true",
+      HoodieTableConfig.POPULATE_META_FIELDS.key -> "true"
+    )
+
+    inputDF.repartition(4)
+      .write
+      .format("hudi")
+      .options(opts)
+      .option(DataSourceWriteOptions.OPERATION.key, DataSourceWriteOptions.INSERT_OPERATION_OPT_VAL)
+      .option(HoodieStorageConfig.PARQUET_MAX_FILE_SIZE.key, 100 * 1024)
+      .mode(SaveMode.Overwrite)
+      .save(basePath)
+
+    metaClient = HoodieTableMetaClient.reload(metaClient)
+
+    val metadataTablePath = HoodieTableMetadata.getMetadataTableBasePath(basePath)
+
+    val targetColStatsIndexColumns = Seq(
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_FILE_NAME,
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_MIN_VALUE,
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_MAX_VALUE,
+      HoodieMetadataPayload.COLUMN_STATS_FIELD_NULL_COUNT)
+
+    val requiredMetadataIndexColumns =
+      (targetColStatsIndexColumns :+ HoodieMetadataPayload.COLUMN_STATS_FIELD_COLUMN_NAME).map(colName =>
+        s"${HoodieMetadataPayload.SCHEMA_FIELD_ID_COLUMN_STATS}.${colName}")
+
+    // Read Metadata Table's Column Stats Index into Spark's [[DataFrame]]
+    val metadataTableDF = spark.read.format("org.apache.hudi")
+      .load(s"$metadataTablePath/${MetadataPartitionType.COLUMN_STATS.getPartitionPath}")
+
+    val colStatsDF = metadataTableDF.where(col(HoodieMetadataPayload.SCHEMA_FIELD_ID_COLUMN_STATS).isNotNull)
+      .select(requiredMetadataIndexColumns.map(col): _*)
+
+    // assert min/max for some columns
+    val minC1 = colStatsDF.select(HoodieMetadataPayload.COLUMN_STATS_FIELD_MIN_VALUE)

Review comment:
       Instead of selectively asserting min/max for individual columns, let's use the final table fixture and match it as a whole (as is done in `testZIndexTableComposition`).
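
The idea of matching the whole table against a fixture, rather than asserting individual cells, can be sketched in plain Java. Everything here is an assumption made for illustration: the `FixtureMatchSketch` class, the row layout (file name, column name, min, max, null count), and the sample values are hypothetical, and the real test would presumably compare Spark DataFrames instead of lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FixtureMatchSketch {

  // Flattens each row into one key and sorts the keys, so row order does not matter.
  static List<String> normalize(List<List<String>> rows) {
    List<String> keys = new ArrayList<>();
    for (List<String> row : rows) {
      keys.add(String.join("|", row));
    }
    Collections.sort(keys);
    return keys;
  }

  // True only if every row and every cell of the actual table matches the fixture.
  static boolean matchesFixture(List<List<String>> actual, List<List<String>> expected) {
    return normalize(actual).equals(normalize(expected));
  }

  public static void main(String[] args) {
    // Hypothetical rows: file name, column name, min, max, null count.
    List<List<String>> expected = Arrays.asList(
        Arrays.asList("file-1.parquet", "c1", "1", "50", "0"),
        Arrays.asList("file-2.parquet", "c1", "51", "100", "0"));
    List<List<String>> actual = Arrays.asList(
        Arrays.asList("file-2.parquet", "c1", "51", "100", "0"),
        Arrays.asList("file-1.parquet", "c1", "1", "50", "0"));

    System.out.println(matchesFixture(actual, expected));
  }
}
```

A whole-table comparison of this kind fails on any unexpected row or cell, including columns that a selective min/max assertion would never look at.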







[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073811521


   ## CI report:
   
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   



[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073502423


   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1075396244


   ## CI report:
   
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080414160


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073281768


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073754782


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073533101


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073281768


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073533101


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080799126


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     }, {
       "hash" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   * 91230b04ffc5c566806ddc7efe0e2f6024fb5948 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080989038


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     }, {
       "hash" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7464",
       "triggerID" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 91230b04ffc5c566806ddc7efe0e2f6024fb5948 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7464) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1074155056


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073282373


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e268f4ceb0883f1092ab468f466a90baf485d5c3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1073757791


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f5a13c08cc1d2c59a56f85702c455cd1d409fe79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119) 
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080595178


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080802841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     }, {
       "hash" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7464",
       "triggerID" : "91230b04ffc5c566806ddc7efe0e2f6024fb5948",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7448) 
   * 91230b04ffc5c566806ddc7efe0e2f6024fb5948 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7464) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r836991954



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,103 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {
+    setTableName("hoodie_test")
+    initMetaClient()
+    val sourceJSONTablePath = getClass.getClassLoader.getResource("index/zorder/input-table-json").toString

Review comment:
       Let's remove the dangling `z-order` references.

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -501,6 +502,109 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;

Review comment:
       Should return null
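
       To illustrate the concern, here is a minimal, JDK-only sketch. It assumes the hazard is that an empty-string sentinel gets compared against values of another type; the class and method names are hypothetical and not part of Hudi.

```java
import java.util.Comparator;

public class NullSentinelSketch {

  // Hypothetical helper (not a Hudi API): natural ordering with nulls sorted first.
  @SuppressWarnings({"unchecked", "rawtypes"})
  static int compareNullable(Comparable a, Comparable b) {
    Comparator<Comparable> natural = (x, y) -> x.compareTo(y);
    return Comparator.nullsFirst(natural).compare(a, b);
  }

  // Returns true if comparing an Integer against the "" sentinel throws.
  @SuppressWarnings({"unchecked", "rawtypes"})
  static boolean sentinelBreaksComparison() {
    Comparable value = Integer.valueOf(5);
    Comparable sentinel = ""; // stand-in for StringUtils.EMPTY_STRING
    try {
      value.compareTo(sentinel);
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(sentinelBreaksComparison());
    System.out.println(compareNullable(null, 5) < 0);
  }
}
```

       Returning null leaves the decision about how to order missing values with the caller, which can handle it explicitly (here via `Comparator.nullsFirst`).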

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -501,6 +502,109 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;
+    }
+    if (schema.getLogicalType() == LogicalTypes.date()) {
+      return java.sql.Date.valueOf((val.toString()));
+    }
+    switch (schema.getType()) {
+      case UNION:

Review comment:
       We need to do union-unfolding externally to this method (otherwise every branch will have to do it)

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -511,6 +512,132 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {

Review comment:
       This doesn't seem to be used anymore

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,103 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {
+    setTableName("hoodie_test")
+    initMetaClient()
+    val sourceJSONTablePath = getClass.getClassLoader.getResource("index/zorder/input-table-json").toString

Review comment:
       Actually, NVM. Will do in my cleanup PR

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1061,23 +1064,29 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
     }
 
     schema.getFields().forEach(field -> {
-      Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      Map<String, Object> columnStats = columnToStats.get(field.name());
+      GenericRecord genericRecord = (GenericRecord) record;
+      final Object fieldVal = convertValueForSpecificDataTypes(field.schema(), genericRecord.get(field.name()), consistentLogicalTimestampEnabled);
+      final Schema fieldSchema = getNestedFieldSchemaFromWriteSchema(genericRecord.getSchema(), field.name());
       // update stats
-      final int fieldSize = fieldVal == null ? 0 : fieldVal.length();
-      columnStats.put(TOTAL_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_SIZE, 0).toString()) + fieldSize);
-      columnStats.put(TOTAL_UNCOMPRESSED_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_UNCOMPRESSED_SIZE, 0).toString()) + fieldSize);
+      // NOTE: Unlike Parquet, Avro does not give the field size.
+      columnStats.put(TOTAL_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_SIZE, 0).toString()));
+      columnStats.put(TOTAL_UNCOMPRESSED_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_UNCOMPRESSED_SIZE, 0).toString()));
 
-      if (!isNullOrEmpty(fieldVal)) {
+      if (fieldVal != null) {
         // set the min value of the field
         if (!columnStats.containsKey(MIN)) {
           columnStats.put(MIN, fieldVal);
         }
-        if (fieldVal.compareTo(String.valueOf(columnStats.get(MIN))) < 0) {
+        if (compare(fieldVal, columnStats.get(MIN), fieldSchema) < 0) {
           columnStats.put(MIN, fieldVal);
         }
         // set the max value of the field
-        if (fieldVal.compareTo(String.valueOf(columnStats.getOrDefault(MAX, ""))) > 0) {
+        if (!columnStats.containsKey(MAX)) {
+          columnStats.put(MAX, fieldVal);
+        }
+        // set the max value of the field
+        if (compare(fieldVal, columnStats.get(MAX), fieldSchema) > 0) {

Review comment:
       This could also be `else if`
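
       A minimal sketch of the suggested `else if` shape, written against plain JDK types rather than the Hudi code in the diff. The `MIN`/`MAX` keys and the `update`/`rangeOf` helpers are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

public class MinMaxSketch {
  static final String MIN = "min";
  static final String MAX = "max";

  // Folds one value into the running min/max. The first value seeds both
  // bounds; later values are compared only when a bound already exists.
  static <T extends Comparable<T>> void update(Map<String, T> stats, T val) {
    if (val == null) {
      return; // nulls do not contribute to the range
    }
    if (!stats.containsKey(MIN)) {
      stats.put(MIN, val);
    } else if (val.compareTo(stats.get(MIN)) < 0) {
      stats.put(MIN, val);
    }
    if (!stats.containsKey(MAX)) {
      stats.put(MAX, val);
    } else if (val.compareTo(stats.get(MAX)) > 0) {
      stats.put(MAX, val);
    }
  }

  // Convenience wrapper for the example: computes the range of some ints.
  static Map<String, Integer> rangeOf(int... values) {
    Map<String, Integer> stats = new HashMap<>();
    for (int v : values) {
      update(stats, v);
    }
    return stats;
  }

  public static void main(String[] args) {
    System.out.println(rangeOf(20, 8, 111));
  }
}
```

       With `else if`, a freshly seeded bound is not compared against itself, which is the redundant comparison the plain `if` version performs.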

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,91 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {

Review comment:
       Somehow can't resolve my own comment. This is taken care of

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -501,6 +502,109 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {

Review comment:
       Should accept `Comparable`

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -501,6 +502,109 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;
+    }
+    if (schema.getLogicalType() == LogicalTypes.date()) {
+      return java.sql.Date.valueOf((val.toString()));
+    }
+    switch (schema.getType()) {
+      case UNION:
+        return convertToNativeJavaType(resolveNullableSchema(schema), val);
+      case STRING:
+        return val.toString();
+      case BYTES:
+        return (ByteBuffer) val;
+      case INT:
+        return (Integer) val;
+      case LONG:
+        return (Long) val;
+      case FLOAT:
+        return (Float) val;
+      case DOUBLE:
+        return (Double) val;
+      case BOOLEAN:
+        return (Boolean) val;
+      case ENUM:
+      case MAP:
+      case FIXED:
+      case NULL:
+      case RECORD:
+      case ARRAY:
+        return null;
+      default:
+        throw new IllegalStateException("Unexpected type: " + schema.getType());
+    }
+  }
+
+  /**
+   * Type-aware object comparison. Used to compare two objects for an Avro field.
+   */
+  public static int compare(Object o1, Object o2, Schema schema) {
+    if (Schema.Type.MAP.equals(schema.getType())) {

Review comment:
       We don't need this conditional

##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -501,6 +502,109 @@ public static Object getNestedFieldVal(GenericRecord record, String fieldName, b
     }
   }
 
+  /**
+   * Get schema for the given field and record. Field can be nested, denoted by dot notation. e.g: a.b.c
+   *
+   * @param record    - record containing the value of the given field
+   * @param fieldName - name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromRecord(GenericRecord record, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    GenericRecord valueNode = record;
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Object val = valueNode.get(part);
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(valueNode.getSchema().getField(part).schema());
+      } else {
+        if (!(val instanceof GenericRecord)) {
+          throw new HoodieException("Cannot find a record at part value :" + part);
+        }
+        valueNode = (GenericRecord) val;
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+
+  /**
+   * Get schema for the given field and write schema. Field can be nested, denoted by dot notation. e.g: a.b.c
+   * Use this method when record is not available. Otherwise, prefer to use {@link #getNestedFieldSchemaFromRecord(GenericRecord, String)}
+   *
+   * @param writeSchema - write schema of the record
+   * @param fieldName   -  name of the field
+   * @return
+   */
+  public static Schema getNestedFieldSchemaFromWriteSchema(Schema writeSchema, String fieldName) {
+    String[] parts = fieldName.split("\\.");
+    int i = 0;
+    for (; i < parts.length; i++) {
+      String part = parts[i];
+      Schema schema = writeSchema.getField(part).schema();
+
+      if (i == parts.length - 1) {
+        return resolveNullableSchema(schema);
+      }
+    }
+    throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldName);
+  }
+
+  /**
+   * Given a field schema, convert its value to native Java type.
+   *
+   * @param schema - field schema
+   * @param val    - field value
+   * @return
+   */
+  public static Comparable<?> convertToNativeJavaType(Schema schema, Object val) {
+    if (val == null) {
+      return StringUtils.EMPTY_STRING;
+    }
+    if (schema.getLogicalType() == LogicalTypes.date()) {
+      return java.sql.Date.valueOf((val.toString()));
+    }
+    switch (schema.getType()) {
+      case UNION:
+        return convertToNativeJavaType(resolveNullableSchema(schema), val);
+      case STRING:
+        return val.toString();
+      case BYTES:
+        return (ByteBuffer) val;
+      case INT:
+        return (Integer) val;
+      case LONG:
+        return (Long) val;
+      case FLOAT:
+        return (Float) val;
+      case DOUBLE:
+        return (Double) val;
+      case BOOLEAN:
+        return (Boolean) val;
+      case ENUM:
+      case MAP:
+      case FIXED:
+      case NULL:
+      case RECORD:
+      case ARRAY:
+        return null;
+      default:
+        throw new IllegalStateException("Unexpected type: " + schema.getType());
+    }
+  }
+
+  /**
+   * Type-aware object comparison. Used to compare two objects for an Avro field.
+   */
+  public static int compare(Object o1, Object o2, Schema schema) {

Review comment:
       Should also accept `Comparable`
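
       As a rough illustration of why the operand type matters here, the sketch below compares the same values once as strings and once as integers. `compareTyped` is a hypothetical helper that accepts `Comparable` operands, not the signature used in the PR.

```java
public class TypedCompareSketch {

  // Hypothetical helper: delegates to the natural ordering of the operands.
  // Both operands are assumed to be non-null and of the same runtime type.
  @SuppressWarnings({"unchecked", "rawtypes"})
  static int compareTyped(Comparable a, Comparable b) {
    return a.compareTo(b);
  }

  public static void main(String[] args) {
    // Lexicographic: "111" sorts before "8" because '1' < '8'.
    System.out.println(compareTyped("111", "8") < 0);
    // Numeric: 111 is greater than 8.
    System.out.println(compareTyped(111, 8) > 0);
  }
}
```

       The string ordering matches the `"minValue":"111","maxValue":"8"` entry for `_hoodie_record_key` in the test fixture quoted in this thread.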

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1061,23 +1064,29 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
     }
 
     schema.getFields().forEach(field -> {
-      Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      Map<String, Object> columnStats = columnToStats.get(field.name());
+      GenericRecord genericRecord = (GenericRecord) record;
+      final Object fieldVal = convertValueForSpecificDataTypes(field.schema(), genericRecord.get(field.name()), consistentLogicalTimestampEnabled);
+      final Schema fieldSchema = getNestedFieldSchemaFromWriteSchema(genericRecord.getSchema(), field.name());
       // update stats
-      final int fieldSize = fieldVal == null ? 0 : fieldVal.length();
-      columnStats.put(TOTAL_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_SIZE, 0).toString()) + fieldSize);
-      columnStats.put(TOTAL_UNCOMPRESSED_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_UNCOMPRESSED_SIZE, 0).toString()) + fieldSize);
+      // NOTE: Unlike Parquet, Avro does not give the field size.
+      columnStats.put(TOTAL_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_SIZE, 0).toString()));
+      columnStats.put(TOTAL_UNCOMPRESSED_SIZE, Long.parseLong(columnStats.getOrDefault(TOTAL_UNCOMPRESSED_SIZE, 0).toString()));
 
-      if (!isNullOrEmpty(fieldVal)) {
+      if (fieldVal != null) {
         // set the min value of the field
         if (!columnStats.containsKey(MIN)) {
           columnStats.put(MIN, fieldVal);
         }
-        if (fieldVal.compareTo(String.valueOf(columnStats.get(MIN))) < 0) {
+        if (compare(fieldVal, columnStats.get(MIN), fieldSchema) < 0) {

Review comment:
       Let's make this `else if`

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1032,11 +1035,11 @@ public static void accumulateColumnRanges(Schema.Field field, String filePath,
                                             Map<String, HoodieColumnRangeMetadata<Comparable>> columnRangeMap,

Review comment:
       Let's move `accumulateColumnRanges` closer to where it's actually used (into `HoodieAppendHandle`)

##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -2026,11 +2027,57 @@ private void validateMetadata(SparkRDDWriteClient testClient) throws IOException
       assertTrue(latestSlices.size()
           <= (numFileVersions * metadataEnabledPartitionTypes.get(partition).getFileGroupCount()), "Should limit file slice to "
           + numFileVersions + " per file group, but was " + latestSlices.size());
+      List<HoodieLogFile> logFiles = latestSlices.get(0).getLogFiles().collect(Collectors.toList());
+      try {
+        if (MetadataPartitionType.FILES.getPartitionPath().equals(partition)) {
+          verifyMetadataRawRecords(table, logFiles, false);
+        }
+        if (MetadataPartitionType.COLUMN_STATS.getPartitionPath().equals(partition)) {
+          verifyMetadataColumnStatsRecords(logFiles);
+        }
+      } catch (IOException e) {
+        LOG.error("Metadata record validation failed", e);
+        fail("Metadata record validation failed");
+      }
     });
 
     LOG.info("Validation time=" + timer.endTimer());
   }
 
+  private void verifyMetadataColumnStatsRecords(List<HoodieLogFile> logFiles) throws IOException {
+    for (HoodieLogFile logFile : logFiles) {
+      FileStatus[] fsStatus = fs.listStatus(logFile.getPath());
+      MessageType writerSchemaMsg = TableSchemaResolver.readSchemaFromLogFile(fs, logFile.getPath());
+      if (writerSchemaMsg == null) {
+        // not a data block
+        continue;
+      }
+
+      Schema writerSchema = new AvroSchemaConverter().convert(writerSchemaMsg);
+      HoodieLogFormat.Reader logFileReader = HoodieLogFormat.newReader(fs, new HoodieLogFile(fsStatus[0].getPath()), writerSchema);
+
+      while (logFileReader.hasNext()) {
+        HoodieLogBlock logBlock = logFileReader.next();
+        if (logBlock instanceof HoodieDataBlock) {
+          try (ClosableIterator<IndexedRecord> recordItr = ((HoodieDataBlock) logBlock).getRecordItr()) {
+            recordItr.forEachRemaining(indexRecord -> {
+              final GenericRecord record = (GenericRecord) indexRecord;
+              final GenericRecord colStatsRecord = (GenericRecord) record.get(HoodieMetadataPayload.SCHEMA_FIELD_ID_COLUMN_STATS);
+              assertNotNull(colStatsRecord);
+              assertNotNull(colStatsRecord.get(HoodieMetadataPayload.COLUMN_STATS_FIELD_COLUMN_NAME));
+              assertNotNull(colStatsRecord.get(HoodieMetadataPayload.COLUMN_STATS_FIELD_NULL_COUNT));
+              /**
+               * TODO: some types of field may have null min/max as these statistics are only supported for primitive types

Review comment:
       Comment is rather misleading: min/max stats could be null, but they are supported for composite types as well (a composite payload would be converted to a byte string and compared as such).

##########
File path: hudi-spark-datasource/hudi-spark/src/test/resources/index/zorder/update-column-stats-index-table.json
##########
@@ -0,0 +1,26 @@
+{"minValue":"0","maxValue":"959","nullCount":0,"columnName":"c1"}
+{"minValue":"64.768","maxValue":"979.272","nullCount":0,"columnName":"c3"}
+{"minValue":"20220328202039002","maxValue":"20220328202039002","nullCount":0,"columnName":"_hoodie_commit_time"}
+{"minValue":"20220328202039002_0_41","maxValue":"20220328202039002_0_80","nullCount":0,"columnName":"_hoodie_commit_seqno"}
+{"minValue":"20220328202022669_0_1","maxValue":"20220328202022669_0_9","nullCount":0,"columnName":"_hoodie_commit_seqno"}
+{"minValue":"2020-01-01","maxValue":"2020-11-21","nullCount":0,"columnName":"c6"}
+{"minValue":"2020-01-01","maxValue":"2020-11-22","nullCount":0,"columnName":"c6"}
+{"minValue":" 111sdc","maxValue":" 8sdc","nullCount":0,"columnName":"c2"}
+{"minValue":"1637307284159000","maxValue":"1637307284201000","nullCount":0,"columnName":"c4"}
+{"minValue":"10deb9bc-f7b0-4c5c-8cd4-eccb92788c8c-0_0-70-115_20220328202039002.parquet","maxValue":"10deb9bc-f7b0-4c5c-8cd4-eccb92788c8c-0_0-70-115_20220328202039002.parquet","nullCount":0,"columnName":"_hoodie_file_name"}
+{"minValue":"19.000","maxValue":"994.355","nullCount":0,"columnName":"c3"}
+{"minValue":" 0sdc","maxValue":" 959sdc","nullCount":0,"columnName":"c2"}
+{"minValue":"8","maxValue":"770","nullCount":0,"columnName":"c1"}
+{"minValue":"b8dc14a9-c067-4b9c-9d29-eaa09bd35c4b-0_0-23-37_20220328202022669.parquet","maxValue":"b8dc14a9-c067-4b9c-9d29-eaa09bd35c4b-0_0-23-37_20220328202022669.parquet","nullCount":0,"columnName":"_hoodie_file_name"}
+{"minValue":"20220328202022669","maxValue":"20220328202022669","nullCount":0,"columnName":"_hoodie_commit_time"}
+{"minValue":"1637383255339000","maxValue":"1637383255550000","nullCount":0,"columnName":"c4"}
+{"minValue":"1","maxValue":"97","nullCount":0,"columnName":"c5"}
+{"minValue":"0","maxValue":"959","nullCount":0,"columnName":"_hoodie_record_key"}
+{"minValue":"9","maxValue":"9","nullCount":0,"columnName":"c8"}
+{"minValue":"9","maxValue":"9","nullCount":0,"columnName":"c8"}
+{"minValue":"","maxValue":"","nullCount":0,"columnName":"_hoodie_partition_path"}
+{"minValue":"111","maxValue":"8","nullCount":0,"columnName":"_hoodie_record_key"}
+{"minValue":"2","maxValue":"78","nullCount":0,"columnName":"c5"}
+{"minValue":"","maxValue":"","nullCount":0,"columnName":"_hoodie_partition_path"}
+{"minValue":"java.nio.HeapByteBuffer[pos=0 lim=1 cap=1]","maxValue":"java.nio.HeapByteBuffer[pos=0 lim=1 cap=1]","nullCount":0,"columnName":"c7"}

Review comment:
       How does `"minValue":"java.nio.HeapByteBuffer[pos=0 lim=1 cap=1]"` pass the assertion test?
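
       One possible explanation, offered as an assumption rather than something confirmed in this thread: if the expected and actual values are both produced by calling `toString()` on a `ByteBuffer`, they will match regardless of content, because the string only reports position, limit and capacity. A JDK-only sketch (class and method names are hypothetical):

```java
import java.nio.ByteBuffer;

public class ByteBufferToStringSketch {

  // Renders a one-byte heap buffer the way a naive toString() conversion would.
  static String describe(byte content) {
    return ByteBuffer.wrap(new byte[] {content}).toString();
  }

  // Compares the actual bytes of two one-byte buffers.
  static boolean contentDiffers(byte a, byte b) {
    return ByteBuffer.wrap(new byte[] {a}).compareTo(ByteBuffer.wrap(new byte[] {b})) != 0;
  }

  public static void main(String[] args) {
    System.out.println(describe((byte) 1));
    System.out.println(describe((byte) 9));
    System.out.println(contentDiffers((byte) 1, (byte) 9));
  }
}
```

       Both `describe` calls yield `java.nio.HeapByteBuffer[pos=0 lim=1 cap=1]` even though the buffers hold different bytes, so an assertion on that string says nothing about the stored min/max values.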







[GitHub] [hudi] codope commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1084919253


   Closing it in favor of #5181 





[GitHub] [hudi] codope closed pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
codope closed pull request #5077:
URL: https://github.com/apache/hudi/pull/5077


   





[GitHub] [hudi] codope commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r832286053



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
##########
@@ -63,6 +69,91 @@ class TestColumnStatsIndex extends HoodieClientTestBase {
     cleanupSparkContexts()
   }
 
+  @Test
+  def testMetadataColumnStatsIndex(): Unit = {

Review comment:
       There is already a test for metadata index with table operations in `TestHoodieBackedMetadata#testTableOperationsWithMetadataIndex`







[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1074158121


   ## CI report:
   
   * 8a8f3c421836a055b4b5917dc58f240d10e11820 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128) 
   * 4c79f3df666bdf2c3d1e90d930f6f89fb991afcc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codope commented on a change in pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r832179099



##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Schema fieldSchema = getNestedFieldSchemaFromRecord((GenericRecord) record, field.name());
       // update stats
-      final int fieldSize = fieldVal == null ? 0 : fieldVal.length();
+      final int fieldSize = fieldVal == null ? 0 : StringUtils.objToString(fieldVal).length();

Review comment:
       This method is used for merging column stats when handling updates for delta commits. In this case, the records are encoded in Avro by design.
   I agree with your point about size. While there is no immediate use case for size, it might be helpful in the future, e.g. for skewness. Since we have the right size for base files, do you think we can set it to 0 here for now?
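
   For context on why the diff above switches from `String` to `Object` values, here is a minimal, self-contained sketch of the ordering problem. It is not the Hudi implementation: `ColumnRangeSketch`, `update`, and the `"min"`/`"max"` keys are hypothetical names chosen for illustration, and the sketch assumes values are already native `Comparable` Java types.

```java
import java.util.HashMap;
import java.util.Map;

public class ColumnRangeSketch {
    // Hypothetical stat keys, used only in this sketch.
    static final String MIN = "min";
    static final String MAX = "max";

    // Hypothetical helper: fold one value into running min/max stats
    // using the value's natural ordering (Comparable#compareTo).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static void update(Map<String, Object> stats, Comparable value) {
        if (value == null) {
            return; // nulls do not contribute to the range
        }
        Comparable min = (Comparable) stats.get(MIN);
        Comparable max = (Comparable) stats.get(MAX);
        if (min == null || value.compareTo(min) < 0) {
            stats.put(MIN, value);
        }
        if (max == null || value.compareTo(max) > 0) {
            stats.put(MAX, value);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> asStrings = new HashMap<>();
        Map<String, Object> asLongs = new HashMap<>();
        for (long v : new long[] {9L, 10L, 100L}) {
            update(asStrings, String.valueOf(v)); // lexicographic ordering
            update(asLongs, v);                   // numeric ordering
        }
        // Lexicographically "10" < "100" < "9", so the string range is wrong.
        System.out.println("string min/max: " + asStrings.get(MIN) + " / " + asStrings.get(MAX));
        System.out.println("long   min/max: " + asLongs.get(MIN) + " / " + asLongs.get(MAX));
    }
}
```

   With string values the range for 9, 10, 100 comes out as min "10" and max "9", while the native `Long` values give min 9 and max 100, which is the behavior the type conversion is meant to restore.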







[GitHub] [hudi] hudi-bot commented on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1080408730


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2ef5fd42900ec5ca2ef5d2f6db9e209067518784",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   * 2ef5fd42900ec5ca2ef5d2f6db9e209067518784 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #5077: [HUDI-3664] Handle type conversion for comparison of column range metadata

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#issuecomment-1075466123


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7108",
       "triggerID" : "e268f4ceb0883f1092ab468f466a90baf485d5c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7119",
       "triggerID" : "f5a13c08cc1d2c59a56f85702c455cd1d409fe79",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7128",
       "triggerID" : "8a8f3c421836a055b4b5917dc58f240d10e11820",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7136",
       "triggerID" : "4c79f3df666bdf2c3d1e90d930f6f89fb991afcc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192",
       "triggerID" : "a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a9b90e1d953448644d5a84bb664b2a8e4bfc1d3b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7192) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

