Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/11/17 07:12:41 UTC

[GitHub] [hudi] xiarixiaoyao opened a new pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add more…

xiarixiaoyao opened a new pull request #4013:
URL: https://github.com/apache/hudi/pull/4013


   … docs for z-order.
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   1. Optimize the logic of Parquet statistics collection.
   2. Add some docs for z-order.
   3. Add more unit tests.
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971298376


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r751816052



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       Parquet cannot collect statistics for certain fields; for example, it does not collect statistics for the timestamp type. How about throwing an exception directly in that case?
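
The two options under discussion can be sketched side by side. This is a minimal illustration, not Hudi code: the class `NullCountSketch` and both method names are hypothetical, and it assumes that `-1` is the sentinel meaning "no null count was collected", which is how the `numNulls == -1 ? 0 : numNulls` line in the diff reads.

```java
/** Hypothetical helper for illustration only; not part of Hudi. */
final class NullCountSketch {
    // Assumption: -1 signals that the null count was not collected.
    static final long UNKNOWN = -1L;

    /** Behaviour in the diff: silently treat a missing null count as 0. */
    static long lenient(long numNulls) {
        return numNulls == UNKNOWN ? 0L : numNulls;
    }

    /** Alternative raised in this comment: fail fast when statistics are missing. */
    static long strict(String columnName, long numNulls) {
        if (numNulls == UNKNOWN) {
            throw new IllegalStateException(
                "No statistics collected for column: " + columnName);
        }
        return numNulls;
    }

    public static void main(String[] args) {
        System.out.println(lenient(UNKNOWN));   // missing count replaced by 0
        System.out.println(strict("price", 3)); // known count passes through
    }
}
```

The lenient variant keeps the pipeline running but makes "unknown" indistinguishable from "no nulls"; the strict variant surfaces the unsupported column type to the caller.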







[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r751938029



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/DataSkippingUtils.scala
##########
@@ -179,7 +179,7 @@ object DataSkippingUtils {
   def getIndexFiles(conf: Configuration, indexPath: String): Seq[FileStatus] = {
     val basePath = new Path(indexPath)
     basePath.getFileSystem(conf)
-      .listStatus(basePath).filterNot(f => f.getPath.getName.endsWith(".parquet"))
+      .listStatus(basePath).filter(f => f.getPath.getName.endsWith(".parquet"))
   }

Review comment:
       Very sorry about that.
   My local code uses `filter`; it was mistakenly written as `filterNot` when I submitted the change.
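
To make the difference concrete, here is a small sketch in Java rather than Scala. The class `IndexFileFilterSketch` and its method names are hypothetical, and plain file-name strings stand in for Hadoop `FileStatus` entries: `filter` keeps the names that match the predicate, while the `filterNot` form keeps exactly the complement.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration; file names stand in for FileStatus entries. */
final class IndexFileFilterSketch {
    /** Corrected behaviour: keep only names ending in ".parquet". */
    static List<String> parquetOnly(List<String> names) {
        return names.stream()
            .filter(n -> n.endsWith(".parquet"))
            .collect(Collectors.toList());
    }

    /** What filterNot did: drop the ".parquet" names and keep everything else. */
    static List<String> nonParquet(List<String> names) {
        return names.stream()
            .filter(n -> !n.endsWith(".parquet"))
            .collect(Collectors.toList());
    }
}
```

With the negated predicate, a directory holding only Parquet index files would yield an empty listing.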







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975382004


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 UNKNOWN
   





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976121409


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608",
       "triggerID" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3614",
       "triggerID" : "976120008",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3614) 
   





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975470208


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 UNKNOWN
   





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975477611


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975630508


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753737063



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       If we cannot support indexing timestamp columns yet, we should throw an exception explicitly.
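
The code comment in the diff above notes that decimal statistics are collected as int/binary values, so the raw min/max cannot be used directly. A minimal sketch of why a string form helps follows. The class `DecimalStatSketch` is hypothetical and not part of Hudi; it assumes the statistic holds the unscaled value and that the scale is known from the column type.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical illustration; not the PR's implementation. */
final class DecimalStatSketch {
    /** Rebuilds the logical decimal from an integer-backed unscaled statistic. */
    static String asString(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale).toPlainString();
    }

    /** Same idea for binary-backed decimals: big-endian two's-complement bytes. */
    static String asString(byte[] unscaledBytes, int scale) {
        return new BigDecimal(new BigInteger(unscaledBytes), scale).toPlainString();
    }
}
```

For a column with scale 2, the raw statistic `12345` stands for `123.45`; comparing the raw integer against a query literal without applying the scale would give wrong results.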

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestOptimizeTable.scala
##########
@@ -88,19 +88,28 @@ class TestOptimizeTable extends HoodieClientTestBase {
       .save(basePath)
 
     assertEquals(1000, spark.read.format("hudi").load(basePath).count())
+    // use unsorted col as filter.
     assertEquals(1000,
-      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true").format("hudi").load(basePath).count())
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("end_lat >= 0").count())
+    // use sorted col as filter.
+    assertEquals(1000,
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("begin_lon >= 0").count())
   }
 
   @Test
   def testCollectMinMaxStatistics(): Unit = {
     val testPath = new Path(System.getProperty("java.io.tmpdir"), "minMax")
     val statisticPath = new Path(System.getProperty("java.io.tmpdir"), "stat")
     val fs = testPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
+    val complexDataFrame = createComplexDataFrame(spark)
+    complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
+    val df = spark.read.load(testPath.toString)
     try {
-      val complexDataFrame = createComplexDataFrame(spark)
-      complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
-      val df = spark.read.load(testPath.toString)
+      // test z-order sort for all primitive type, should not throw error.

Review comment:
       throw exception
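
The test change above checks that enabling data skipping does not change the row count for a `>=` filter. The pruning idea behind that can be sketched as follows, in Java rather than Scala. The class `DataSkippingSketch`, its nested `Range` type, and the sample values are hypothetical, and this is not how `DataSkippingUtils` is implemented: a file can be skipped for `col >= bound` only when its recorded maximum is below the bound.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of min/max based file pruning. */
final class DataSkippingSketch {
    /** Per-file value range for one column. */
    static final class Range {
        final String file;
        final double min;
        final double max;

        Range(String file, double min, double max) {
            this.file = file;
            this.min = min;
            this.max = max;
        }
    }

    /** Files that may hold rows with value >= bound; the rest are safe to skip. */
    static List<String> candidatesForGte(List<Range> ranges, double bound) {
        List<String> out = new ArrayList<>();
        for (Range r : ranges) {
            if (r.max >= bound) {
                out.add(r.file);
            }
        }
        return out;
    }
}
```

Pruning of this kind may keep files that turn out to hold no matching rows, but it must never drop a file that could match, which is why the filtered count is expected to stay the same with skipping on.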








[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975058276


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975058276


   ## CI report:
   
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin edited a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
alexeykudinkin edited a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974745454


   @xiarixiaoyao thanks for addressing the issues! 
   
   After our testing, we've also tried to squash some bugs in https://github.com/apache/hudi/pull/4026 and https://github.com/apache/hudi/pull/4060. Unfortunately, I didn't see this PR earlier and hence re-addressed some of the same issues (as well as some others) that you're touching here.
   
   I would very much appreciate your feedback on the aforementioned PRs.





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753737063



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       If we cannot support indexing timestamp columns yet, we should throw an exception explicitly.
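
To illustrate the suggestion above, here is a minimal sketch of failing loudly on an unsupported column type instead of silently producing unusable statistics. The class `ColumnStatsTypeGuard`, the method `requireSupported`, and the list of supported type names are all hypothetical and chosen for illustration only; they are not taken from the Hudi code base.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Hudi: rejects column types that the
// statistics index is assumed not to handle yet.
final class ColumnStatsTypeGuard {
  // Assumed set of supported type names, chosen only for this sketch.
  private static final Set<String> SUPPORTED_TYPES = new HashSet<>(
      Arrays.asList("int", "long", "float", "double", "string", "decimal", "date"));

  private ColumnStatsTypeGuard() {
  }

  // Returns the type name unchanged when it is supported; otherwise throws,
  // so the caller learns about the limitation at write time.
  static String requireSupported(String columnName, String typeName) {
    if (!SUPPORTED_TYPES.contains(typeName)) {
      throw new UnsupportedOperationException(
          "Column '" + columnName + "' has type '" + typeName
              + "', which is not supported for indexing yet");
    }
    return typeName;
  }
}
```

With a guard like this, a call such as `ColumnStatsTypeGuard.requireSupported("ts", "timestamp")` would raise an `UnsupportedOperationException` rather than letting the column reach the index with values that cannot be compared correctly.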







[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753758254



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestOptimizeTable.scala
##########
@@ -88,19 +88,28 @@ class TestOptimizeTable extends HoodieClientTestBase {
       .save(basePath)
 
     assertEquals(1000, spark.read.format("hudi").load(basePath).count())
+    // use unsorted col as filter.
     assertEquals(1000,
-      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true").format("hudi").load(basePath).count())
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("end_lat >= 0").count())
+    // use sorted col as filter.
+    assertEquals(1000,
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("begin_lon >= 0").count())
   }
 
   @Test
   def testCollectMinMaxStatistics(): Unit = {
     val testPath = new Path(System.getProperty("java.io.tmpdir"), "minMax")
     val statisticPath = new Path(System.getProperty("java.io.tmpdir"), "stat")
     val fs = testPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
+    val complexDataFrame = createComplexDataFrame(spark)
+    complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
+    val df = spark.read.load(testPath.toString)
     try {
-      val complexDataFrame = createComplexDataFrame(spark)
-      complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
-      val df = spark.read.load(testPath.toString)
+      // test z-order sort for all primitive type, should not throw error.

Review comment:
       fixed.







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974773010


   ## CI report:
   
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>








[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975444071


   ## CI report:
   
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975444071


   ## CI report:
   
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976097406


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975479551


   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753738339



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -58,8 +68,12 @@ public T getMaxValue() {
     return this.maxValue;
   }
 
-  public PrimitiveStringifier getStringifier() {
-    return stringifier;
+  public String getMaxValueAsString() {

Review comment:
       I believe we should handle all type conversions at the stage of extracting these statistics from Parquet, so that users of `HoodieColumnRangeMetadata` never need to do the conversion themselves and get native Java types out of the box.
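
As a rough sketch of what converting at extraction time could look like: Parquet stores DECIMAL values as an unscaled integer (int32, int64, or big-endian two's-complement bytes) plus a scale, and DATE values as days since the Unix epoch. The class `RawStatConverter` and its method names below are hypothetical, use only the JDK, and are not the approach taken in this PR or in Hudi.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.time.LocalDate;

// Hypothetical converter, not part of Hudi: turns raw physical statistics
// values into native Java types once, at extraction time.
final class RawStatConverter {
  private RawStatConverter() {
  }

  // DECIMAL backed by int32/int64: the raw value is the unscaled integer.
  static BigDecimal decimalFromUnscaledLong(long unscaled, int scale) {
    return BigDecimal.valueOf(unscaled, scale);
  }

  // DECIMAL backed by binary: bytes are a big-endian two's-complement unscaled integer.
  static BigDecimal decimalFromBytes(byte[] bigEndianUnscaled, int scale) {
    return new BigDecimal(new BigInteger(bigEndianUnscaled), scale);
  }

  // DATE backed by int32: the raw value is the number of days since 1970-01-01.
  static LocalDate dateFromEpochDays(int daysSinceEpoch) {
    return LocalDate.ofEpochDay(daysSinceEpoch);
  }
}
```

Under this design, a range metadata object would hold `BigDecimal` or `LocalDate` min/max values directly, so separate string-typed fields for the same information would likely be unnecessary.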







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976116824


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608",
       "triggerID" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975518713


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   * 6da299122f9661fa711afc3e4f47b1be26a08027 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971293363


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 UNKNOWN
   



[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r751261387



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent its original value.
+  // e.g. when parquet collects column information, the decimal type is collected as int/binary type,
+  // so we cannot use minValue and maxValue directly; use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       In which case would `numNulls == -1` here?
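
For context on the question above, here is a minimal sketch of what the ternary in the diff does in isolation. The names `NumNullsSketch`, `UNKNOWN_NUM_NULLS` and `normalizeNumNulls` are hypothetical and not part of Hudi. Treating `-1` as a "null count not recorded" sentinel is an assumption (my understanding is that Parquet column statistics report an unset null count this way); the thread does not confirm it.

```java
final class NumNullsSketch {
    // Assumed sentinel: the writer did not record a null count for this column.
    static final long UNKNOWN_NUM_NULLS = -1L;

    // Mirrors the expression `numNulls == -1 ? 0 : numNulls` from the diff.
    static long normalizeNumNulls(long numNulls) {
        return numNulls == UNKNOWN_NUM_NULLS ? 0L : numNulls;
    }

    public static void main(String[] args) {
        // An unknown count and a genuine zero both come out as 0.
        System.out.println(normalizeNumNulls(-1L));
        System.out.println(normalizeNumNulls(0L));
        System.out.println(normalizeNumNulls(7L));
    }
}
```

After this normalization an unknown count and a true zero can no longer be told apart, which appears to be what the question is probing.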







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974767804


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   * e95361f7e109251511059817b7bc12591cd1671a UNKNOWN
   



[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974767554


   @leesf Addressed all comments. Thanks.





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975060958


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   



[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975161713


   @vinothchandar @leesf @alexeykudinkin Could we merge this patch to master?
   It can solve most of the problems in #4026 and #4060.





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754042709



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -142,6 +142,16 @@
       .sinceVersion("0.9.0")
       .withDocumentation("When rewriting data, preserves existing hoodie_commit_time");
 
+  /**
+   * Use space-filling curves to optimize the layout of the table and boost query performance.
+   * Table data sorted by a space-filling curve has better aggregation; combined with min-max filtering, it can achieve a good performance improvement.

Review comment:
       nit: split into two lines.
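
To illustrate the idea in the Javadoc quoted above, here is a minimal sketch of a Z-order (Morton) key for two non-negative 32-bit integers. It is a generic illustration of one space-filling curve, not Hudi's implementation; `ZOrderSketch`, `spread` and `interleave` are hypothetical names.

```java
final class ZOrderSketch {
    // Spreads the low 32 bits of v so that input bit i lands on output bit 2*i.
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    // Bits of x occupy the even positions of the key, bits of y the odd positions.
    // Assumes x and y are non-negative; negative values would need an order-preserving remap first.
    static long interleave(int x, int y) {
        return spread(x) | (spread(y) << 1);
    }

    public static void main(String[] args) {
        // Points that are close in both dimensions get nearby keys.
        System.out.println(interleave(2, 3));
        System.out.println(interleave(3, 3));
    }
}
```

Sorting rows by such a key tends to place rows that are close in both columns into the same files, so the per-file min/max ranges of both columns stay narrow. Min-max filtering relies on this to skip files.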







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971293363


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 UNKNOWN
   



[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976100635


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608",
       "triggerID" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974773010


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   



[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r752297785



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent its original value.
+  // e.g. when parquet collects column information, the decimal type is collected as int/binary type,
+  // so we cannot use minValue and maxValue directly; use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       So in this case `numNulls` will be -1? And what is the impact of setting `numNulls` to 0: will query performance be affected?
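
One way to reason about this question is the sketch below, which rests on assumptions and does not describe Hudi's data-skipping code. It assumes a hypothetical pruning rule for `col IS NULL` that skips a file only when its recorded null count is exactly 0. `NullCountPruningSketch`, `ColumnStats` and `canSkipForIsNull` are illustrative names.

```java
final class NullCountPruningSketch {
    // Hypothetical per-file column statistics; -1 is assumed to mean "null count unknown".
    static final class ColumnStats {
        final long numNulls;

        ColumnStats(long numNulls) {
            this.numNulls = numNulls;
        }
    }

    // The normalization from the diff: an unknown count is stored as 0.
    static long normalize(long numNulls) {
        return numNulls == -1 ? 0 : numNulls;
    }

    // Hypothetical rule for `col IS NULL`: skip the file only if it is recorded as holding no nulls.
    static boolean canSkipForIsNull(ColumnStats stats) {
        return stats.numNulls == 0;
    }

    public static void main(String[] args) {
        // With the raw sentinel the file is kept; after normalization it looks skippable.
        System.out.println(canSkipForIsNull(new ColumnStats(-1)));
        System.out.println(canSkipForIsNull(new ColumnStats(normalize(-1))));
    }
}
```

Under that assumed rule, storing an unknown count as 0 would make the file look skippable for `IS NULL` filters, which would be a correctness concern as well as a performance one. Whether any such rule applies here is not established in this thread.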







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972637957









[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975436537


   ## CI report:
   
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972482783


   cc @alexeykudinkin, could you review this as well?





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974766598









[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753757069



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal/Date types, minValue/maxValue cannot represent the original value.
+  // E.g., when Parquet collects column statistics, a decimal column is collected as an int/binary type,
+  // so we cannot use minValue and maxValue directly; use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       Yes, we already throw an exception at L264 in ZCurveOptimizeHelper.getMinMaxValue.
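
For readers following the HoodieColumnRangeMetadata hunk above, here is a minimal sketch of why string forms of min/max are useful for decimal columns. It assumes the raw statistic is the unscaled integer of a decimal and that the scale is known from the schema. The class `ColumnRangeSketch` and its methods are hypothetical names for illustration only, not Hudi's or Parquet's API; only the `-1` to `0` normalization of `numNulls` is taken from the diff.

```java
import java.math.BigDecimal;

// Hypothetical illustration; not the actual Hudi class.
final class ColumnRangeSketch {
  final String columnName;
  final long numNulls;
  final String minValueAsString;
  final String maxValueAsString;

  ColumnRangeSketch(String columnName, long numNulls,
                    String minValueAsString, String maxValueAsString) {
    this.columnName = columnName;
    // Mirrors the diff: a null count of -1 is stored as 0.
    this.numNulls = numNulls == -1 ? 0 : numNulls;
    this.minValueAsString = minValueAsString;
    this.maxValueAsString = maxValueAsString;
  }

  // The raw statistic of a decimal column is its unscaled integer,
  // so the scale must be applied to recover the logical value.
  static String decimalToString(long unscaled, int scale) {
    return BigDecimal.valueOf(unscaled, scale).toPlainString();
  }

  // Builds a range whose string bounds reflect the logical decimal values.
  static ColumnRangeSketch forDecimal(String columnName, long rawMin, long rawMax,
                                      int scale, long numNulls) {
    return new ColumnRangeSketch(columnName, numNulls,
        decimalToString(rawMin, scale), decimalToString(rawMax, scale));
  }

  public static void main(String[] args) {
    ColumnRangeSketch r = forDecimal("price", -500L, 12345L, 2, -1L);
    System.out.println(r.columnName + ": [" + r.minValueAsString + ", "
        + r.maxValueAsString + "], nulls=" + r.numNulls);
  }
}
```

With a scale of 2, the raw value 12345 stands for 123.45, which is why comparing or printing the raw integer directly would be misleading.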

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestOptimizeTable.scala
##########
@@ -88,19 +88,28 @@ class TestOptimizeTable extends HoodieClientTestBase {
       .save(basePath)
 
     assertEquals(1000, spark.read.format("hudi").load(basePath).count())
+    // use unsorted col as filter.
     assertEquals(1000,
-      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true").format("hudi").load(basePath).count())
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("end_lat >= 0").count())
+    // use sorted col as filter.
+    assertEquals(1000,
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("begin_lon >= 0").count())
   }
 
   @Test
   def testCollectMinMaxStatistics(): Unit = {
     val testPath = new Path(System.getProperty("java.io.tmpdir"), "minMax")
     val statisticPath = new Path(System.getProperty("java.io.tmpdir"), "stat")
     val fs = testPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
+    val complexDataFrame = createComplexDataFrame(spark)
+    complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
+    val df = spark.read.load(testPath.toString)
     try {
-      val complexDataFrame = createComplexDataFrame(spark)
-      complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
-      val df = spark.read.load(testPath.toString)
+      // test z-order sort for all primitive types; should not throw an error.

Review comment:
       fixed.
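
As background for the TestOptimizeTable hunk above: the test asserts that enabling data skipping returns the same 1000 rows whether the filter is on a sorted or an unsorted column. The sketch below shows the general min/max pruning idea behind that expectation, assuming per-file ranges for a single numeric column. `DataSkippingSketch` and `FileRange` are hypothetical names, not Hudi's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of min/max based file pruning.
final class DataSkippingSketch {

  // Per-file statistics for one column.
  static final class FileRange {
    final String file;
    final double min;
    final double max;

    FileRange(String file, double min, double max) {
      this.file = file;
      this.min = min;
      this.max = max;
    }
  }

  // For a predicate "col >= lowerBound", a file can be skipped only when
  // its max is below the bound; every other file must still be read.
  static List<String> candidateFiles(List<FileRange> ranges, double lowerBound) {
    List<String> kept = new ArrayList<>();
    for (FileRange r : ranges) {
      if (r.max >= lowerBound) {
        kept.add(r.file);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<FileRange> ranges = new ArrayList<>();
    ranges.add(new FileRange("f1", 0.0, 0.4));
    ranges.add(new FileRange("f2", 0.5, 0.9));
    System.out.println(candidateFiles(ranges, 0.0));
    System.out.println(candidateFiles(ranges, 0.45));
  }
}
```

Pruning of this kind may only drop files that cannot match, so a filter that every row satisfies (such as `>= 0` in the test) should keep every file and leave the count unchanged.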







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975518713


   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   * 6da299122f9661fa711afc3e4f47b1be26a08027 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971298376


   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974745454


   @xiarixiaoyao thanks for addressing the issues!
   
   After our testing, we also tried to squash some bugs in https://github.com/apache/hudi/pull/4026 and https://github.com/apache/hudi/pull/4060. Unfortunately, I hadn't seen this PR before and hence re-addressed some of the same issues that you're touching here.
   
   I would very much appreciate your feedback on the aforementioned PRs.





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975434605


   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975521123


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   * 6da299122f9661fa711afc3e4f47b1be26a08027 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975540983


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976114333


   @vinothchandar 
   Yes, there is a lot of duplicated work at present.
   
   I'm very sorry about this. This series of problems was caused by inconsistencies between my local code and the code submitted to the community.





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976121409


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3614) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972589905


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   * b9383b77280419d54fa09206c768ca17a3683fb4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971332508


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971303295


   @leesf @xushiyan could you help review this PR? Thanks.





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972575950


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972591270


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   * b9383b77280419d54fa09206c768ca17a3683fb4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-971332508


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975630508


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975160541


   @leesf
   We can build the indexes for data skipping manually:
   Step 1: use ZCurveOptimizeHelper.getMinMaxValue to get the min-max statistics for the current table.
   Step 2: use ZCurveOptimizeHelper.saveStatisticsInfo to save those statistics.
   See line 105 and line 111 in TestOptimizeTable.
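
   To illustrate the idea behind the two steps, here is a minimal, Spark-free sketch. It is not the Hudi implementation: `MinMaxSkippingSketch`, `ColumnStats`, `collectMinMax` and `filesToScan` are hypothetical names made up for this example, the "table" is an in-memory map from file name to rows, and the statistics are kept in memory instead of being saved to an index path.

```scala
// Hypothetical sketch of min-max statistics and data skipping.
// None of these names exist in Hudi; they only illustrate the technique.
object MinMaxSkippingSketch {
  // One statistics row: the value range of one column inside one file.
  final case class ColumnStats(file: String, column: String, min: Int, max: Int)

  // Step 1 (analogue): collect min/max per (file, column).
  // `files` maps a file name to its rows; each row maps column name -> value.
  def collectMinMax(files: Map[String, Seq[Map[String, Int]]],
                    columns: Seq[String]): Seq[ColumnStats] =
    for {
      (file, rows) <- files.toSeq.sortBy(_._1)
      column <- columns
      values = rows.flatMap(_.get(column))
      if values.nonEmpty
    } yield ColumnStats(file, column, values.min, values.max)

  // Step 2 would persist the statistics; this sketch just passes them along.
  // For a predicate `column >= lowerBound`, a file is skipped only when its
  // recorded max is below the bound. Files without statistics are always kept.
  def filesToScan(allFiles: Seq[String],
                  stats: Seq[ColumnStats],
                  column: String,
                  lowerBound: Int): Seq[String] =
    allFiles.filterNot { f =>
      stats.exists(s => s.file == f && s.column == column && s.max < lowerBound)
    }
}
```

   With statistics collected on `begin_lon` only, a filter on `begin_lon` can prune files, while a filter on a column without statistics (such as `end_lat` here) falls back to scanning every file. Keeping files that have no statistics is the conservative choice, because skipping must never change query results.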
   





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975479551


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974768072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753737137



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestOptimizeTable.scala
##########
@@ -88,19 +88,28 @@ class TestOptimizeTable extends HoodieClientTestBase {
       .save(basePath)
 
     assertEquals(1000, spark.read.format("hudi").load(basePath).count())
+    // use unsorted col as filter.
     assertEquals(1000,
-      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true").format("hudi").load(basePath).count())
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("end_lat >= 0").count())
+    // use sorted col as filter.
+    assertEquals(1000,
+      spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "true")
+        .format("hudi").load(basePath).where("begin_lon >= 0").count())
   }
 
   @Test
   def testCollectMinMaxStatistics(): Unit = {
     val testPath = new Path(System.getProperty("java.io.tmpdir"), "minMax")
     val statisticPath = new Path(System.getProperty("java.io.tmpdir"), "stat")
     val fs = testPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
+    val complexDataFrame = createComplexDataFrame(spark)
+    complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
+    val df = spark.read.load(testPath.toString)
     try {
-      val complexDataFrame = createComplexDataFrame(spark)
-      complexDataFrame.repartition(3).write.mode("overwrite").save(testPath.toString)
-      val df = spark.read.load(testPath.toString)
+      // test z-order sort for all primitive type, should not throw error.

Review comment:
       Nit: say "throw exception" rather than "throw error".







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972637957


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975384373


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975485911


   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975540983


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975436537


   ## CI report:
   
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975470208


   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975382023


   @leesf Addressed all comments and added a UT for multi-threaded parquet footer reads.





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975485911


   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976120008


   @hudi-bot run azure





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754578303



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,28 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;

Review comment:
       Please take a look at my comment below. I don't think this is necessary: we can handle all type conversions at the time the footer is read, so this doesn't need to be propagated any further.
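
       To illustrate what converting at footer-read time could look like, here is a minimal sketch. It assumes only what the Parquet format specifies: DATE is stored as an int32 count of days since the Unix epoch, and DECIMAL is stored as an unscaled integer (int32/int64, or big-endian two's-complement bytes) with the scale taken from the schema. The class and method names below are hypothetical and are not part of Hudi or parquet-mr.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.time.LocalDate;

// Hypothetical helper (not a Hudi or parquet-mr class): turns raw footer
// statistics into their logical values as soon as they are read.
final class FooterStatConversion {

    // Parquet DATE: int32 number of days since 1970-01-01.
    static LocalDate dateFromEpochDays(int days) {
        return LocalDate.ofEpochDay(days);
    }

    // Parquet DECIMAL backed by int32/int64: unscaled value plus schema scale.
    static BigDecimal decimalFromUnscaled(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // Parquet DECIMAL backed by binary/fixed_len_byte_array:
    // big-endian two's-complement unscaled value plus schema scale.
    static BigDecimal decimalFromBytes(byte[] unscaledBytes, int scale) {
        return new BigDecimal(new BigInteger(unscaledBytes), scale);
    }

    public static void main(String[] args) {
        System.out.println(dateFromEpochDays(18948));
        System.out.println(decimalFromUnscaled(12345L, 2));
        System.out.println(decimalFromBytes(new byte[] {0x30, 0x39}, 2));
    }
}
```

       With something along these lines, min/max could be stored as already-converted `Comparable` values, and the extra string fields might not be needed.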

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
##########
@@ -283,17 +286,38 @@ public Boolean apply(String recordKey) {
 
   /**
    * Parse min/max statistics stored in parquet footers for all columns.
+   * ParquetRead.readFooter is not a thread safe method.
+   *
+   * @param conf hadoop conf.
+   * @param parquetFilePath file to be read.
+   * @param cols cols which need to collect statistics.
+   * @return a HoodieColumnRangeMetadata instance.
    */
-  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(Configuration conf, Path parquetFilePath, List<String> cols) {
+  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(
+      Configuration conf,
+      Path parquetFilePath,
+      List<String> cols) {
     ParquetMetadata metadata = readMetadata(conf, parquetFilePath);
     // collect stats from all parquet blocks
     Map<String, List<HoodieColumnRangeMetadata<Comparable>>> columnToStatsListMap = metadata.getBlocks().stream().flatMap(blockMetaData -> {
-      return blockMetaData.getColumns().stream().filter(f -> cols.contains(f.getPath().toDotString())).map(columnChunkMetaData ->
-          new HoodieColumnRangeMetadata<>(parquetFilePath.getName(), columnChunkMetaData.getPath().toDotString(),
-              columnChunkMetaData.getStatistics().genericGetMin(),
-              columnChunkMetaData.getStatistics().genericGetMax(),
-              columnChunkMetaData.getStatistics().getNumNulls(),
-              columnChunkMetaData.getPrimitiveType().stringifier()));
+      return blockMetaData.getColumns().stream().filter(f -> cols.contains(f.getPath().toDotString())).map(columnChunkMetaData -> {
+        String minAsString;
+        String maxAsString;
+        if (columnChunkMetaData.getPrimitiveType().getOriginalType() == OriginalType.DATE) {
+          synchronized (lock) {

Review comment:
       This has been addressed by #4060







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976100635


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972591270


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   * b9383b77280419d54fa09206c768ca17a3683fb4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972577078


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r751261009



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {

Review comment:
       This constructor signature is too long; please split it across two lines.
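
The code comment in the hunk above says that decimal statistics are collected as int/binary values, so the raw min/max cannot be read as the original value. A minimal sketch of that mismatch, assuming the statistic holds the decimal's unscaled integer and that the scale is known from the schema; `DecimalStatSketch` and `render` are hypothetical names for illustration, not part of Hudi or Parquet:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DecimalStatSketch {
    // Hypothetical helper: rebuild the readable decimal from the unscaled
    // integer that a column statistic would carry, given the column's scale.
    static String render(long unscaled, int scale) {
        return new BigDecimal(BigInteger.valueOf(unscaled), scale).toPlainString();
    }

    public static void main(String[] args) {
        long rawMin = 12345L; // assumed raw statistic for a DECIMAL(10, 2) column
        // The raw value alone loses the scale; the rendered string keeps it.
        System.out.println("raw min:      " + rawMin);
        System.out.println("min as string: " + render(rawMin, 2));
    }
}
```

Keeping a pre-rendered string next to the raw value, as `minValueAsString`/`maxValueAsString` do in the diff, avoids repeating this conversion at every call site.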







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972589905


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   * b9383b77280419d54fa09206c768ca17a3683fb4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972606180


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972606180


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972637957


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974768072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753737063



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       If we cannot support indexing timestamp columns yet, we should throw an exception explicitly.
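
A minimal sketch of the explicit-failure approach suggested here, assuming the column type is available as a plain type name; `UnsupportedTypeCheckSketch`, `validateIndexable`, and the `"TIMESTAMP"` name are hypothetical and only illustrate the idea, not the actual Hudi code path:

```java
import java.util.Collections;
import java.util.Set;

public class UnsupportedTypeCheckSketch {
    // Hypothetical list of type names that statistics collection cannot index yet.
    private static final Set<String> UNSUPPORTED = Collections.singleton("TIMESTAMP");

    // Fail fast with a descriptive message instead of silently substituting a default.
    static void validateIndexable(String columnName, String typeName) {
        if (UNSUPPORTED.contains(typeName)) {
            throw new UnsupportedOperationException(
                "Column '" + columnName + "' has type " + typeName
                    + ", which is not supported for indexing yet");
        }
    }

    public static void main(String[] args) {
        validateIndexable("id", "INT"); // passes silently
        try {
            validateIndexable("ts", "TIMESTAMP");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing at validation time surfaces the limitation to the caller immediately, rather than letting a placeholder value flow into the collected statistics.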







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976169728


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608",
       "triggerID" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3614",
       "triggerID" : "976120008",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3614) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975384373


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975446166


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975521123


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 2eee617252ffbf31c6b91923338c861fee9b40d2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586) 
   * 6da299122f9661fa711afc3e4f47b1be26a08027 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin edited a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
alexeykudinkin edited a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974745454


   @xiarixiaoyao thanks for addressing the issues! 
   
   After our testing we've also tried to squash some bugs in https://github.com/apache/hudi/pull/4026 and https://github.com/apache/hudi/pull/4060. Unfortunately, I didn't see this PR earlier and so re-addressed some of the same (as well as some other) issues that you're touching here.
   
   I would very much appreciate your feedback on the aforementioned PRs.





[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754048363



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -142,6 +142,16 @@
       .sinceVersion("0.9.0")
       .withDocumentation("When rewriting data, preserves existing hoodie_commit_time");
 
+  /**
+   * Using space-filling curves to optimize the layout of table to boost query performance.
+   * The table data which sorted by space-filling curve has better aggregation; combine with min-max filtering, it can achieve good performance improvement.

Review comment:
       ok
   

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
##########
@@ -283,17 +283,37 @@ public Boolean apply(String recordKey) {
 
   /**
    * Parse min/max statistics stored in parquet footers for all columns.
+   * ParquetRead.readFooter is not a thread safe method.
+   *
+   * @param conf hadoop conf.
+   * @param parquetFilePath file to be read.
+   * @param cols cols which need to collect statistics.
+   * @param useLock if use lock when read parquet footer.
+   * @return a HoodieColumnRangeMetadata instance.
    */
-  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(Configuration conf, Path parquetFilePath, List<String> cols) {
-    ParquetMetadata metadata = readMetadata(conf, parquetFilePath);
+  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(
+      Configuration conf,
+      Path parquetFilePath,
+      List<String> cols,
+      boolean useLock) {

Review comment:
       We get the statistics of the specified columns by reading the Parquet footer.
   
   However, if the specified column is of date type, Parquet uses SimpleDateFormat (which is not thread-safe) to format the data when returning the min/max values. Under multithreading, this can lead to wrong min/max values being returned.
   
   Let me add a test case.
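
The thread-safety concern described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `SafeDateStringifier` and `formatDaysSinceEpoch` are hypothetical names, and the sketch assumes the date statistic is available as a count of days since the Unix epoch. It guards a shared `SimpleDateFormat` with a lock, which is roughly what a `useLock` flag would imply.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper for illustration; not part of Hudi or Parquet.
class SafeDateStringifier {
  // SimpleDateFormat keeps mutable internal state, so a shared instance
  // must not be used by several threads at the same time.
  private static final SimpleDateFormat FORMAT = newUtcFormat();
  private static final ReentrantLock LOCK = new ReentrantLock();

  private static SimpleDateFormat newUtcFormat() {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
    format.setTimeZone(TimeZone.getTimeZone("UTC"));
    return format;
  }

  // Assumption: the date value is given as days since 1970-01-01.
  static String formatDaysSinceEpoch(int days) {
    LOCK.lock();
    try {
      // Only one thread at a time may touch the shared formatter.
      return FORMAT.format(new Date(days * 86_400_000L));
    } finally {
      LOCK.unlock();
    }
  }
}
```

An alternative that avoids the lock entirely would be an immutable formatter such as `java.time.format.DateTimeFormatter`, at the cost of changing the formatting code itself.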







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976116824


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3585",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eee617252ffbf31c6b91923338c861fee9b40d2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3586",
       "triggerID" : "975477611",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590",
       "triggerID" : "6da299122f9661fa711afc3e4f47b1be26a08027",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608",
       "triggerID" : "a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3608) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975109801


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975446166


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578",
       "triggerID" : "1f9629007d7d91e44e9291b0cecf3722401effc9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582",
       "triggerID" : "5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8f0177d9fb3debe94fad21dfa6ac8cd78a863128",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3582) 
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754768668



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -142,6 +142,17 @@
       .sinceVersion("0.9.0")
       .withDocumentation("When rewriting data, preserves existing hoodie_commit_time");
 
+  /**
+   * Using space-filling curves to optimize the layout of table to boost query performance.
+   * The table data which sorted by space-filling curve has better aggregation;
+   * combine with min-max filtering, it can achieve good performance improvement.
+   *
+   * Notice:

Review comment:
       Wondering how much of this should go into the config docs themselves?
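
For readers unfamiliar with the space-filling curves mentioned in the Javadoc above, a Z-order value can be sketched as a bit interleaving of two column values. This is a minimal sketch, not Hudi's implementation: `ZOrderSketch` and `interleave` are hypothetical names, and it assumes two non-negative `int` columns.

```java
// Hypothetical illustration of Z-order (Morton) encoding; not Hudi code.
class ZOrderSketch {
  // Interleaves the bits of x and y: bit i of x lands at bit 2*i of the
  // result and bit i of y lands at bit 2*i + 1.
  // Assumption: both inputs are non-negative.
  static long interleave(int x, int y) {
    long z = 0L;
    for (int i = 0; i < 32; i++) {
      z |= ((long) ((x >>> i) & 1)) << (2 * i);
      z |= ((long) ((y >>> i) & 1)) << (2 * i + 1);
    }
    return z;
  }
}
```

Sorting rows by such a value keeps rows that are close in both columns near each other on disk, which tends to make per-file min/max ranges narrower on both columns and is why the Javadoc pairs the curve with min-max filtering.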

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,28 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;

Review comment:
       sounds good to me as well. 
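
The code comment in the diff above says that decimal statistics are collected as int/binary values, so the raw min/max cannot be used directly. A minimal sketch of why a string form helps, assuming the raw statistic is the unscaled value and the scale is known from the schema (`DecimalStatSketch` and its methods are hypothetical names, not Hudi or Parquet APIs):

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration; not part of Hudi or Parquet.
class DecimalStatSketch {
  // Assumption: the statistic is stored as an unscaled integer, e.g. 12345
  // with scale 2 stands for the decimal value 123.45.
  static String fromUnscaledLong(long unscaled, int scale) {
    return BigDecimal.valueOf(unscaled, scale).toPlainString();
  }

  // Assumption: the binary form is a big-endian two's-complement
  // encoding of the unscaled value.
  static String fromUnscaledBytes(byte[] unscaledBytes, int scale) {
    return new BigDecimal(new BigInteger(unscaledBytes), scale).toPlainString();
  }
}
```

Without the scale applied, a raw value such as `12345` would be compared and displayed as if it were the column value, which is the mismatch that `minValueAsString`/`maxValueAsString` are meant to avoid.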







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972575950


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974767804


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b9383b77280419d54fa09206c768ca17a3683fb4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463) 
   * e95361f7e109251511059817b7bc12591cd1671a UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974766598


   @alexeykudinkin 
   Thank you very much for your testing, bug fixing, and code optimization. Because of the existence of RFC-27, data skipping was not given much consideration in the initial design. Thank you for your work to make this feature better.





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975060958


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3463",
       "triggerID" : "b9383b77280419d54fa09206c768ca17a3683fb4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e95361f7e109251511059817b7bc12591cd1671a",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551",
       "triggerID" : "e95361f7e109251511059817b7bc12591cd1671a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560",
       "triggerID" : "7dcf6c5957115a3fb4bf84decd21d6b2792be857",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e95361f7e109251511059817b7bc12591cd1671a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3551) 
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar merged pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
vinothchandar merged pull request #4013:
URL: https://github.com/apache/hudi/pull/4013


   





[GitHub] [hudi] alexeykudinkin commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-974745454


   @xiarixiaoyao thanks for addressing the issues! 
   
   After our testing we've also tried to squash some bugs in https://github.com/apache/hudi/pull/4026 and https://github.com/apache/hudi/pull/4060. Unfortunately, I didn't see this PR earlier and so re-addressed some of the same issues that you're touching here.
   
   I would very much appreciate your feedback on the aforementioned PRs.





[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972577078


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423",
       "triggerID" : "9a11df297537d5dc7e68deb032f9c8ad70b0e049",
       "triggerType" : "PUSH"
     }, {
       "hash" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460",
       "triggerID" : "23ef0dc3cf648cef640263c06432fd7fc4708327",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9a11df297537d5dc7e68deb032f9c8ad70b0e049 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3423) 
   * 23ef0dc3cf648cef640263c06432fd7fc4708327 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3460) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r752798967



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       If this value is -1, it must be set to 0; otherwise it will result in incorrect query results. In fact, when a timestamp type is encountered in subsequent logic, an exception is thrown directly to tell the user that indexing for timestamp is not supported.
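
A minimal sketch of the normalization described in this comment, assuming that a null count of -1 means "not recorded in the file statistics". The class, constant, and method names are hypothetical and are not taken from the PR:

```java
final class NullCountSketch {
  // Hypothetical sentinel: assumed to mean "null count not recorded".
  static final long UNSET_NULL_COUNT = -1L;

  // Map the sentinel to 0 so later logic never sees a negative count;
  // any real count is passed through unchanged.
  static long normalizeNumNulls(long numNulls) {
    return numNulls == UNSET_NULL_COUNT ? 0L : numNulls;
  }

  public static void main(String[] args) {
    System.out.println(normalizeNumNulls(-1L)); // sentinel case
    System.out.println(normalizeNumNulls(7L));  // ordinary case
  }
}
```

Keeping the mapping in one small helper, rather than inline in the constructor, would make the sentinel handling easy to test on its own; whether that is worth it here is a judgment call.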







[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-972589911


   @leesf  @vinothchandar @alexeykudinkin  
   I have addressed all comments, updated the code, and added more test cases. Could you help review this PR again? Thanks.
   





[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r753757069



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       Yes, we already throw an exception at L264 in ZCurveOptimizeHelper.getMinMaxValue.







[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754752336



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,28 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;

Review comment:
       Yes, I agree with you. But let's put this into #4060 as an optimization.
   We can merge this PR first and then merge #4060.







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975109801


   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754042366



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
##########
@@ -283,17 +283,37 @@ public Boolean apply(String recordKey) {
 
   /**
    * Parse min/max statistics stored in parquet footers for all columns.
+   * ParquetRead.readFooter is not a thread safe method.
+   *
+   * @param conf hadoop conf.
+   * @param parquetFilePath file to be read.
+   * @param cols cols which need to collect statistics.
+   * @param useLock if use lock when read parquet footer.
+   * @return a HoodieColumnRangeMetadata instance.
    */
-  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(Configuration conf, Path parquetFilePath, List<String> cols) {
-    ParquetMetadata metadata = readMetadata(conf, parquetFilePath);
+  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(
+      Configuration conf,
+      Path parquetFilePath,
+      List<String> cols,
+      boolean useLock) {

Review comment:
       I am curious about this change. Could you please point me to the docs stating that reading the footer is not thread-safe and that this affects the result?







[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975434605


   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3578) 
   * 5bdd5b4bb67ccea7f920e8b88e3683acadc59ce3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754274986



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
##########
@@ -283,17 +283,37 @@ public Boolean apply(String recordKey) {
 
   /**
    * Parse min/max statistics stored in parquet footers for all columns.
+   * ParquetRead.readFooter is not a thread safe method.
+   *
+   * @param conf hadoop conf.
+   * @param parquetFilePath file to be read.
+   * @param cols cols which need to collect statistics.
+   * @param useLock if use lock when read parquet footer.
+   * @return a HoodieColumnRangeMetadata instance.
    */
-  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(Configuration conf, Path parquetFilePath, List<String> cols) {
-    ParquetMetadata metadata = readMetadata(conf, parquetFilePath);
+  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(
+      Configuration conf,
+      Path parquetFilePath,
+      List<String> cols,
+      boolean useLock) {

Review comment:
       Reduced the scope of locking by using a private static lock. We now lock only when we encounter DateType.
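
A rough sketch of that narrowing, under the assumption that only the date conversion touches shared, non-thread-safe state. Here `SimpleDateFormat` (documented as not synchronized) stands in for whatever the date path actually uses; `RangeStringifier`, `DATE_LOCK`, and `stringify` are hypothetical names, not the PR's code:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class RangeStringifier {
  // Private static lock: shared by every caller, not reachable from outside.
  private static final Object DATE_LOCK = new Object();
  // SimpleDateFormat is not thread-safe, so each use is guarded by DATE_LOCK.
  private static final SimpleDateFormat DATE_FORMAT = utcFormat();

  private static SimpleDateFormat utcFormat() {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
    format.setTimeZone(TimeZone.getTimeZone("UTC"));
    return format;
  }

  // isDate is a stand-in for inspecting the column's logical type;
  // date values are assumed to be stored as days since 1970-01-01.
  static String stringify(Object value, boolean isDate) {
    if (!isDate) {
      return String.valueOf(value); // lock-free path for every other type
    }
    long epochDay = ((Number) value).longValue();
    synchronized (DATE_LOCK) {      // only the date path pays for the lock
      return DATE_FORMAT.format(new Date(epochDay * 86_400_000L));
    }
  }

  public static void main(String[] args) {
    System.out.println(stringify(42, false));
    System.out.println(stringify(0, true));
  }
}
```

Compared with a caller-supplied `useLock` flag on the whole read, this keeps the synchronization decision next to the code that needs it, so callers cannot forget to ask for it.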







[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754308992



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
##########
@@ -241,7 +241,7 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
         HoodieTimeline inflightHoodieTimeline = table.getActiveTimeline().filterPendingReplaceTimeline().filterInflights();
         if (!inflightHoodieTimeline.empty()) {
           HoodieInstant inflightClusteringInstant = inflightHoodieTimeline.lastInstant().get();
-          Date clusteringStartTime = HoodieActiveTimeline.COMMIT_FORMATTER.parse(inflightClusteringInstant.getTimestamp());
+          Date clusteringStartTime = HoodieActiveTimeline.parseInstantTime(inflightClusteringInstant.getTimestamp());

Review comment:
       Please revert this change and rebase against the master branch.







[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-975382004


   ## CI report:
   
   * 7dcf6c5957115a3fb4bf84decd21d6b2792be857 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3560) 
   * 1f9629007d7d91e44e9291b0cecf3722401effc9 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976097406


   ## CI report:
   
   * 8f0177d9fb3debe94fad21dfa6ac8cd78a863128 UNKNOWN
   * 6da299122f9661fa711afc3e4f47b1be26a08027 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3590) 
   * a8f0efb20b3cae9b1780ae712aabc6f903c2aeb6 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#issuecomment-976109726


   @alexeykudinkin @xiarixiaoyao This is a bit of a miss on my side. I should have stepped in to coordinate better. I will review all 3 PRs and make forward progress.
   
   In the future, can we use our umbrella task and split work off from it, so we avoid these overlaps?

