Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/11/19 01:32:56 UTC

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

xiarixiaoyao commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r752798967



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java
##########
@@ -30,16 +28,21 @@
   private final String columnName;
   private final T minValue;
   private final T maxValue;
-  private final long numNulls;
-  private final PrimitiveStringifier stringifier;
+  private long numNulls;
+  // For Decimal Type/Date Type, minValue/maxValue cannot represent it's original value.
+  // eg: when parquet collects column information, the decimal type is collected as int/binary type.
+  // so we cannot use minValue and maxValue directly, use minValueAsString/maxValueAsString instead.
+  private final String minValueAsString;
+  private final String maxValueAsString;
 
-  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, final long numNulls, final PrimitiveStringifier stringifier) {
+  public HoodieColumnRangeMetadata(final String filePath, final String columnName, final T minValue, final T maxValue, long numNulls, final String minValueAsString, final String maxValueAsString) {
     this.filePath = filePath;
     this.columnName = columnName;
     this.minValue = minValue;
     this.maxValue = maxValue;
-    this.numNulls = numNulls;
-    this.stringifier = stringifier;
+    this.numNulls = numNulls == -1 ? 0 : numNulls;

Review comment:
       If this value is -1, it must be set to 0; otherwise it will result in incorrect query results. In fact, when a timestamp type is encountered in subsequent logic, an exception is thrown directly to tell the user that indexing for timestamp is not supported.
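
A minimal sketch of the normalization discussed in this comment, assuming -1 is a sentinel meaning the null count was not recorded. The class name `NullCountSketch`, the constant `UNKNOWN_NULL_COUNT`, and the method `normalizeNumNulls` are hypothetical names used only for illustration; they are not part of Hudi.

```java
public final class NullCountSketch {
  // Assumption: -1 is the sentinel for "null count not recorded".
  public static final long UNKNOWN_NULL_COUNT = -1L;

  private NullCountSketch() {}

  /** Maps the sentinel to 0 and leaves every other count untouched. */
  public static long normalizeNumNulls(long numNulls) {
    return numNulls == UNKNOWN_NULL_COUNT ? 0L : numNulls;
  }

  public static void main(String[] args) {
    // The sentinel is replaced by 0.
    System.out.println(normalizeNumNulls(-1L));
    // Real counts pass through unchanged.
    System.out.println(normalizeNumNulls(5L));
  }
}
```

The check mirrors the `numNulls == -1 ? 0 : numNulls` expression in the constructor above, so only the sentinel is rewritten and every other count is stored as given.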



