You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/10/17 00:51:55 UTC

[GitHub] [pinot] richardstartin commented on a change in pull request #7581: add `BOOLEAN_ARRAY` and `TIMESTAMP_ARRAY` types

richardstartin commented on a change in pull request #7581:
URL: https://github.com/apache/pinot/pull/7581#discussion_r729801948



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java
##########
@@ -250,8 +251,17 @@ public int hashCode() {
     LONG_ARRAY,
     FLOAT_ARRAY,
     DOUBLE_ARRAY,
-    STRING_ARRAY;
-
+    STRING_ARRAY,
+    BOOLEAN_ARRAY /* Stored as INT_ARRAY */,
+    TIMESTAMP_ARRAY /* Stored as LONG_ARRAY */;
+
+    private static final EnumSet<ColumnDataType> NUMERIC_TYPES = EnumSet.of(INT, LONG, FLOAT, DOUBLE);
+    private static final EnumSet<ColumnDataType> INTEGRAL_TYPES = EnumSet.of(INT, LONG);
+    private static final EnumSet<ColumnDataType> ARRAY_TYPES = EnumSet.of(INT_ARRAY, LONG_ARRAY, FLOAT_ARRAY,
+        DOUBLE_ARRAY, STRING_ARRAY, BOOLEAN_ARRAY, TIMESTAMP_ARRAY);
+    private static final EnumSet<ColumnDataType> NUMERIC_ARRAY_TYPES = EnumSet.of(INT_ARRAY, LONG_ARRAY, FLOAT_ARRAY,
+        DOUBLE_ARRAY);
+    private static final EnumSet<ColumnDataType> INTEGRAL_ARRAY_TYPES = EnumSet.of(INT_ARRAY, LONG_ARRAY);

Review comment:
       I took the opportunity to clean this up a little bit since I needed to add to what didn't look like very scalable code readability wise. I have no reason to believe this code is performance-critical but lookups into a constant `EnumSet` will perform well.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java
##########
@@ -414,74 +391,98 @@ public Serializable format(Object value) {
      */
     public Serializable convertAndFormat(Object value) {
       switch (this) {
+        case TIMESTAMP:

Review comment:
       I removed a lot of duplication here so it's easier to follow which types need special formatting and which don't.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java
##########
@@ -414,74 +391,98 @@ public Serializable format(Object value) {
      */
     public Serializable convertAndFormat(Object value) {
       switch (this) {
+        case TIMESTAMP:

Review comment:
       I removed a lot of duplication here so it's easier to follow which types need special formatting and which don't.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java
##########
@@ -434,54 +411,104 @@ public Serializable convertAndFormat(Object value) {
         case INT_ARRAY:
           return (int[]) value;
         case LONG_ARRAY:
-          if (value instanceof long[]) {
-            return (long[]) value;
-          } else {
-            int[] intValues = (int[]) value;
-            int length = intValues.length;
-            long[] longValues = new long[length];
-            for (int i = 0; i < length; i++) {
-              longValues[i] = intValues[i];
-            }
-            return longValues;
-          }
+          return toLongArray(value);
         case FLOAT_ARRAY:
           return (float[]) value;
         case DOUBLE_ARRAY:
-          if (value instanceof double[]) {
-            return (double[]) value;
-          } else if (value instanceof int[]) {
-            int[] intValues = (int[]) value;
-            int length = intValues.length;
-            double[] doubleValues = new double[length];
-            for (int i = 0; i < length; i++) {
-              doubleValues[i] = intValues[i];
-            }
-            return doubleValues;
-          } else if (value instanceof long[]) {
-            long[] longValues = (long[]) value;
-            int length = longValues.length;
-            double[] doubleValues = new double[length];
-            for (int i = 0; i < length; i++) {
-              doubleValues[i] = longValues[i];
-            }
-            return doubleValues;
-          } else {
-            float[] floatValues = (float[]) value;
-            int length = floatValues.length;
-            double[] doubleValues = new double[length];
-            for (int i = 0; i < length; i++) {
-              doubleValues[i] = floatValues[i];
-            }
-            return doubleValues;
-          }
+          return toDoubleArray(value);
         case STRING_ARRAY:
           return (String[]) value;
+        case BOOLEAN_ARRAY:
+          return toBooleanArray(value);
+        case TIMESTAMP_ARRAY:
+          return formatTimestampArray(value);
         default:
           throw new IllegalStateException(String.format("Cannot convert and format: '%s' to type: %s", value, this));
       }
     }
 
+    private static double[] toDoubleArray(Object value) {
+      if (value instanceof double[]) {
+        return (double[]) value;
+      } else if (value instanceof int[]) {
+        int[] intValues = (int[]) value;
+        int length = intValues.length;
+        double[] doubleValues = new double[length];
+        for (int i = 0; i < length; i++) {
+          doubleValues[i] = intValues[i];
+        }
+        return doubleValues;
+      } else if (value instanceof long[]) {
+        long[] longValues = (long[]) value;
+        int length = longValues.length;
+        double[] doubleValues = new double[length];
+        for (int i = 0; i < length; i++) {
+          doubleValues[i] = longValues[i];
+        }
+        return doubleValues;
+      } else {
+        float[] floatValues = (float[]) value;
+        int length = floatValues.length;
+        double[] doubleValues = new double[length];
+        for (int i = 0; i < length; i++) {
+          doubleValues[i] = floatValues[i];
+        }
+        return doubleValues;
+      }
+    }
+
+    private static long[] toLongArray(Object value) {
+      if (value instanceof long[]) {
+        return (long[]) value;
+      } else {
+        int[] intValues = (int[]) value;
+        int length = intValues.length;
+        long[] longValues = new long[length];
+        for (int i = 0; i < length; i++) {
+          longValues[i] = intValues[i];
+        }
+        return longValues;
+      }
+    }
+
+    private static boolean[] toBooleanArray(Object value) {
+      if (value instanceof boolean[]) {
+        return (boolean[]) value;
+      } else {
+        int[] ints = (int[]) value;
+        boolean[] booleans = new boolean[ints.length];
+        for (int i = 0; i < ints.length; i++) {
+          booleans[i] = ints[i] == 1;

Review comment:
       no, see the SV equivalent:
   
   ```java
       case BOOLEAN:
              return (Integer) value == 1;
   ```

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java
##########
@@ -434,54 +411,104 @@ public Serializable convertAndFormat(Object value) {
         case INT_ARRAY:
           return (int[]) value;
         case LONG_ARRAY:
-          if (value instanceof long[]) {
-            return (long[]) value;
-          } else {
-            int[] intValues = (int[]) value;
-            int length = intValues.length;
-            long[] longValues = new long[length];
-            for (int i = 0; i < length; i++) {
-              longValues[i] = intValues[i];
-            }
-            return longValues;
-          }
+          return toLongArray(value);
         case FLOAT_ARRAY:
           return (float[]) value;
         case DOUBLE_ARRAY:
-          if (value instanceof double[]) {
-            return (double[]) value;
-          } else if (value instanceof int[]) {
-            int[] intValues = (int[]) value;
-            int length = intValues.length;
-            double[] doubleValues = new double[length];
-            for (int i = 0; i < length; i++) {
-              doubleValues[i] = intValues[i];
-            }
-            return doubleValues;
-          } else if (value instanceof long[]) {
-            long[] longValues = (long[]) value;
-            int length = longValues.length;
-            double[] doubleValues = new double[length];
-            for (int i = 0; i < length; i++) {
-              doubleValues[i] = longValues[i];
-            }
-            return doubleValues;
-          } else {
-            float[] floatValues = (float[]) value;
-            int length = floatValues.length;
-            double[] doubleValues = new double[length];
-            for (int i = 0; i < length; i++) {
-              doubleValues[i] = floatValues[i];
-            }
-            return doubleValues;
-          }
+          return toDoubleArray(value);
         case STRING_ARRAY:
           return (String[]) value;
+        case BOOLEAN_ARRAY:
+          return toBooleanArray(value);
+        case TIMESTAMP_ARRAY:
+          return formatTimestampArray(value);
         default:
           throw new IllegalStateException(String.format("Cannot convert and format: '%s' to type: %s", value, this));
       }
     }
 
+    private static double[] toDoubleArray(Object value) {
+      if (value instanceof double[]) {
+        return (double[]) value;
+      } else if (value instanceof int[]) {
+        int[] intValues = (int[]) value;
+        int length = intValues.length;
+        double[] doubleValues = new double[length];
+        for (int i = 0; i < length; i++) {
+          doubleValues[i] = intValues[i];
+        }
+        return doubleValues;
+      } else if (value instanceof long[]) {
+        long[] longValues = (long[]) value;
+        int length = longValues.length;
+        double[] doubleValues = new double[length];
+        for (int i = 0; i < length; i++) {
+          doubleValues[i] = longValues[i];
+        }
+        return doubleValues;
+      } else {
+        float[] floatValues = (float[]) value;
+        int length = floatValues.length;
+        double[] doubleValues = new double[length];
+        for (int i = 0; i < length; i++) {
+          doubleValues[i] = floatValues[i];
+        }
+        return doubleValues;
+      }
+    }
+
+    private static long[] toLongArray(Object value) {
+      if (value instanceof long[]) {
+        return (long[]) value;
+      } else {
+        int[] intValues = (int[]) value;
+        int length = intValues.length;
+        long[] longValues = new long[length];
+        for (int i = 0; i < length; i++) {
+          longValues[i] = intValues[i];
+        }
+        return longValues;
+      }
+    }
+
+    private static boolean[] toBooleanArray(Object value) {
+      if (value instanceof boolean[]) {
+        return (boolean[]) value;
+      } else {
+        int[] ints = (int[]) value;
+        boolean[] booleans = new boolean[ints.length];
+        for (int i = 0; i < ints.length; i++) {
+          booleans[i] = ints[i] == 1;

Review comment:
       Yes, this is for mapping to truth values from the way booleans are physically stored in pinot, and isn't a generic utility.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org