Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/11 02:45:44 UTC

[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6668: SumPrecision: support all data types and star-tree

Jackie-Jiang opened a new pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668


   ## Description
   Currently the `SumPrecision` aggregation function only supports summing serialized `BigDecimal` bytes.
   This PR enhances `SumPrecision` with:
   - Support for all data types (INT, LONG, FLOAT, DOUBLE, STRING) to perform an exact sum
   - Star-tree support (e.g. `SumPrecision__col`)
   
   Other changes:
   - Move the `BigDecimal` related functions into `BigDecimalUtils`
   - Allow creating a variable-length dictionary for BYTES columns
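
To illustrate what "exact sum" means in the description above, here is a minimal sketch in plain Java. It does not use any Pinot classes; the class and method names (`ExactSumSketch`, `exactSumOfLongs`, `exactSumOfDecimals`, `naiveDoubleSum`) are hypothetical and exist only for this example.

```java
import java.math.BigDecimal;

class ExactSumSketch {
  // Sums long values without overflow by accumulating into a BigDecimal.
  static BigDecimal exactSumOfLongs(long[] values) {
    BigDecimal sum = BigDecimal.ZERO;
    for (long value : values) {
      sum = sum.add(BigDecimal.valueOf(value));
    }
    return sum;
  }

  // Sums decimal strings exactly; each string is parsed into a BigDecimal.
  static BigDecimal exactSumOfDecimals(String[] values) {
    BigDecimal sum = BigDecimal.ZERO;
    for (String value : values) {
      sum = sum.add(new BigDecimal(value));
    }
    return sum;
  }

  // Plain double accumulation, shown only for comparison.
  static double naiveDoubleSum(double value, int times) {
    double sum = 0;
    for (int i = 0; i < times; i++) {
      sum += value;
    }
    return sum;
  }

  public static void main(String[] args) {
    // A long sum would wrap around here; the BigDecimal sum does not.
    System.out.println(exactSumOfLongs(new long[]{Long.MAX_VALUE, 1L}));
    // Ten times "0.1" adds up to exactly one.
    String[] tenths = new String[10];
    java.util.Arrays.fill(tenths, "0.1");
    System.out.println(exactSumOfDecimals(tenths));
    // The double accumulation picks up rounding error.
    System.out.println(naiveDoubleSum(0.1, 10));
  }
}
```

The trade-off is speed and memory: `BigDecimal` arithmetic allocates a new object per addition, which is presumably why it is offered as a separate aggregation function rather than replacing a plain sum.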


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668#discussion_r592617634



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/aggregator/SumPrecisionValueAggregator.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.aggregator;
+
+import java.math.BigDecimal;
+import org.apache.pinot.common.function.AggregationFunctionType;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+import org.apache.pinot.spi.utils.BigDecimalUtils;
+
+
+public class SumPrecisionValueAggregator implements ValueAggregator<Object, BigDecimal> {
+  public static final DataType AGGREGATED_VALUE_TYPE = DataType.BYTES;
+
+  private int _maxByteSize;

Review comment:
       It doesn't really matter, as we won't generate a star-tree if there are no records. Also, a member variable without an explicit initializer is initialized to 0 by default.
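
The default-initialization point can be checked with a small standalone sketch. The class below is hypothetical and only imitates the running-maximum pattern used for `_maxByteSize` in the diff; it is not the Pinot aggregator.

```java
class DefaultFieldValueSketch {
  // No explicit initializer: Java gives int instance fields the default value 0.
  private int _maxByteSize;

  // Tracks the largest size seen so far, as the aggregator in the diff does.
  int track(int byteSize) {
    _maxByteSize = Math.max(_maxByteSize, byteSize);
    return _maxByteSize;
  }

  int getMaxByteSize() {
    return _maxByteSize;
  }

  public static void main(String[] args) {
    DefaultFieldValueSketch sketch = new DefaultFieldValueSketch();
    System.out.println(sketch.getMaxByteSize()); // 0 before any value is tracked
    System.out.println(sketch.track(5));         // 5
    System.out.println(sketch.track(3));         // still 5
  }
}
```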






[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668


   




[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668#discussion_r592071217



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/aggregator/SumPrecisionValueAggregator.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.aggregator;
+
+import java.math.BigDecimal;
+import org.apache.pinot.common.function.AggregationFunctionType;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+import org.apache.pinot.spi.utils.BigDecimalUtils;
+
+
+public class SumPrecisionValueAggregator implements ValueAggregator<Object, BigDecimal> {
+  public static final DataType AGGREGATED_VALUE_TYPE = DataType.BYTES;
+
+  private int _maxByteSize;
+
+  @Override
+  public AggregationFunctionType getAggregationType() {
+    return AggregationFunctionType.SUMPRECISION;
+  }
+
+  @Override
+  public DataType getAggregatedValueType() {
+    return AGGREGATED_VALUE_TYPE;
+  }
+
+  @Override
+  public BigDecimal getInitialAggregatedValue(Object rawValue) {
+    BigDecimal initialValue = toBigDecimal(rawValue);
+    _maxByteSize = Math.max(_maxByteSize, BigDecimalUtils.byteSize(initialValue));
+    return initialValue;
+  }
+
+  @Override
+  public BigDecimal applyRawValue(BigDecimal value, Object rawValue) {
+    value = value.add(toBigDecimal(rawValue));
+    _maxByteSize = Math.max(_maxByteSize, BigDecimalUtils.byteSize(value));
+    return value;
+  }
+
+  private static BigDecimal toBigDecimal(Object rawValue) {
+    if (rawValue instanceof byte[]) {
+      return BigDecimalUtils.deserialize((byte[]) rawValue);
+    }
+    if (rawValue instanceof Integer || rawValue instanceof Long) {
+      return BigDecimal.valueOf(((Number) rawValue).longValue());
+    }
+    return new BigDecimal(rawValue.toString());

Review comment:
       Do we need to handle any edge cases and throw?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/aggregator/SumPrecisionValueAggregator.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.aggregator;
+
+import java.math.BigDecimal;
+import org.apache.pinot.common.function.AggregationFunctionType;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+import org.apache.pinot.spi.utils.BigDecimalUtils;
+
+
+public class SumPrecisionValueAggregator implements ValueAggregator<Object, BigDecimal> {
+  public static final DataType AGGREGATED_VALUE_TYPE = DataType.BYTES;
+
+  private int _maxByteSize;

Review comment:
       Should this be initialized to 0?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/SumPrecisionAggregationFunction.java
##########
@@ -77,40 +85,142 @@ public GroupByResultHolder createGroupByResultHolder(int initialCapacity, int ma
   @Override
   public void aggregate(int length, AggregationResultHolder aggregationResultHolder,
       Map<ExpressionContext, BlockValSet> blockValSetMap) {
-    byte[][] valueArray = blockValSetMap.get(_expression).getBytesValuesSV();
-    BigDecimal sumValue = getDefaultResult(aggregationResultHolder);
-    for (int i = 0; i < length; i++) {
-      BigDecimal value = DataTypeConversionFunctions.bytesToBigDecimalObject(valueArray[i]);
-      sumValue = sumValue.add(value);
+    BigDecimal sum = getDefaultResult(aggregationResultHolder);
+    BlockValSet blockValSet = blockValSetMap.get(_expression);
+    switch (blockValSet.getValueType()) {
+      case INT:
+        int[] intValues = blockValSet.getIntValuesSV();
+        for (int i = 0; i < length; i++) {
+          sum = sum.add(BigDecimal.valueOf(intValues[i]));
+        }
+        break;
+      case LONG:
+        long[] longValues = blockValSet.getLongValuesSV();
+        for (int i = 0; i < length; i++) {
+          sum = sum.add(BigDecimal.valueOf(longValues[i]));
+        }
+        break;
+      case FLOAT:
+      case DOUBLE:

Review comment:
       Are float and double treated as String?






[GitHub] [incubator-pinot] Jackie-Jiang removed a comment on pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang removed a comment on pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668#issuecomment-797145272






[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668#issuecomment-797145272


   Currently the `SumPrecision` aggregation function only supports summing serialized `BigDecimal` bytes.
   
   This PR enhances `SumPrecision` with:
   - Support for all data types (INT, LONG, FLOAT, DOUBLE, STRING) to perform an exact sum
   - Star-tree support (e.g. `SumPrecision__col`)
   
   Other changes:
   - Move the `BigDecimal` related functions into `BigDecimalUtils`
   - Allow creating a variable-length dictionary for BYTES columns




[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668#discussion_r592619923



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/SumPrecisionAggregationFunction.java
##########
@@ -77,40 +85,142 @@ public GroupByResultHolder createGroupByResultHolder(int initialCapacity, int ma
   @Override
   public void aggregate(int length, AggregationResultHolder aggregationResultHolder,
       Map<ExpressionContext, BlockValSet> blockValSetMap) {
-    byte[][] valueArray = blockValSetMap.get(_expression).getBytesValuesSV();
-    BigDecimal sumValue = getDefaultResult(aggregationResultHolder);
-    for (int i = 0; i < length; i++) {
-      BigDecimal value = DataTypeConversionFunctions.bytesToBigDecimalObject(valueArray[i]);
-      sumValue = sumValue.add(value);
+    BigDecimal sum = getDefaultResult(aggregationResultHolder);
+    BlockValSet blockValSet = blockValSetMap.get(_expression);
+    switch (blockValSet.getValueType()) {
+      case INT:
+        int[] intValues = blockValSet.getIntValuesSV();
+        for (int i = 0; i < length; i++) {
+          sum = sum.add(BigDecimal.valueOf(intValues[i]));
+        }
+        break;
+      case LONG:
+        long[] longValues = blockValSet.getLongValuesSV();
+        for (int i = 0; i < length; i++) {
+          sum = sum.add(BigDecimal.valueOf(longValues[i]));
+        }
+        break;
+      case FLOAT:
+      case DOUBLE:

Review comment:
       We read them as string values: to construct a `BigDecimal` that matches the decimal value of a floating-point number, we first need to convert the number to its string form.
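
The reason for going through a string can be seen in a short standalone sketch. It assumes the string form is the one produced by `Double.toString`; how Pinot's `BlockValSet` actually produces string values is not shown in this thread. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

class FloatingPointToBigDecimalSketch {
  // Converts through the decimal string form, as the comment above describes.
  static BigDecimal viaString(double value) {
    return new BigDecimal(Double.toString(value));
  }

  // Converts the double directly, which exposes its exact binary expansion.
  static BigDecimal viaDoubleConstructor(double value) {
    return new BigDecimal(value);
  }

  public static void main(String[] args) {
    // Prints the short decimal form of the value.
    System.out.println(viaString(0.1));
    // Prints a long expansion slightly above 0.1, because 0.1 has no
    // exact binary floating-point representation.
    System.out.println(viaDoubleConstructor(0.1));
  }
}
```

The standard library's `BigDecimal.valueOf(double)` is documented to use the `Double.toString` form as well, so it behaves like `viaString` here.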






[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6668: SumPrecision: support all data types and star-tree

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6668:
URL: https://github.com/apache/incubator-pinot/pull/6668#discussion_r592618660



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/aggregator/SumPrecisionValueAggregator.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.aggregator;
+
+import java.math.BigDecimal;
+import org.apache.pinot.common.function.AggregationFunctionType;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+import org.apache.pinot.spi.utils.BigDecimalUtils;
+
+
+public class SumPrecisionValueAggregator implements ValueAggregator<Object, BigDecimal> {
+  public static final DataType AGGREGATED_VALUE_TYPE = DataType.BYTES;
+
+  private int _maxByteSize;
+
+  @Override
+  public AggregationFunctionType getAggregationType() {
+    return AggregationFunctionType.SUMPRECISION;
+  }
+
+  @Override
+  public DataType getAggregatedValueType() {
+    return AGGREGATED_VALUE_TYPE;
+  }
+
+  @Override
+  public BigDecimal getInitialAggregatedValue(Object rawValue) {
+    BigDecimal initialValue = toBigDecimal(rawValue);
+    _maxByteSize = Math.max(_maxByteSize, BigDecimalUtils.byteSize(initialValue));
+    return initialValue;
+  }
+
+  @Override
+  public BigDecimal applyRawValue(BigDecimal value, Object rawValue) {
+    value = value.add(toBigDecimal(rawValue));
+    _maxByteSize = Math.max(_maxByteSize, BigDecimalUtils.byteSize(value));
+    return value;
+  }
+
+  private static BigDecimal toBigDecimal(Object rawValue) {
+    if (rawValue instanceof byte[]) {
+      return BigDecimalUtils.deserialize((byte[]) rawValue);
+    }
+    if (rawValue instanceof Integer || rawValue instanceof Long) {
+      return BigDecimal.valueOf(((Number) rawValue).longValue());
+    }
+    return new BigDecimal(rawValue.toString());

Review comment:
       If the value can be parsed, we parse it. Otherwise, the `BigDecimal` constructor will throw a `NumberFormatException`.
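
A minimal sketch of that behavior, reproducing only the string fallback branch of `toBigDecimal` from the diff (the `byte[]` branch depends on `BigDecimalUtils` and is left out). The class name and the `throwsOnParse` helper are hypothetical and exist only for this example.

```java
import java.math.BigDecimal;

class ParseFallbackSketch {
  // Mirrors the last line of toBigDecimal in the diff: parse the string form.
  static BigDecimal parse(Object rawValue) {
    return new BigDecimal(rawValue.toString());
  }

  // Hypothetical helper: reports whether parsing throws NumberFormatException.
  static boolean throwsOnParse(Object rawValue) {
    try {
      parse(rawValue);
      return false;
    } catch (NumberFormatException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("123.456"));
    System.out.println(parse(1.5d));
    // Unparseable strings are rejected by the constructor.
    System.out.println(throwsOnParse("not-a-number"));
    // NaN and infinities have no BigDecimal representation, so they throw too.
    System.out.println(throwsOnParse(Double.NaN));
    System.out.println(throwsOnParse(Double.POSITIVE_INFINITY));
  }
}
```

Note that a `null` raw value would fail earlier with a `NullPointerException` on `toString()`; whether a `null` can reach this method in Pinot is not established in this thread.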



