Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/05 09:22:38 UTC

[GitHub] [druid] FrankChen021 opened a new pull request #10949: Add support to first/last aggregators for numeric types during ingestion

FrankChen021 opened a new pull request #10949:
URL: https://github.com/apache/druid/pull/10949


   
   Fixes #10702
   
   ### Description
   
   This PR fixes #10702 by adding support for doubleFirst/floatFirst/longFirst and doubleLast/floatLast/longLast during the ingestion phase. It also reverts #10794 to bring back the related UI.
   
   The implementation is inspired by the current stringFirst/stringLast implementation, so the code looks similar. However, this PR does not refactor the existing stringFirst/stringLast implementation to share code with double/float/long; that might be done in the future.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `AbstractSerializablePairSerde` is provided to share serialization code for the long/double/float types
    * `GenericFirstAggregateCombiner` is provided to share the 'first' aggregate combiner code for the long/double/float types (a rough sketch of this idea is shown below)
    * `GenericLastAggregateCombiner` is provided to share the 'last' aggregate combiner code for the long/double/float types
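
   Below is a rough, hedged sketch of the shared combiner idea (assumptions only, not the PR's exact code): fold `SerializablePair<Long, ?>` intermediates and keep the pair with the smallest timestamp (`lhs`). The real `GenericFirstAggregateCombiner` may differ in details such as null handling and generics.

   ```java
   import org.apache.druid.collections.SerializablePair;
   import org.apache.druid.query.aggregation.ObjectAggregateCombiner;
   import org.apache.druid.segment.ColumnValueSelector;

   // Illustrative sketch of a shared "first" combiner over timestamped pairs.
   public class GenericFirstAggregateCombinerSketch<T extends SerializablePair<Long, ?>>
       extends ObjectAggregateCombiner<T>
   {
     private final Class<T> pairClass;
     private T firstValue;

     public GenericFirstAggregateCombinerSketch(Class<T> pairClass)
     {
       this.pairClass = pairClass;
     }

     @Override
     @SuppressWarnings("unchecked")
     public void reset(ColumnValueSelector selector)
     {
       // Start a new combine run with the first pair seen.
       firstValue = (T) selector.getObject();
     }

     @Override
     @SuppressWarnings("unchecked")
     public void fold(ColumnValueSelector selector)
     {
       // Keep whichever pair carries the earlier timestamp (lhs).
       final T candidate = (T) selector.getObject();
       if (candidate != null && (firstValue == null || candidate.lhs < firstValue.lhs)) {
         firstValue = candidate;
       }
     }

     @Override
     public Class<? extends T> classOfObject()
     {
       return pairClass;
     }

     @Override
     public T getObject()
     {
       return firstValue;
     }
   }
   ```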
   
   <hr>
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist below are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   This PR has:
   - [X] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [X] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [X] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [X] added integration tests.
   - [X] been tested in a test Druid cluster.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r614511037



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializableLongObjectPairSerde.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes/deserializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializableLongObjectPairSerde<T extends SerializablePair<Long, ?>>
+    extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializableLongObjectPairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()

Review comment:
       Very good suggestion. 






[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r616283750



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongDoubleSerde.java
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.segment.data.ObjectStrategy;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes a Long-Double pair (SerializablePair<Long, Double>).
+ * The serialization structure is: Long:Double
+ * <p>
+ * The class is used on first/last Double aggregators to store the time and the first/last Double.
+ * (Long:timestamp, Double:value)
+ */
+public class SerializablePairLongDoubleSerde extends AbstractSerializableLongObjectPairSerde<SerializablePairLongDouble>
+{
+  public static final String TYPE_NAME = "serializablePairLongDouble";
+
+  /**
+   * Since SerializablePairLongDouble is subclass of SerializablePair<Long,Double>,
+   * it's safe to declare the generic type of comparator as SerializablePair<Long,Double>.
+   */
+  private static final Comparator<SerializablePair<Long, Double>> VALUE_COMPARATOR = SerializablePair.createNullHandlingComparator(
+      Double::compare,
+      true
+  );
+
+  public SerializablePairLongDoubleSerde()
+  {
+    super(SerializablePairLongDouble.class);
+  }
+
+  @Override
+  public String getTypeName()
+  {
+    return TYPE_NAME;
+  }
+
+  @Override
+  public ObjectStrategy getObjectStrategy()
+  {
+    return new ObjectStrategy<SerializablePairLongDouble>()

Review comment:
       done.






[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r618967367



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/NumericFirstAggregator.java
##########
@@ -49,16 +55,44 @@ public NumericFirstAggregator(BaseLongColumnValueSelector timeSelector, TSelecto
   /**
    * Store the current primitive typed 'first' value
    */
-  abstract void setCurrentValue();
+  abstract void setCurrentValue(ColumnValueSelector valueSelector);
+
+  abstract void setCurrentValue(Number number);
 
   @Override
   public void aggregate()
   {
+    if (needsFoldCheck) {
+
+      // Need to read this first (before time), just in case it's a SerializablePairLongString (we don't know; it's
+      // detected at query time).
+      final Object object = valueSelector.getObject();
+
+      if (object instanceof SerializablePair) {
+
+        // cast to Pair<Long, Number> to support reindex from type such as doubleFirst into longFirst
+        final SerializablePair<Long, Number> pair = (SerializablePair<Long, Number>) object;
+        if (pair.lhs < firstTime) {
+          firstTime = pair.lhs;
+
+          // rhs might be NULL under SQL-compatibility mode

Review comment:
       For this question, the short answer is yes. The query-time processing code is here:
   
   https://github.com/apache/druid/blob/9745d9e1c3f9e7664e289a82acffc1130db4a11e/processing/src/main/java/org/apache/druid/query/aggregation/last/FloatLastAggregatorFactory.java#L228
   
   I think we should return the default value 0 for this case. What do you think, @clintropolis?
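
   As a hedged illustration of that suggestion (hypothetical helper, not the PR's code): when folding a pair whose rhs is null because the segment was written under SQL-compatible null handling, the value could be mapped back to the default 0 before it is stored in the aggregator's primitive field.

   ```java
   import org.apache.druid.collections.SerializablePair;

   // Hypothetical helper showing the proposed fallback: a null rhs folds to the default value 0.
   final class FirstFoldNullExample
   {
     static double valueOrDefault(SerializablePair<Long, Number> pair)
     {
       // rhs may be null when the pair was written in SQL-compatible null handling mode.
       return pair.rhs == null ? 0d : pair.rhs.doubleValue();
     }
   }
   ```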






[GitHub] [druid] suneet-s commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-811916219


   @FrankChen021 thanks for bringing this back to the top of my radar. I will look through this over the next week or so.




[GitHub] [druid] suneet-s commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-817061164


   @FrankChen021 To help with reviewing this PR, could you update the PR description to include some notes on how you chose to implement the solution? For example, it looks like the `AbstractSerializablePairSerde` was based on the `SerializablePairLongStringSerde`.
   
   Also, it would help if you included a description of the different scenarios you tested and the known unsupported conditions, like your comment about first/last aggregators not working in SQL queries.




[GitHub] [druid] clintropolis commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-811490841


   >Hi @clintropolis @suneet-s, could you review this PR whenever it's convenient for you? Since this PR is a little large, I think the SQL problem could be separated into another PR.
   
   Sorry, I will try to get to this soon! I think I have a similar problem to solve with complex types in a different thing I'm working on, so will be thinking about how we can deal with differences between intermediary and finalized types a bit better as well.




[GitHub] [druid] clintropolis commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-797172311


   >Hi @clintropolis , Thanks for your suggestion. I'll try to solve it.
   
   Depending on how big of a change this is, it might be worth splitting out a separate PR to go in before this one. I'll try to think about this a bit as well.




[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611059697



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);

Review comment:
       I will first check whether a unit test can catch this error, because the integration test cases share some data sources with other cases.






[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611058506



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);
+      }
+
+      @Override
+      public Class<T> getClazz()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public T fromByteBuffer(ByteBuffer buffer, int numBytes)
+      {
+        return toPairObject(buffer);
+      }
+
+      @Override
+      public byte[] toBytes(T val)
+      {
+        return pairToBytes(val);
+      }
+    };
+  }
+
+  @Override
+  public GenericColumnSerializer<T> getSerializer(SegmentWriteOutMedium segmentWriteOutMedium, String column)
+  {
+    return LargeColumnSupportedComplexColumnSerializer.create(segmentWriteOutMedium, column, this.getObjectStrategy());
+  }
+
+  protected abstract T toPairObject(ByteBuffer buffer);
+
+  protected abstract byte[] pairToBytes(T val);

Review comment:
       Done






[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-791299986


   One tricky problem remains: first/last aggregators are not supported by SQL queries on reindexed long/float/double columns, while these aggregators work well in native queries.
   
   The types of reindexed double/float/long/string first/last columns are marked as COMPLEX in the schema, and the underlying type is lost when the type is converted into `RelDataType`.
   
   https://github.com/apache/druid/blob/16acd6686a337fc9a12e9aeda443fca973bdc625/sql/src/main/java/org/apache/druid/sql/calcite/table/RowSignatures.java#L135
   
   Since the underlying data type of the column is lost during SQL planning, the current `EarliestLatestReturnTypeInference` is not able to infer the correct return type, and thus cannot create the correct aggregator type for double/float/long.
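
   As a hedged reconstruction of that conversion (illustration only; the exact source at the linked line may differ): every COMPLEX-typed Druid column is mapped to Calcite's generic OTHER type, so the planner cannot tell a longFirst pair column from a stringFirst one.

   ```java
   import org.apache.calcite.rel.type.RelDataType;
   import org.apache.calcite.rel.type.RelDataTypeFactory;
   import org.apache.calcite.sql.type.SqlTypeName;

   // Illustrative mapping: the specific complex type (e.g. serializablePairLongDouble) is not carried over.
   final class ComplexTypeMappingExample
   {
     static RelDataType complexToRelDataType(RelDataTypeFactory typeFactory)
     {
       return typeFactory.createTypeWithNullability(typeFactory.createSqlType(SqlTypeName.OTHER), true);
     }
   }
   ```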
   
   One approach I can think of is to define macros such as double_latest for the different data types at the SQL layer.
   @gianm @clintropolis Do you have any other suggestions?




[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-811588087


   
   > Sorry, I will try to get to this soon! I think I have a similar problem to solve with complex types in a different thing I'm working on, so will be thinking about how we can deal with differences between intermediary and finalized types a bit better as well.
   
   If there are any ideas or progress on solving the complex type problem, could you let me know? I'm also working on the SQL problem.
   
   




[GitHub] [druid] suneet-s commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-817380605


   > SQL query on re-indexed columns with double/float/long first and last aggregators WON'T work. This involves some changes in complex type handling which might be better in another PR.
   
   What happens when a user issues the `EARLIEST(expr, maxBytesPerString)` function on a longFirst column - do we expect that it will fail? Is the error message in this case clear?
   
   I'm asking because it looks like #10332 added handling of complex type columns, which used to be ok because stringFirst/Last was the only type of complex column. But now that we've introduced these column types, the expected behavior is less clear. Perhaps you can add some tests to CalciteQueryTest to validate the behavior that we want users to see when they issue sql queries on these column types.




[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r613828445



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongLongSerde.java
##########
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.common.config.NullHandling;
+
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes a Long-Long pair (SerializablePair<Long, Long>).
+ * The serialization structure is: Long:Long
+ * <p>
+ * The class is used on first/last Long aggregators to store the time and the first/last Long.
+ * Long:Long -> Timestamp:Long

Review comment:
       nit
   ```suggestion
    * (Long:timestamp, Long:value)
   ```

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializableLongObjectPairSerde.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes/deserializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializableLongObjectPairSerde<T extends SerializablePair<Long, ?>>
+    extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializableLongObjectPairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return getLongObjectPairComparator().compare(o1, o2);

Review comment:
       it would be nice to not create a new comparator object for each comparison op.

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/NumericFirstAggregator.java
##########
@@ -49,16 +55,44 @@ public NumericFirstAggregator(BaseLongColumnValueSelector timeSelector, TSelecto
   /**
    * Store the current primitive typed 'first' value
    */
-  abstract void setCurrentValue();
+  abstract void setCurrentValue(ColumnValueSelector valueSelector);
+
+  abstract void setCurrentValue(Number number);
 
   @Override
   public void aggregate()
   {
+    if (needsFoldCheck) {
+
+      // Need to read this first (before time), just in case it's a SerializablePairLongString (we don't know; it's
+      // detected at query time).
+      final Object object = valueSelector.getObject();
+
+      if (object instanceof SerializablePair) {
+
+        // cast to Pair<Long, Number> to support reindex from type such as doubleFirst into longFirst
+        final SerializablePair<Long, Number> pair = (SerializablePair<Long, Number>) object;
+        if (pair.lhs < firstTime) {
+          firstTime = pair.lhs;
+
+          // rhs might be NULL under SQL-compatibility mode

Review comment:
       A bit out of my depth here. What will happen if the aggregate was stored as null in the segment because SQL compatibility was on in the task writing the segment, but then SQL compatibility is turned off when the segment data is being read? Should it still be read as null?

##########
File path: processing/src/test/java/org/apache/druid/query/aggregation/first/FloatFirstAggregationTest.java
##########
@@ -183,6 +201,40 @@ public void testSerde() throws Exception
     Assert.assertEquals(floatFirstAggregatorFactory, mapper.readValue(doubleSpecJson, AggregatorFactory.class));
   }
 
+  @Test
+  public void testFloatFirstAggregateCombiner()
+  {
+    AggregateCombiner floatFirstAggregateCombiner = combiningAggFactory.makeAggregateCombiner();
+
+    SerializablePair[] inputPairs = {
+        new SerializablePair<>(5L, 134.3f),

Review comment:
       Can you also add tests with null values as input and/or in the expected result?

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializableLongObjectPairSerde.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes/deserializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializableLongObjectPairSerde<T extends SerializablePair<Long, ?>>
+    extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializableLongObjectPairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()

Review comment:
       I think it will be cleaner to have each subclass implement getObjectStrategy(); then we would not need three abstract methods.

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java
##########
@@ -36,11 +36,16 @@
   private static final int NULL_VALUE = -1;
 
   /**
-   * Returns whether a given value selector *might* contain SerializablePairLongString objects.

Review comment:
       The class may require a rename now. Maybe FirstLastUtils?






[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r614497886



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializableLongObjectPairSerde.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes/deserializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializableLongObjectPairSerde<T extends SerializablePair<Long, ?>>
+    extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializableLongObjectPairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return getLongObjectPairComparator().compare(o1, o2);

Review comment:
       Implementations of `getLongObjectPairComparator` do not create a new comparator object; they hold a static comparator instance.
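
   A minimal illustration of that pattern (the generic signature here is approximated, not copied from the PR): the concrete serde holds one static comparator, built once with the factory already shown earlier in this thread, and returns it from getLongObjectPairComparator(), so nothing is allocated per comparison.

   ```java
   private static final Comparator<SerializablePair<Long, Double>> VALUE_COMPARATOR =
       SerializablePair.createNullHandlingComparator(Double::compare, true);

   @Override
   protected Comparator<SerializablePair<Long, Double>> getLongObjectPairComparator()
   {
     // Return the shared static instance instead of constructing a comparator on each call.
     return VALUE_COMPARATOR;
   }
   ```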






[GitHub] [druid] FrankChen021 edited a comment on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 edited a comment on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-811588087


   > Sorry, I will try to get to this soon! I think I have a similar problem to solve with complex types in a different thing I'm working on, so will be thinking about how we can deal with differences between intermediary and finalized types a bit better as well.
   
   If there are any ideas or progress on solving the complex type problem, could you let me know? I'm also working on the SQL problem.
   
   




[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611058889



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);
+      }
+
+      @Override
+      public Class<T> getClazz()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public T fromByteBuffer(ByteBuffer buffer, int numBytes)
+      {
+        return toPairObject(buffer);
+      }
+
+      @Override
+      public byte[] toBytes(T val)
+      {
+        return pairToBytes(val);
+      }
+    };
+  }
+
+  @Override
+  public GenericColumnSerializer<T> getSerializer(SegmentWriteOutMedium segmentWriteOutMedium, String column)
+  {
+    return LargeColumnSupportedComplexColumnSerializer.create(segmentWriteOutMedium, column, this.getObjectStrategy());
+  }
+
+  protected abstract T toPairObject(ByteBuffer buffer);
+
+  protected abstract byte[] pairToBytes(T val);

Review comment:
       In the new commit, null check is added to the caller of this method, so it's not necessary to declare this parameter as `Nullable`






[GitHub] [druid] clintropolis commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-796562253


   > One approach I can think of is to define macros such as double_latest for the different data types at the SQL layer.
   @gianm @clintropolis Do you have any other suggestions?
   
   Hmm, so what I have had in mind to solve this is to be able to determine whether a `RowSignature` should be "finalized" or not in terms of the aggregator types. #9638 added some of the pieces needed for this (`getFinalizedType`, etc.) and touches on this idea in the PR description; I just haven't yet gotten back to this core refactoring work, or quite had time to fully think through how to determine when we need the 'finalized' signature or not.
   
   #10277 also added tracking of the "name" of the complex type on `ColumnCapabilities` (which typically is what populates the `RowSignature`) so that is potentially available to give greater detail than `ValueType.COMPLEX`, but I think the finalized type would be the useful information here.
   
   That said, I haven't had a look at this PR at all yet. I will try to get to it sometime soon, maybe I will have some ideas while looking over the code.




[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-819423526


   > > SQL query on re-indexed columns with double/float/long first and last aggregators WON'T work. This involves some changes in complex type handling which might be better in another PR.
   > 
   > What happens when a user issues the `EARLIEST(expr, maxBytesPerString)` function on a longFirst column - do we expect that it will fail? Is the error message in this case clear?
   > 
   > I'm asking because it looks like #10332 added handling of complex type columns, which used to be ok because stringFirst/Last was the only type of complex column. But now that we've introduced these column types, the expected behavior is less clear. Perhaps you can add some tests to CalciteQueryTest to validate the behavior that we want users to see when they issue sql queries on these column types.
   
   EARLIEST/LATEST both work well for stringFirst/Last columns. They also work for columns that are not double/float/long first/last columns.
   
   For double/long/float first/last columns, the following exception message is returned:
   
   ```
   Error: Plan validation failed
   
   org.apache.calcite.runtime.CalciteContextException: From line 3, column 3 to line 3, column 24: Cannot apply 'EARLIEST' to arguments of type 'EARLIEST(<OTHER>)'. Supported form(s): 'EARLIEST(<NUMERIC>)' 'EARLIEST(<BOOLEAN>)' 'EARLIEST(expr, maxBytesPerString)'
   
   org.apache.calcite.tools.ValidationException
   ```




[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r614506613



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java
##########
@@ -36,11 +36,16 @@
   private static final int NULL_VALUE = -1;
 
   /**
-   * Returns whether a given value selector *might* contain SerializablePairLongString objects.

Review comment:
       Yes, the `selectorNeedsFoldCheck` method in `StringFirstLastUtils` is now shared by long/float/double first/last, so it should be extracted out of this class.

    I have not made more changes to this class because stringFirst/Last will be refactored in a new PR to share the new abstract classes provided in this PR, which means this class is also involved.
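
   A hedged sketch of that extraction (names and checks are assumptions, not the PR's code): a shared, hypothetical FirstLastUtils could test against the common SerializablePair supertype instead of SerializablePairLongString, so the numeric first/last aggregators can reuse the fold check.

   ```java
   import org.apache.druid.collections.SerializablePair;
   import org.apache.druid.segment.BaseObjectColumnValueSelector;

   // Hypothetical shared utility; only illustrates the type-agnostic fold check.
   final class FirstLastUtilsSketch
   {
     static boolean selectorNeedsFoldCheck(final BaseObjectColumnValueSelector<?> valueSelector)
     {
       final Class<?> clazz = valueSelector.classOfObject();
       // The selector might yield pair objects if its declared class is SerializablePair,
       // a supertype of it (e.g. Object), or one of its subclasses.
       return clazz.isAssignableFrom(SerializablePair.class) || SerializablePair.class.isAssignableFrom(clazz);
     }
   }
   ```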






[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r614516071



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/NumericFirstAggregator.java
##########
@@ -49,16 +55,44 @@ public NumericFirstAggregator(BaseLongColumnValueSelector timeSelector, TSelecto
   /**
    * Store the current primitive typed 'first' value
    */
-  abstract void setCurrentValue();
+  abstract void setCurrentValue(ColumnValueSelector valueSelector);
+
+  abstract void setCurrentValue(Number number);
 
   @Override
   public void aggregate()
   {
+    if (needsFoldCheck) {
+
+      // Need to read this first (before time), just in case it's a SerializablePairLongString (we don't know; it's
+      // detected at query time).
+      final Object object = valueSelector.getObject();
+
+      if (object instanceof SerializablePair) {
+
+        // cast to Pair<Long, Number> to support reindex from type such as doubleFirst into longFirst
+        final SerializablePair<Long, Number> pair = (SerializablePair<Long, Number>) object;
+        if (pair.lhs < firstTime) {
+          firstTime = pair.lhs;
+
+          // rhs might be NULL under SQL-compatibility mode

Review comment:
       Good question, I will check it later.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611058596



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);
+      }
+
+      @Override
+      public Class<T> getClazz()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public T fromByteBuffer(ByteBuffer buffer, int numBytes)
+      {
+        return toPairObject(buffer);
+      }
+
+      @Override
+      public byte[] toBytes(T val)

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-803932960


   > 
   > Hmm, so what I have had in mind to solve this is to be able to determine whether a `RowSignature` should be "finalized" or not in terms of the aggregator types. #9638 added some of the pieces needed for this (`getFinalizedType`, etc) and touches on this idea in the PR description, I just haven't yet got back this core refactoring work, or quite had time to fully think through how to determine when we need the 'finalized' signature or not.
   > 
   > #10277 also added tracking of the "name" of the complex type on `ColumnCapabilities` (which typically is what populates the `RowSignature`) so that is potentially available to give greater detail than `ValueType.COMPLEX`, but I think the finalized type would be the useful information here.
   > 
   > That said, I haven't had a look at this PR at all yet. I will try to get to it sometime soon, maybe I will have some ideas while looking over the code.
   
   The name of the complex type has been set in the first/last aggregators:
   
   https://github.com/apache/druid/pull/10949/files#diff-9fedc71bcede0adcbb1deadbef33e9e1de175ee11209eb4d5d676580104f2c03R231
   
   I checked how `RowSignature` works today and found that there's no way to get that name from a `RowSignature`, because the name is not passed in when the `RowSignature` is instantiated:
   
   https://github.com/apache/druid/blob/51d2c61f1cd03f2d3fe396c5bb4738293dc92107/sql/src/main/java/org/apache/druid/sql/calcite/schema/DruidSchema.java#L698
   
   So, is it reasonable to change this code to pass the type name, in addition to the value type, to `RowSignature` when the value type is COMPLEX?
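   
   For illustration only (the overload below does not exist today; this is just a sketch of the idea, and the surrounding code is simplified):
   
   ```java
   // Hypothetical sketch: when a column is COMPLEX, also hand its complex type name to the
   // RowSignature builder, e.g. via an overload that would have to be added first.
   if (valueType == ValueType.COMPLEX) {
     builder.add(columnName, ValueType.COMPLEX, complexTypeName); // hypothetical overload
   } else {
     builder.add(columnName, valueType);
   }
   ```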
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-797171292


   Hi @clintropolis, thanks for your suggestion. I'll try to solve it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-817379603


   > > @FrankChen021 To help with reviewing this PR, could you update the PR description to include some notes on how you chose to implement the solution for this? For example, it looks like the `AbstractSerializablePairSerde` was based off the `SerializablePairLongStringSerde`.
   > > Also, it would help if you included a description of the different scenarios you tested, and known unsupported conditions - like your comment about first/last aggregators not working in SQL queries.
   > 
   > Description of this PR has been updated. Let me know if there's anything left.
   
   Thanks @FrankChen021 I will take a look again this week!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611051246



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators

Review comment:
       It's reasonable to refactor `SerializablePairLongStringSerde` to be a subclass of `AbstractSerializablePairSerde`, but that change is not tightly related to this PR, which is already a little large. I think a follow-up PR can resolve this once this PR is merged.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r615202300



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java
##########
@@ -36,11 +36,16 @@
   private static final int NULL_VALUE = -1;
 
   /**
-   * Returns whether a given value selector *might* contain SerializablePairLongString objects.

Review comment:
       Yes




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-820912588


   @FrankChen021 - have you run any performance tests/benchmarks for ingestion-time rollup? That would be handy for ruling out any perf bugs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611058540



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-817062657


   @suneet-s Thanks for your review. I will address all the comments you left and update this PR later today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-817155211


   > @FrankChen021 To help with reviewing this PR, could you update the PR description to include some notes on how you chose to implement the solution for this? For example, it looks like the `AbstractSerializablePairSerde` was based off the `SerializablePairLongStringSerde`.
   > 
   > Also, it would help if you included a description of the different scenarios you tested, and known unsupported conditions - like your comment about first/last aggregators not working in SQL queries.
   
   Description of this PR has been updated. Let me know if there's anything left.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r614563208



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongDoubleSerde.java
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.segment.data.ObjectStrategy;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+/**
+ * The class serializes a Long-Double pair (SerializablePair<Long, Double>).
+ * The serialization structure is: Long:Double
+ * <p>
+ * The class is used on first/last Double aggregators to store the time and the first/last Double.
+ * (Long:timestamp, Double:value)
+ */
+public class SerializablePairLongDoubleSerde extends AbstractSerializableLongObjectPairSerde<SerializablePairLongDouble>
+{
+  public static final String TYPE_NAME = "serializablePairLongDouble";
+
+  /**
+   * Since SerializablePairLongDouble is subclass of SerializablePair<Long,Double>,
+   * it's safe to declare the generic type of comparator as SerializablePair<Long,Double>.
+   */
+  private static final Comparator<SerializablePair<Long, Double>> VALUE_COMPARATOR = SerializablePair.createNullHandlingComparator(
+      Double::compare,
+      true
+  );
+
+  public SerializablePairLongDoubleSerde()
+  {
+    super(SerializablePairLongDouble.class);
+  }
+
+  @Override
+  public String getTypeName()
+  {
+    return TYPE_NAME;
+  }
+
+  @Override
+  public ObjectStrategy getObjectStrategy()
+  {
+    return new ObjectStrategy<SerializablePairLongDouble>()

Review comment:
       We can use a static ObjectStrategy since it is stateless. It seems right now we are creating a new object for every deserialization.  
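    
       A rough sketch of that suggestion (illustrative only; the serialization bodies just mirror the Long + null-byte + value layout this PR already uses):
    
       ```java
       // Illustrative sketch: the strategy is stateless, so a single instance can live in a static
       // field instead of a new anonymous instance being allocated on every getObjectStrategy() call.
       private static final ObjectStrategy<SerializablePairLongDouble> OBJECT_STRATEGY =
           new ObjectStrategy<SerializablePairLongDouble>()
           {
             @Override
             public int compare(@Nullable SerializablePairLongDouble o1, @Nullable SerializablePairLongDouble o2)
             {
               // Nulls sort first; otherwise delegate to the null-handling value comparator above.
               if (o1 == null) {
                 return o2 == null ? 0 : -1;
               }
               if (o2 == null) {
                 return 1;
               }
               return VALUE_COMPARATOR.compare(o1, o2);
             }
    
             @Override
             public Class<SerializablePairLongDouble> getClazz()
             {
               return SerializablePairLongDouble.class;
             }
    
             @Override
             public SerializablePairLongDouble fromByteBuffer(ByteBuffer buffer, int numBytes)
             {
               final ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();
               final long lhs = readOnlyBuffer.getLong();
               if (readOnlyBuffer.get() == NullHandling.IS_NOT_NULL_BYTE) {
                 return new SerializablePairLongDouble(lhs, readOnlyBuffer.getDouble());
               }
               return new SerializablePairLongDouble(lhs, null);
             }
    
             @Override
             public byte[] toBytes(@Nullable SerializablePairLongDouble val)
             {
               if (val == null) {
                 return new byte[]{};
               }
               final boolean hasValue = val.rhs != null;
               final ByteBuffer bbuf = ByteBuffer.allocate(
                   Long.BYTES + Byte.BYTES + (hasValue ? Double.BYTES : 0)
               );
               bbuf.putLong(val.lhs);
               if (hasValue) {
                 bbuf.put(NullHandling.IS_NOT_NULL_BYTE);
                 bbuf.putDouble(val.rhs);
               } else {
                 bbuf.put(NullHandling.IS_NULL_BYTE);
               }
               return bbuf.array();
             }
           };
       ```
    
       With that in place, `getObjectStrategy()` could simply `return OBJECT_STRATEGY;`.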

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java
##########
@@ -36,11 +36,16 @@
   private static final int NULL_VALUE = -1;
 
   /**
-   * Returns whether a given value selector *might* contain SerializablePairLongString objects.

Review comment:
       oh ok. so you will make the changes in that PR. is that right? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r611059030



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongFloatSerde.java
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.common.config.NullHandling;
+
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Long-Float pair (SerializablePair<Long, Float>).
+ * The serialization structure is: Long:Float
+ * <p>
+ * The class is used on first/last Float aggregators to store the time and the first/last Float.
+ * Long:Float -> Timestamp:Float
+ */
+public class SerializablePairLongFloatSerde extends AbstractSerializablePairSerde<SerializablePairLongFloat>
+{
+  public static final String TYPE_NAME = "serializablePairLongFloat";
+
+  public SerializablePairLongFloatSerde()
+  {
+    super(SerializablePairLongFloat.class);
+  }
+
+  @Override
+  public String getTypeName()
+  {
+    return TYPE_NAME;
+  }
+
+  @Override
+  protected SerializablePairLongFloat toPairObject(ByteBuffer buffer)
+  {
+    final ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();
+    long lhs = readOnlyBuffer.getLong();
+    boolean isNotNull = readOnlyBuffer.get() == NullHandling.IS_NOT_NULL_BYTE;
+    if (isNotNull) {
+      return new SerializablePairLongFloat(lhs, readOnlyBuffer.getFloat());
+    } else {
+      return new SerializablePairLongFloat(lhs, null);
+    }
+  }
+
+  @Override
+  protected byte[] pairToBytes(SerializablePairLongFloat val)

Review comment:
       In the new commit, a null check is added to the caller of this method, so there is no need to handle null here.
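   
       Roughly, the caller-side guard looks like this (sketch only; what exactly is returned for a null pair is up to the actual commit):
   
       ```java
       // Sketch: ObjectStrategy#toBytes handles the null case, so pairToBytes() can assume
       // a non-null pair.
       @Override
       public byte[] toBytes(@Nullable T val)
       {
         return val == null ? new byte[]{} : pairToBytes(val);
       }
       ```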




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r618964596



##########
File path: processing/src/test/java/org/apache/druid/query/aggregation/first/FloatFirstAggregationTest.java
##########
@@ -183,6 +201,40 @@ public void testSerde() throws Exception
     Assert.assertEquals(floatFirstAggregatorFactory, mapper.readValue(doubleSpecJson, AggregatorFactory.class));
   }
 
+  @Test
+  public void testFloatFirstAggregateCombiner()
+  {
+    AggregateCombiner floatFirstAggregateCombiner = combiningAggFactory.makeAggregateCombiner();
+
+    SerializablePair[] inputPairs = {
+        new SerializablePair<>(5L, 134.3f),

Review comment:
       Sorry for the late response. Null-related tests have been added in the latest commit.
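   
       For reference, the added null cases are roughly of this shape (values here are made up; the committed test may differ):
   
       ```java
       SerializablePair[] inputPairsWithNulls = {
           new SerializablePair<>(1L, null),
           new SerializablePair<>(5L, 134.3f),
       };
       ```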




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-810014452


   Hi @clintropolis @suneet-s, could you review this PR whenever it's convenient for you? Since this PR is a little large, I think the SQL problem could be split out into another PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10949:
URL: https://github.com/apache/druid/pull/10949#issuecomment-821761443


   > @FrankChen021 - have you run any performance tests/benchmark for ingestion time rollup? That will be handy to rule out any perf bug.
   
   I have not, but I will. Thanks for pointing this out.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #10949: Add support to first/last aggregators for numeric types during ingestion

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #10949:
URL: https://github.com/apache/druid/pull/10949#discussion_r610969176



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators

Review comment:
       Can you describe why you chose not to make `SerializablePairLongStringSerde` extend this class as well?

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde

Review comment:
       nit: change class name to `AbstractSerializableLongObjectPairSerde`
   ```suggestion
   public abstract class AbstractSerializableLongObjectPairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
   ```

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongFloatSerde.java
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import org.apache.druid.common.config.NullHandling;
+
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Long-Float pair (SerializablePair<Long, Float>).
+ * The serialization structure is: Long:Float
+ * <p>
+ * The class is used on first/last Float aggregators to store the time and the first/last Float.
+ * Long:Float -> Timestamp:Float
+ */
+public class SerializablePairLongFloatSerde extends AbstractSerializablePairSerde<SerializablePairLongFloat>
+{
+  public static final String TYPE_NAME = "serializablePairLongFloat";
+
+  public SerializablePairLongFloatSerde()
+  {
+    super(SerializablePairLongFloat.class);
+  }
+
+  @Override
+  public String getTypeName()
+  {
+    return TYPE_NAME;
+  }
+
+  @Override
+  protected SerializablePairLongFloat toPairObject(ByteBuffer buffer)
+  {
+    final ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();
+    long lhs = readOnlyBuffer.getLong();
+    boolean isNotNull = readOnlyBuffer.get() == NullHandling.IS_NOT_NULL_BYTE;
+    if (isNotNull) {
+      return new SerializablePairLongFloat(lhs, readOnlyBuffer.getFloat());
+    } else {
+      return new SerializablePairLongFloat(lhs, null);
+    }
+  }
+
+  @Override
+  protected byte[] pairToBytes(SerializablePairLongFloat val)

Review comment:
       Since the super class says `val` can be null, all these implementations should be able to handle a null `val`.
   
   I haven't dug in yet to know what this means, but this same pattern exists in all 3 implementations.

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);
+      }
+
+      @Override
+      public Class<T> getClazz()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public T fromByteBuffer(ByteBuffer buffer, int numBytes)
+      {
+        return toPairObject(buffer);
+      }
+
+      @Override
+      public byte[] toBytes(T val)
+      {
+        return pairToBytes(val);
+      }
+    };
+  }
+
+  @Override
+  public GenericColumnSerializer<T> getSerializer(SegmentWriteOutMedium segmentWriteOutMedium, String column)
+  {
+    return LargeColumnSupportedComplexColumnSerializer.create(segmentWriteOutMedium, column, this.getObjectStrategy());
+  }
+
+  protected abstract T toPairObject(ByteBuffer buffer);
+
+  protected abstract byte[] pairToBytes(T val);

Review comment:
       ```suggestion
     protected abstract byte[] pairToBytes(@Nullable T val);
   ```

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);
+      }
+
+      @Override
+      public Class<T> getClazz()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public T fromByteBuffer(ByteBuffer buffer, int numBytes)
+      {
+        return toPairObject(buffer);
+      }
+
+      @Override
+      public byte[] toBytes(T val)

Review comment:
       ```suggestion
         public byte[] toBytes(@Nullable T val)
   ```

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);
+      }
+
+      @Override
+      public Class<T> getClazz()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public T fromByteBuffer(ByteBuffer buffer, int numBytes)
+      {
+        return toPairObject(buffer);
+      }
+
+      @Override
+      public byte[] toBytes(T val)
+      {
+        return pairToBytes(val);
+      }
+    };
+  }
+
+  @Override
+  public GenericColumnSerializer<T> getSerializer(SegmentWriteOutMedium segmentWriteOutMedium, String column)
+  {
+    return LargeColumnSupportedComplexColumnSerializer.create(segmentWriteOutMedium, column, this.getObjectStrategy());
+  }
+
+  protected abstract T toPairObject(ByteBuffer buffer);
+
+  protected abstract byte[] pairToBytes(T val);

Review comment:
       javadocs please

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/AbstractSerializablePairSerde.java
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.google.common.primitives.Longs;
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.segment.GenericColumnSerializer;
+import org.apache.druid.segment.column.ColumnBuilder;
+import org.apache.druid.segment.data.GenericIndexed;
+import org.apache.druid.segment.data.ObjectStrategy;
+import org.apache.druid.segment.serde.ComplexColumnPartSupplier;
+import org.apache.druid.segment.serde.ComplexMetricExtractor;
+import org.apache.druid.segment.serde.ComplexMetricSerde;
+import org.apache.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer;
+import org.apache.druid.segment.writeout.SegmentWriteOutMedium;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The class serializes a Pair<Long, ?> object for double/float/longFirst and double/float/longLast aggregators
+ */
+public abstract class AbstractSerializablePairSerde<T extends SerializablePair<Long, ?>> extends ComplexMetricSerde
+{
+  private final Class<T> pairClassObject;
+
+  public AbstractSerializablePairSerde(Class<T> pairClassObject)
+  {
+    this.pairClassObject = pairClassObject;
+  }
+
+  @Override
+  public ComplexMetricExtractor getExtractor()
+  {
+    return new ComplexMetricExtractor()
+    {
+      @Override
+      public Class<T> extractedClass()
+      {
+        return pairClassObject;
+      }
+
+      @Override
+      public Object extractValue(InputRow inputRow, String metricName)
+      {
+        return inputRow.getRaw(metricName);
+      }
+    };
+  }
+
+  @Override
+  public void deserializeColumn(ByteBuffer buffer, ColumnBuilder columnBuilder)
+  {
+    final GenericIndexed column = GenericIndexed.read(buffer, getObjectStrategy(), columnBuilder.getFileMapper());
+    columnBuilder.setComplexColumnSupplier(new ComplexColumnPartSupplier(getTypeName(), column));
+  }
+
+  @Override
+  public ObjectStrategy<T> getObjectStrategy()
+  {
+    return new ObjectStrategy<T>()
+    {
+      @Override
+      public int compare(@Nullable T o1, @Nullable T o2)
+      {
+        return Longs.compare(o1.lhs, o2.lhs);

Review comment:
       This does not correctly handle the case where either o1 or o2 is null. See `StringFirstAggregatorFactory#VALUE_COMPARATOR`; we'll want similar behavior here.
   
       Would it be possible to update the integration tests that were added so they surface this error?
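   
       Something along these lines would at least make the comparison null-tolerant (sketch only; matching the exact ordering of `StringFirstAggregatorFactory#VALUE_COMPARATOR` is left to the PR):
   
       ```java
       @Override
       public int compare(@Nullable T o1, @Nullable T o2)
       {
         // Null-tolerant ordering: nulls sort first, otherwise compare timestamps as before.
         if (o1 == null) {
           return o2 == null ? 0 : -1;
         }
         if (o2 == null) {
           return 1;
         }
         return Longs.compare(o1.lhs, o2.lhs);
       }
       ```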




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org