You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/01/07 09:35:51 UTC

[GitHub] [druid] gianm commented on a change in pull request #12078: Grouping on arrays as arrays

gianm commented on a change in pull request #12078:
URL: https://github.com/apache/druid/pull/12078#discussion_r780107136



##########
File path: processing/src/main/java/org/apache/druid/segment/data/ComparableIntArray.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.data;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonValue;
+import com.google.common.hash.Hashing;
+
+import java.util.Arrays;
+
+public class ComparableIntArray implements Comparable<ComparableIntArray>

Review comment:
       It'd be good to have unit tests specifically for this class. Especially for hashCode, equals, and compareTo.
   
   Also I suggest implementing toString, since it'll make unit tests & debugging easier.

##########
File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/GroupByQueryEngineV2.java
##########
@@ -340,7 +348,8 @@ public static boolean isAllSingleValueDims(
 
               // Now check column capabilities, which must be present and explicitly not multi-valued
               final ColumnCapabilities columnCapabilities = inspector.getColumnCapabilities(dimension.getDimension());
-              return columnCapabilities != null && columnCapabilities.hasMultipleValues().isFalse();
+              return dimension.getOutputType().equals(ColumnType.STRING_ARRAY)

Review comment:
       This is confusing; it's changing the meaning of the function away from its name. It's called `isAllSingleValueDims`, but it'll return true if some columns are STRING_ARRAY? Seems weird.
   
   How about renaming the function `hasNoExplodingDimensions` and updating the javadocs accordingly?

##########
File path: processing/src/test/java/org/apache/druid/query/groupby/GroupByQueryRunnerTest.java
##########
@@ -1306,6 +1308,321 @@ public void testMultiValueDimension()
     TestHelper.assertExpectedObjects(expectedResults, results, "multi-value-dim");
   }
 
+  @Test
+  public void testMultiValueDimensionAsArray()
+  {
+    if (config.getDefaultStrategy().equals(GroupByStrategySelector.STRATEGY_V1)) {
+      return;
+    }
+
+    // Cannot vectorize due to multi-value dimensions.
+    cannotVectorize();
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(QueryRunnerTestHelper.DATA_SOURCE)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setVirtualColumns(new ExpressionVirtualColumn("v0", "array(placementish)", ColumnType.STRING_ARRAY,
+                                                       ExprMacroTable.nil()
+        ))
+        .setDimensions(
+            new DefaultDimensionSpec("v0", "alias", ColumnType.STRING_ARRAY))
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index"))
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .build();
+
+    List<ResultRow> expectedResults = Arrays.asList(
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("a", "preferred"), "rows", 2L, "idx", 282L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("b", "preferred"), "rows", 2L, "idx", 230L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("e", "preferred"), "rows", 2L, "idx", 324L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("h", "preferred"), "rows", 2L, "idx", 233L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("m", "preferred"), "rows", 6L, "idx", 5317L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("n", "preferred"), "rows", 2L, "idx", 235L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("p", "preferred"), "rows", 6L, "idx", 5405L),
+        makeRow(query, "2011-04-01", "alias", ComparableStringArray.of("preferred", "t"), "rows", 4L, "idx", 420L)
+    );
+
+    Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, query);
+    TestHelper.assertExpectedObjects(expectedResults, results, "multi-value-dim-groupby-arrays");
+  }
+
+  @Test
+  public void testMultiValueDimensionAsArrayWithOtherDims()
+  {
+    if (config.getDefaultStrategy().equals(GroupByStrategySelector.STRATEGY_V1)) {
+      return;
+    }
+
+    // Cannot vectorize due to multi-value dimensions.
+    cannotVectorize();
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(QueryRunnerTestHelper.DATA_SOURCE)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setVirtualColumns(new ExpressionVirtualColumn("v0", "array(placementish)", ColumnType.STRING_ARRAY,
+                                                       ExprMacroTable.nil()
+        ))
+        .setDimensions(
+            new DefaultDimensionSpec("v0", "alias", ColumnType.STRING_ARRAY),
+            new DefaultDimensionSpec("quality", "quality")
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index"))
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .build();
+
+    List<ResultRow> expectedResults = Arrays.asList(
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("a", "preferred"),
+            "quality",
+            "automotive",
+            "rows",
+            2L,
+            "idx",
+            282L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("b", "preferred"),
+            "quality",
+            "business",
+            "rows",
+            2L,
+            "idx",
+            230L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("e", "preferred"),
+            "quality",
+            "entertainment",
+            "rows",
+            2L,
+            "idx",
+            324L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("h", "preferred"),
+            "quality",
+            "health",
+            "rows",
+            2L,
+            "idx",
+            233L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("m", "preferred"),
+            "quality",
+            "mezzanine",
+            "rows",
+            6L,
+            "idx",
+            5317L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("n", "preferred"),
+            "quality",
+            "news",
+            "rows",
+            2L,
+            "idx",
+            235L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("p", "preferred"),
+            "quality",
+            "premium",
+            "rows",
+            6L,
+            "idx",
+            5405L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("preferred", "t"),
+            "quality",
+            "technology",
+            "rows",
+            2L,
+            "idx",
+            175L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias",
+            ComparableStringArray.of("preferred", "t"),
+            "quality",
+            "travel",
+            "rows",
+            2L,
+            "idx",
+            245L
+        )
+    );
+
+    Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, query);
+    TestHelper.assertExpectedObjects(expectedResults, results, "multi-value-dims-groupby-arrays");
+  }
+
+  @Test
+  public void testMultiValueDimensionAsArrayWithVirtualDim()

Review comment:
       Seems like the same test as testMultiValueDimensionAsArrayWithOtherDims.

##########
File path: processing/src/main/java/org/apache/druid/segment/data/ComparableStringArray.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.data;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonValue;
+import com.google.common.hash.Hashing;
+
+import java.util.Arrays;
+
+public class ComparableStringArray implements Comparable<ComparableStringArray>

Review comment:
       It'd be good to have unit tests specifically for this class. Especially for hashCode, equals, and compareTo.
   
   Also I suggest implementing toString, since it'll make unit tests & debugging easier.

##########
File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java
##########
@@ -1426,6 +1464,186 @@ private RowBasedKeySerdeHelper makeNumericSerdeHelper(
       }
     }
 
+    private class ListRowBasedKeySerdeHelper implements RowBasedKeySerdeHelper
+    {
+      final int keyBufferPosition;
+      final BufferComparator bufferComparator;
+
+      public ListRowBasedKeySerdeHelper(
+          int keyBufferPosition,
+          @Nullable StringComparator stringComparator
+      )
+      {
+        this.keyBufferPosition = keyBufferPosition;
+        final StringComparator comparator = stringComparator == null
+                                            ? StringComparators.LEXICOGRAPHIC
+                                            : stringComparator;
+
+        this.bufferComparator = (lhsBuffer, rhsBuffer, lhsPosition, rhsPosition) -> {
+          final ComparableList lhsComparableArray = listDictionary.get(lhsBuffer.getInt(lhsPosition
+                                                                                        + keyBufferPosition));
+          final ComparableList rhsComparableArray = listDictionary.get(rhsBuffer.getInt(rhsPosition
+                                                                                        + keyBufferPosition));
+          if (lhsComparableArray == null && rhsComparableArray == null) {
+            return 0;
+          } else if (lhsComparableArray == null) {
+            return -1;
+          } else if (rhsComparableArray == null) {
+            return 1;
+          }
+
+          List lhs = lhsComparableArray.getDelegate();
+          List rhs = rhsComparableArray.getDelegate();
+
+          int minLength = Math.min(lhs.size(), rhs.size());
+
+          //noinspection ArrayEquality
+          if (lhs == rhs) {
+            return 0;
+          }
+          for (int i = 0; i < minLength; i++) {
+            final int cmp = comparator.compare(String.valueOf(lhs.get(i)), String.valueOf(rhs.get(i)));
+            if (cmp == 0) {
+              continue;
+            }
+            return cmp;
+          }
+          if (lhs.size() == rhs.size()) {
+            return 0;
+          } else if (lhs.size() < rhs.size()) {
+            return -1;
+          }
+          return 1;
+        };
+      }
+
+      @Override
+      public int getKeyBufferValueSize()
+      {
+        return Integer.BYTES;
+      }
+
+      @Override
+      public boolean putToKeyBuffer(RowBasedKey key, int idx)
+      {
+        final ComparableList comparableList = (ComparableList) key.getKey()[idx];
+        int id = reverseDictionary.getInt(comparableList);
+        if (id == UNKNOWN_DICTIONARY_ID) {
+          id = listDictionary.size();
+          reverseListDictionary.put(comparableList, id);
+          listDictionary.add(comparableList);
+        }
+        keyBuffer.putInt(id);
+        return true;
+      }
+
+      @Override
+      public void getFromByteBuffer(ByteBuffer buffer, int initialOffset, int dimValIdx, Comparable[] dimValues)
+      {
+        dimValues[dimValIdx] = listDictionary.get(buffer.getInt(initialOffset + keyBufferPosition));
+      }
+
+      @Override
+      public BufferComparator getBufferComparator()
+      {
+        return bufferComparator;
+      }
+    }
+    private class ArrayRowBasedKeySerdeHelper implements RowBasedKeySerdeHelper
+    {
+      final int keyBufferPosition;
+      final BufferComparator bufferComparator;
+
+      ArrayRowBasedKeySerdeHelper(
+          int keyBufferPosition,
+          @Nullable StringComparator stringComparator
+      )
+      {
+        //TODO: karan : add pushLimitDown ?

Review comment:
       Please either resolve or remove this TODO before committing, as appropriate.

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/Calcites.java
##########
@@ -128,27 +128,10 @@ public static String escapeStringLiteral(final String s)
   }
 
   /**
-   * Convert {@link RelDataType} to the most appropriate {@link ValueType}, coercing all ARRAY types to STRING (until
-   * the time is right and we are more comfortable handling Druid ARRAY types in all parts of the engine).
-   *
-   * Callers who are not scared of ARRAY types should isntead call {@link #getValueTypeForRelDataTypeFull(RelDataType)},
-   * which returns the most accurate conversion of {@link RelDataType} to {@link ValueType}.
+   * Convert {@link RelDataType} to the most appropriate {@link ValueType}}.
    */
   @Nullable
   public static ColumnType getColumnTypeForRelDataType(final RelDataType type)

Review comment:
       I take it that no callers are "scared of ARRAY types" anymore.
   
   1. What are the main risks of making this change, & are they mitigated by testing?
   2. Do we need a feature flag?

##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionHandlerUtils.java
##########
@@ -376,11 +374,58 @@ public static Float convertObjectToFloat(@Nullable Object valObj, boolean report
         return convertObjectToDouble(obj, reportParseExceptions);
       case STRING:
         return convertObjectToString(obj);
+      case ARRAY:
+        switch (type.getElementType().getType()) {
+          case STRING:
+            return convertToComparableStringArray(obj);
+          default:
+            return convertToList(obj);
+        }
+
       default:
         throw new IAE("Type[%s] is not supported for dimensions!", type);
     }
   }
 
+  private static ComparableList convertToList(Object obj)
+  {
+    if (obj == null) {
+      return null;
+    }
+    if (obj instanceof List) {
+      return new ComparableList((List) obj);
+    }
+    if (obj instanceof ComparableList) {
+      return (ComparableList) obj;
+    }
+    throw new ISE("Unable to convert type %s to %s", obj.getClass().getName(), ComparableList.class.getName());
+  }
+
+
+  @Nullable
+  public static ComparableStringArray convertToComparableStringArray(Object obj)
+  {
+    if (obj == null) {
+      return null;
+    }
+    if (obj instanceof ComparableStringArray) {
+      return (ComparableStringArray) obj;
+    }
+    if (obj instanceof String[]) {
+      return ComparableStringArray.of((String[]) obj);
+    }
+    // Jackson converts the serialized array into a string. Converting it back to a string array

Review comment:
       Do you mean "Jackson converts the serialized array into a list"?

##########
File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/GroupByQueryEngineV2.java
##########
@@ -403,6 +412,20 @@ public GroupByColumnSelectorStrategy makeColumnSelectorStrategy(
           return makeNullableNumericStrategy(new FloatGroupByColumnSelectorStrategy());
         case DOUBLE:
           return makeNullableNumericStrategy(new DoubleGroupByColumnSelectorStrategy());
+        case ARRAY:
+          switch (capabilities.getElementType().getType()) {
+            case LONG:
+              return new ListLongGroupByColumnSelectorStrategy();

Review comment:
       Any reason why it's ListLong but ArrayString?

##########
File path: processing/src/test/java/org/apache/druid/query/groupby/GroupByQueryRunnerTest.java
##########
@@ -1306,6 +1308,321 @@ public void testMultiValueDimension()
     TestHelper.assertExpectedObjects(expectedResults, results, "multi-value-dim");
   }
 
+  @Test
+  public void testMultiValueDimensionAsArray()

Review comment:
       Please include other tests for:
   
   - Dimension type STRING_ARRAY on an underlying column of type STRING that happens to be single-valued, like `placement`
   - Dimension type STRING_ARRAY on an underlying column of type STRING that happens to be multi-valued, like `placementish`
   - Dimension type STRING_ARRAY on an underlying column of numeric type, like `index`
   - Dimension type STRING on an underlying column of type STRING_ARRAY (such as a virtual column)
   - An extraction dimension spec of type STRING on a column of type STRING_ARRAY, like `new ExtractionDimensionSpec("v0", "alias", ColumnType.STRING, new SubstringDimExtractionFn(1, 1))`
   - An extraction dimension spec of type STRING_ARRAY on a column of type STRING_ARRAY, like `new ExtractionDimensionSpec("v0", "alias", ColumnType.STRING_ARRAY, new SubstringDimExtractionFn(1, 1))`
   - A virtual column that builds an array from a numeric type, like `array(index)`, and declares its output type as STRING_ARRAY
   - A nested groupBy query, where the inner and outer queries both have dimensions of type STRING_ARRAY
   
   If any of these aren't supposed to work, use `expectedException` to verify that the exception message is good. Otherwise, verify that the results are what you expect to see.

##########
File path: sql/src/test/java/org/apache/druid/sql/calcite/CalciteArraysQueryTest.java
##########
@@ -923,6 +876,277 @@ public void testArrayOffset() throws Exception
     );
   }
 
+
+  @Test
+  public void testArrayGroupAsArray() throws Exception
+  {
+    // Cannot vectorize as we donot have support in native query subsytem for grouping on arrays as keys
+    cannotVectorize();
+    testQuery(
+        "SELECT ARRAY[dim3], SUM(cnt) FROM druid.numfoo GROUP BY 1 ORDER BY 2 DESC",
+        QUERY_CONTEXT_NO_STRINGIFY_ARRAY,
+        ImmutableList.of(
+            GroupByQuery.builder()
+                        .setDataSource(CalciteTests.DATASOURCE3)
+                        .setInterval(querySegmentSpec(Filtration.eternity()))
+                        .setGranularity(Granularities.ALL)
+                        .setVirtualColumns(expressionVirtualColumn(
+                            "v0",
+                            "array(\"dim3\")",
+                            ColumnType.STRING_ARRAY
+                        ))
+                        .setDimensions(
+                            dimensions(
+                                new DefaultDimensionSpec("v0", "_d0", ColumnType.STRING_ARRAY)
+                            )
+                        )
+                        .setAggregatorSpecs(aggregators(new LongSumAggregatorFactory("a0", "cnt")))
+                        .setLimitSpec(new DefaultLimitSpec(
+                            ImmutableList.of(new OrderByColumnSpec(
+                                "a0",
+                                OrderByColumnSpec.Direction.DESCENDING,
+                                StringComparators.NUMERIC
+                            )),
+                            Integer.MAX_VALUE
+                        ))
+                        .setContext(QUERY_CONTEXT_NO_STRINGIFY_ARRAY)
+                        .build()
+        ),
+        ImmutableList.of(
+            new Object[]{
+                new ArrayList<String>()
+                {
+                  {
+                    add(null);
+                  }
+                }, 3L
+            },
+            new Object[]{ImmutableList.of("a", "b"), 1L},
+            new Object[]{ImmutableList.of("b", "c"), 1L},
+            new Object[]{ImmutableList.of("d"), 1L}
+        )
+    );
+  }
+
+  @Test
+  public void testArrayGroupAsLongArray() throws Exception
+  {
+    // Cannot vectorize as we donot have support in native query subsytem for grouping on arrays as keys
+    cannotVectorize();
+    testQuery(
+        "SELECT ARRAY[l1], SUM(cnt) FROM druid.numfoo GROUP BY 1 ORDER BY 2 DESC",
+        QUERY_CONTEXT_NO_STRINGIFY_ARRAY,
+        ImmutableList.of(
+            GroupByQuery.builder()
+                        .setDataSource(CalciteTests.DATASOURCE3)
+                        .setInterval(querySegmentSpec(Filtration.eternity()))
+                        .setGranularity(Granularities.ALL)
+                        .setVirtualColumns(expressionVirtualColumn(
+                            "v0",
+                            "array(\"l1\")",
+                            ColumnType.LONG_ARRAY
+                        ))
+                        .setDimensions(
+                            dimensions(
+                                new DefaultDimensionSpec("v0", "_d0", ColumnType.LONG_ARRAY)
+                            )
+                        )
+                        .setAggregatorSpecs(aggregators(new LongSumAggregatorFactory("a0", "cnt")))
+                        .setLimitSpec(new DefaultLimitSpec(
+                            ImmutableList.of(new OrderByColumnSpec(
+                                "a0",
+                                OrderByColumnSpec.Direction.DESCENDING,
+                                StringComparators.NUMERIC
+                            )),
+                            Integer.MAX_VALUE
+                        ))
+                        .setContext(QUERY_CONTEXT_NO_STRINGIFY_ARRAY)
+                        .build()
+        ),
+        ImmutableList.of(
+            new Object[]{ImmutableList.of(0L), 4L},
+            new Object[]{ImmutableList.of(325323L), 1L},
+            new Object[]{ImmutableList.of(7L), 1L}
+        )
+    );
+  }
+
+
+  @Test
+  public void testArrayGroupAsDoubleArray() throws Exception
+  {
+    // Cannot vectorize as we donot have support in native query subsytem for grouping on arrays as keys
+    cannotVectorize();
+    testQuery(
+        "SELECT ARRAY[d1], SUM(cnt) FROM druid.numfoo GROUP BY 1 ORDER BY 2 DESC",
+        QUERY_CONTEXT_NO_STRINGIFY_ARRAY,
+        ImmutableList.of(
+            GroupByQuery.builder()
+                        .setDataSource(CalciteTests.DATASOURCE3)
+                        .setInterval(querySegmentSpec(Filtration.eternity()))
+                        .setGranularity(Granularities.ALL)
+                        .setVirtualColumns(expressionVirtualColumn(
+                            "v0",
+                            "array(\"d1\")",
+                            ColumnType.DOUBLE_ARRAY
+                        ))
+                        .setDimensions(
+                            dimensions(
+                                new DefaultDimensionSpec("v0", "_d0", ColumnType.DOUBLE_ARRAY)
+                            )
+                        )
+                        .setAggregatorSpecs(aggregators(new LongSumAggregatorFactory("a0", "cnt")))
+                        .setLimitSpec(new DefaultLimitSpec(
+                            ImmutableList.of(new OrderByColumnSpec(
+                                "a0",
+                                OrderByColumnSpec.Direction.DESCENDING,
+                                StringComparators.NUMERIC
+                            )),
+                            Integer.MAX_VALUE
+                        ))
+                        .setContext(QUERY_CONTEXT_NO_STRINGIFY_ARRAY)
+                        .build()
+        ),
+        ImmutableList.of(
+            new Object[]{ImmutableList.of(0.0), 4L},
+            new Object[]{ImmutableList.of(1.0), 1L},
+            new Object[]{ImmutableList.of(1.7), 1L}
+        )
+    );
+  }
+
+  @Test
+  public void testArrayGroupAsFloatArray() throws Exception
+  {
+    // Cannot vectorize as we donot have support in native query subsytem for grouping on arrays as keys
+    cannotVectorize();
+    testQuery(
+        "SELECT ARRAY[f1], SUM(cnt) FROM druid.numfoo GROUP BY 1 ORDER BY 2 DESC",
+        QUERY_CONTEXT_NO_STRINGIFY_ARRAY,
+        ImmutableList.of(
+            GroupByQuery.builder()
+                        .setDataSource(CalciteTests.DATASOURCE3)
+                        .setInterval(querySegmentSpec(Filtration.eternity()))
+                        .setGranularity(Granularities.ALL)
+                        .setVirtualColumns(expressionVirtualColumn(
+                            "v0",
+                            "array(\"f1\")",
+                            ColumnType.DOUBLE_ARRAY
+                        ))
+                        .setDimensions(
+                            dimensions(
+                                new DefaultDimensionSpec("v0", "_d0", ColumnType.DOUBLE_ARRAY)
+                            )
+                        )
+                        .setAggregatorSpecs(aggregators(new LongSumAggregatorFactory("a0", "cnt")))
+                        .setLimitSpec(new DefaultLimitSpec(
+                            ImmutableList.of(new OrderByColumnSpec(
+                                "a0",
+                                OrderByColumnSpec.Direction.DESCENDING,
+                                StringComparators.NUMERIC
+                            )),
+                            Integer.MAX_VALUE
+                        ))
+                        .setContext(QUERY_CONTEXT_NO_STRINGIFY_ARRAY)
+                        .build()
+        ),
+        ImmutableList.of(
+            new Object[]{ImmutableList.of(0.0), 4L},
+            new Object[]{ImmutableList.of(0.10000000149011612), 1L},
+            new Object[]{ImmutableList.of(1.0), 1L}
+        )
+    );
+  }
+
+  @Test

Review comment:
       Please include tests for the `ARRAY` constructor with multiple arguments, like:
   
   - two multi-valued columns: `ARRAY[dim3, dim3]`
   - multi-valued plus regular column: `ARRAY[dim3, dim1]`

##########
File path: processing/src/main/java/org/apache/druid/segment/data/ComparableList.java
##########
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.data;
+
+import com.fasterxml.jackson.annotation.JsonValue;
+
+import java.util.List;
+
+
+public class ComparableList<T extends Comparable> implements Comparable<ComparableList>

Review comment:
       It'd be good to have unit tests specifically for this class. Especially for hashCode, equals, and compareTo.
   
   Also I suggest implementing toString, since it'll make unit tests & debugging easier.

##########
File path: sql/src/test/java/org/apache/druid/sql/calcite/CalciteArraysQueryTest.java
##########
@@ -143,54 +144,6 @@ public void testSelectNonConstantArrayExpressionFromTable() throws Exception
     );
   }
 
-  @Test
-  public void testSelectNonConstantArrayExpressionFromTableForMultival() throws Exception

Review comment:
       Why did you delete this test?

##########
File path: processing/src/test/java/org/apache/druid/query/groupby/GroupByQueryRunnerTest.java
##########
@@ -1306,6 +1308,321 @@ public void testMultiValueDimension()
     TestHelper.assertExpectedObjects(expectedResults, results, "multi-value-dim");
   }
 
+  @Test
+  public void testMultiValueDimensionAsArray()
+  {
+    if (config.getDefaultStrategy().equals(GroupByStrategySelector.STRATEGY_V1)) {
+      return;

Review comment:
       What happens if run with v1? If it's an error, better to use `expectedException` here.
   
   Same comment for the other tests too.

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/rel/DruidQuery.java
##########
@@ -403,7 +402,9 @@ private static Grouping computeGrouping(
       }
 
       final RelDataType dataType = rexNode.getType();
-      final ColumnType outputType = Calcites.getColumnTypeForRelDataType(dataType);
+      final ColumnType outputType;
+      outputType = Calcites.getColumnTypeForRelDataType(dataType);

Review comment:
       Needless change?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org