You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "gianm (via GitHub)" <gi...@apache.org> on 2024/04/09 18:05:03 UTC

[PR] SQL: Support GROUP BY and ORDER BY for NULL types. (druid)

gianm opened a new pull request, #16252:
URL: https://github.com/apache/druid/pull/16252

   Treats SQL NULL types as strings at the native layer. This is consistent with how we treat unknown-type nulls in other contexts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] SQL: Support GROUP BY and ORDER BY for NULL types. (druid)

Posted by "gianm (via GitHub)" <gi...@apache.org>.
gianm commented on code in PR #16252:
URL: https://github.com/apache/druid/pull/16252#discussion_r1600226518


##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/Calcites.java:
##########
@@ -244,12 +244,22 @@ public static boolean isLongType(SqlTypeName sqlTypeName)
   }
 
   /**
-   * Returns the natural StringComparator associated with the RelDataType
+   * Returns the natural StringComparator associated with the RelDataType, or null if the type is not convertible to
+   * {@link ColumnType} by {@link #getColumnTypeForRelDataType(RelDataType)}.
    */
+  @Nullable
   public static StringComparator getStringComparatorForRelDataType(RelDataType dataType)
   {
-    final ColumnType valueType = getColumnTypeForRelDataType(dataType);
-    return getStringComparatorForValueType(valueType);
+    if (dataType.getSqlTypeName() == SqlTypeName.NULL) {
+      return StringComparators.NATURAL;
+    } else {
+      final ColumnType valueType = getColumnTypeForRelDataType(dataType);

Review Comment:
   `ColumnType` is meant to represent types that the native query engine can store and process. We don't _really_ have a pure-`NULL` type there, so adding one may have larger consequences that I did not want to run into in this patch. I thought it would be better to keep the changes to the SQL layer.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] SQL: Support GROUP BY and ORDER BY for NULL types. (druid)

Posted by "cryptoe (via GitHub)" <gi...@apache.org>.
cryptoe commented on code in PR #16252:
URL: https://github.com/apache/druid/pull/16252#discussion_r1580496658


##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/Calcites.java:
##########
@@ -244,12 +244,22 @@ public static boolean isLongType(SqlTypeName sqlTypeName)
   }
 
   /**
-   * Returns the natural StringComparator associated with the RelDataType
+   * Returns the natural StringComparator associated with the RelDataType, or null if the type is not convertible to
+   * {@link ColumnType} by {@link #getColumnTypeForRelDataType(RelDataType)}.
    */
+  @Nullable
   public static StringComparator getStringComparatorForRelDataType(RelDataType dataType)
   {
-    final ColumnType valueType = getColumnTypeForRelDataType(dataType);
-    return getStringComparatorForValueType(valueType);
+    if (dataType.getSqlTypeName() == SqlTypeName.NULL) {
+      return StringComparators.NATURAL;
+    } else {
+      final ColumnType valueType = getColumnTypeForRelDataType(dataType);

Review Comment:
   Should we add a new columnType called null just like the relDataType : SqlTypeName.null
   In that case, we can push this logic inside getStringComparatorsForValueType()  and remove this custom handling ?
   
   thoughts ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] SQL: Support GROUP BY and ORDER BY for NULL types. (druid)

Posted by "gianm (via GitHub)" <gi...@apache.org>.
gianm commented on PR #16252:
URL: https://github.com/apache/druid/pull/16252#issuecomment-2045859287

   @kgyrtkirk FYI I encountered an error here in decoupled planning with one of the new tests, and disabled it: https://github.com/apache/druid/pull/16252/files#diff-8d40b0c4f1c04fc1a87f672f9fb48a91568c7bd4a4eb5ceda1c9cb008a22653fR70-R76


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] SQL: Support GROUP BY and ORDER BY for NULL types. (druid)

Posted by "kgyrtkirk (via GitHub)" <gi...@apache.org>.
kgyrtkirk commented on code in PR #16252:
URL: https://github.com/apache/druid/pull/16252#discussion_r1558998133


##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java:
##########
@@ -6108,6 +6108,102 @@ public void testCountStarWithTimeInIntervalFilterLosAngeles()
     );
   }
 
+  @Test
+  public void testGroupByNullType()
+  {
+    // Cannot vectorize due to null constant expression.
+    cannotVectorize();
+    testQuery(
+        "SELECT NULL as nullcol, COUNT(*) FROM druid.foo GROUP BY 1",
+        ImmutableList.of(
+            new GroupByQuery.Builder()
+                .setDataSource(CalciteTests.DATASOURCE1)
+                .setInterval(querySegmentSpec(Filtration.eternity()))
+                .setGranularity(Granularities.ALL)
+                .setVirtualColumns(expressionVirtualColumn("v0", "null", ColumnType.STRING))
+                .setDimensions(dimensions(new DefaultDimensionSpec("v0", "d0", ColumnType.STRING)))
+                .setAggregatorSpecs(aggregators(new CountAggregatorFactory("a0")))
+                .setContext(QUERY_CONTEXT_DEFAULT)
+                .build()
+        ),
+        ImmutableList.of(
+            new Object[]{null, 6L}
+        )
+    );
+  }
+
+  @Test
+  public void testOrderByNullType()

Review Comment:
   instead of overriding+disabling the testcase; it can be marked as unsupported with:
   ```suggestion
     @NotYetSupported(Modes.NOT_ENOUGH_RULES)
     @Test
     public void testOrderByNullType()
   ```
   this has the benefit that it will be verified that it still fails and could also be located based on the type of failure it have



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] SQL: Support GROUP BY and ORDER BY for NULL types. (druid)

Posted by "gianm (via GitHub)" <gi...@apache.org>.
gianm commented on code in PR #16252:
URL: https://github.com/apache/druid/pull/16252#discussion_r1561804344


##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java:
##########
@@ -6108,6 +6108,102 @@ public void testCountStarWithTimeInIntervalFilterLosAngeles()
     );
   }
 
+  @Test
+  public void testGroupByNullType()
+  {
+    // Cannot vectorize due to null constant expression.
+    cannotVectorize();
+    testQuery(
+        "SELECT NULL as nullcol, COUNT(*) FROM druid.foo GROUP BY 1",
+        ImmutableList.of(
+            new GroupByQuery.Builder()
+                .setDataSource(CalciteTests.DATASOURCE1)
+                .setInterval(querySegmentSpec(Filtration.eternity()))
+                .setGranularity(Granularities.ALL)
+                .setVirtualColumns(expressionVirtualColumn("v0", "null", ColumnType.STRING))
+                .setDimensions(dimensions(new DefaultDimensionSpec("v0", "d0", ColumnType.STRING)))
+                .setAggregatorSpecs(aggregators(new CountAggregatorFactory("a0")))
+                .setContext(QUERY_CONTEXT_DEFAULT)
+                .build()
+        ),
+        ImmutableList.of(
+            new Object[]{null, 6L}
+        )
+    );
+  }
+
+  @Test
+  public void testOrderByNullType()

Review Comment:
   OK, thanks, cool idea!
   
   It is supported for the basic planner, so I kept the override in `DecoupledPlanningCalciteQueryTest` and changed the annotation from `@Disabled` to `@NotYetSupported`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org