Posted to commits@druid.apache.org by "clintropolis (via GitHub)" <gi...@apache.org> on 2023/05/19 09:16:11 UTC

[GitHub] [druid] clintropolis commented on a diff in pull request #14312: Hll Sketch estimate can now be used as an expression

clintropolis commented on code in PR #14312:
URL: https://github.com/apache/druid/pull/14312#discussion_r1198737541


##########
extensions-core/datasketches/src/test/java/org/apache/druid/query/aggregation/datasketches/hll/sql/HllSketchSqlAggregatorTest.java:
##########
@@ -888,4 +902,187 @@ public void testGroupByAggregatorDefaultValuesFinalizeOuterSketches()
         ImmutableList.of(new Object[]{"a", 0L, "0"})
     );
   }
+
+  @Test
+  public void testHllEstimateAsVirtualColumn()
+  {
+    testQuery(
+        "SELECT"
+        + " HLL_SKETCH_ESTIMATE(hllsketch_dim1)"
+        + " FROM druid.foo",
+        ImmutableList.of(
+            newScanQueryBuilder()
+                .dataSource(CalciteTests.DATASOURCE1)
+                .intervals(querySegmentSpec(Filtration.eternity()))
+                .virtualColumns(new ExpressionVirtualColumn(
+                    "v0",
+                    "hll_sketch_estimate(\"hllsketch_dim1\")",
+                    ColumnType.DOUBLE,
+                    MACRO_TABLE
+                ))
+                .resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST)
+                .columns("v0")
+                .context(QUERY_CONTEXT_DEFAULT)
+                .build()
+        ),
+        ImmutableList.of(
+            new Object[]{0.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D}
+        )
+    );
+  }
+
+  @Test
+  public void testHllEstimateAsVirtualColumnWithRound()
+  {
+    testQuery(
+        "SELECT"
+        + " HLL_SKETCH_ESTIMATE(hllsketch_dim1, TRUE)"
+        + " FROM druid.foo",
+        ImmutableList.of(
+            newScanQueryBuilder()
+                .dataSource(CalciteTests.DATASOURCE1)
+                .intervals(querySegmentSpec(Filtration.eternity()))
+                .virtualColumns(new ExpressionVirtualColumn(
+                    "v0",
+                    "hll_sketch_estimate(\"hllsketch_dim1\",1)",
+                    ColumnType.DOUBLE,
+                    MACRO_TABLE
+                ))
+                .resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST)
+                .columns("v0")
+                .context(QUERY_CONTEXT_DEFAULT)
+                .build()
+        ),
+        ImmutableList.of(
+            new Object[]{0.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D},
+            new Object[]{1.0D}
+        )
+    );
+  }
+
+  @Test
+  public void testHllEstimateAsVirtualColumnOnNonHllCol()
+  {
+    try {
+      testQuery(
+          "SELECT"
+          + " HLL_SKETCH_ESTIMATE(dim2)"
+          + " FROM druid.foo",
+          ImmutableList.of(
+              newScanQueryBuilder()
+                  .dataSource(CalciteTests.DATASOURCE1)
+                  .intervals(querySegmentSpec(Filtration.eternity()))
+                  .virtualColumns(new ExpressionVirtualColumn(
+                      "v0",
+                      "hll_sketch_estimate(\"dim2\")",
+                      ColumnType.DOUBLE,
+                      MACRO_TABLE
+                  ))
+                  .resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST)
+                  .columns("v0")
+                  .context(QUERY_CONTEXT_DEFAULT)
+                  .build()
+          ),
+          ImmutableList.of()
+      );
+    }
+    catch (ExpressionValidationException e) {
+      Assert.assertTrue(
+          e.getMessage().contains("Function[HLL_SKETCH_ESTIMATE] requires a HllSketch as the argument")
+      );
+    }
+  }
+
+  @Test
+  public void testHllEstimateAsVirtualColumnWithGroupByOrderBy()
+  {
+    skipVectorize();
+    cannotVectorize();

Review Comment:
   You have to [implement](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/math/expr/Expr.java#L183) a [vector processor](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/math/expr/vector/ExprVectorProcessor.java#L29) for the expression to be vectorizable; until then, `cannotVectorize()` is the right expectation for this test.
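
   As a rough illustration of the idea (not Druid's actual API): an expression is evaluated row by row unless it opts in to vectorization and supplies a batch processor. The interfaces and class names below (`RowExpr`, `VectorProcessor`, `RoundExpr`, `RowOnlyExpr`) are hypothetical stand-ins invented for this sketch; the real contracts are the `Expr` and `ExprVectorProcessor` types linked above.

```java
import java.util.Arrays;

public class VectorizeSketch {
    /** Hypothetical row-at-a-time expression; NOT Druid's Expr interface. */
    public interface RowExpr {
        double eval(double input);

        // Vectorization is opt-in: the default answer is "no".
        default boolean canVectorize() {
            return false;
        }

        default VectorProcessor asVectorProcessor() {
            throw new UnsupportedOperationException("not vectorizable");
        }
    }

    /** Hypothetical batch processor; NOT Druid's ExprVectorProcessor. */
    public interface VectorProcessor {
        double[] evalVector(double[] inputs, int size);
    }

    /** Opts in by overriding both methods. */
    public static final class RoundExpr implements RowExpr {
        @Override
        public double eval(double input) {
            return Math.round(input);
        }

        @Override
        public boolean canVectorize() {
            return true;
        }

        @Override
        public VectorProcessor asVectorProcessor() {
            // Processes the whole batch in one call.
            return (inputs, size) -> {
                double[] out = new double[size];
                for (int i = 0; i < size; i++) {
                    out[i] = Math.round(inputs[i]);
                }
                return out;
            };
        }
    }

    /** Implements only the row path, so it stays non-vectorizable. */
    public static final class RowOnlyExpr implements RowExpr {
        @Override
        public double eval(double input) {
            return input + 1;
        }
    }

    /** Uses the batch path when available, otherwise falls back to rows. */
    public static double[] evaluate(RowExpr expr, double[] inputs) {
        if (expr.canVectorize()) {
            return expr.asVectorProcessor().evalVector(inputs, inputs.length);
        }
        double[] out = new double[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            out[i] = expr.eval(inputs[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] in = {0.5, 1.5};
        System.out.println(Arrays.toString(evaluate(new RoundExpr(), in)));
        System.out.println(Arrays.toString(evaluate(new RowOnlyExpr(), in)));
    }
}
```

   In this sketch `RowOnlyExpr` plays the role of the new expression in its current state: it produces correct results, but only through the per-row fallback.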
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

