You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "clintropolis (via GitHub)" <gi...@apache.org> on 2023/10/18 02:27:44 UTC

[PR] fix incorrect unnest dimension cursor value matcher implementation (druid)

clintropolis opened a new pull request, #15192:
URL: https://github.com/apache/druid/pull/15192

   ### Description
   Fixes an issue from #15058 with value matchers created by `UnnestDimensionCursor` for matches against non-existent values, which I forgot to update to match only null rows whenever `includeUnknown` is set to true. I checked around and didn't notice any other implementations with this issue, the problem was that it was treating the case as a non-existent column instead of an always false matcher that should only match null rows when `includeUnknown` is true, and so instead would effectively match nothing when used with a not filter. Added tests cover the case.
   This PR has:
   
   - [x] been self-reviewed.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] fix incorrect unnest dimension cursor value matcher implementation (druid)

Posted by "soumyava (via GitHub)" <gi...@apache.org>.
soumyava commented on code in PR #15192:
URL: https://github.com/apache/druid/pull/15192#discussion_r1363058075


##########
processing/src/test/java/org/apache/druid/query/groupby/UnnestGroupByQueryRunnerTest.java:
##########
@@ -704,6 +706,229 @@ private Iterable<ResultRow> runQuery(final GroupByQuery query, final Incremental
     return GroupByQueryRunnerTestHelper.runQuery(factory, queryRunner, query);
   }
 
+  @Test
+  public void testGroupByOnUnnestedFilterMatch()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "a", null)
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    // Total rows should add up to 26 * 2 = 52

Review Comment:
   nit. this comment is not relevant here



##########
processing/src/test/java/org/apache/druid/query/groupby/UnnestGroupByQueryRunnerTest.java:
##########
@@ -704,6 +706,229 @@ private Iterable<ResultRow> runQuery(final GroupByQuery query, final Incremental
     return GroupByQueryRunnerTestHelper.runQuery(factory, queryRunner, query);
   }
 
+  @Test
+  public void testGroupByOnUnnestedFilterMatch()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "a", null)
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    // Total rows should add up to 26 * 2 = 52
+    // 26 rows and each has 2 entries in the column to be unnested
+    List<ResultRow> expectedResults = Collections.singletonList(
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "a",
+            "rows", 2L
+        )
+    );
+
+    Iterable<ResultRow> results = runQuery(query, TestIndex.getIncrementalTestIndex());
+    TestHelper.assertExpectedObjects(expectedResults, results, "groupBy-on-unnested-virtual-column");
+  }
+
+  @Test
+  public void testGroupByOnUnnestedNotFilterMatch()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            NotDimFilter.of(new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "a", null))
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    List<ResultRow> expectedResults = Arrays.asList(
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "b",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "e",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "h",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "m",
+            "rows", 6L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "n",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "p",
+            "rows", 6L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "preferred",
+            "rows", 26L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "t",
+            "rows", 4L
+        )
+    );
+
+    Iterable<ResultRow> results = runQuery(query, TestIndex.getIncrementalTestIndex());
+    TestHelper.assertExpectedObjects(expectedResults, results, "groupBy-on-unnested-virtual-column");
+  }
+
+  @Test
+  public void testGroupByOnUnnestedNotFilterMatchNonexistentValue()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            NotDimFilter.of(new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "noexist", null))
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    // Total rows should add up to 26 * 2 = 52

Review Comment:
   nit. there are 52 values as filter is on non existent value



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] fix incorrect unnest dimension cursor value matcher implementation (druid)

Posted by "clintropolis (via GitHub)" <gi...@apache.org>.
clintropolis merged PR #15192:
URL: https://github.com/apache/druid/pull/15192


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


Re: [PR] fix incorrect unnest dimension cursor value matcher implementation (druid)

Posted by "clintropolis (via GitHub)" <gi...@apache.org>.
clintropolis commented on code in PR #15192:
URL: https://github.com/apache/druid/pull/15192#discussion_r1363111685


##########
processing/src/test/java/org/apache/druid/query/groupby/UnnestGroupByQueryRunnerTest.java:
##########
@@ -704,6 +706,229 @@ private Iterable<ResultRow> runQuery(final GroupByQuery query, final Incremental
     return GroupByQueryRunnerTestHelper.runQuery(factory, queryRunner, query);
   }
 
+  @Test
+  public void testGroupByOnUnnestedFilterMatch()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "a", null)
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    // Total rows should add up to 26 * 2 = 52
+    // 26 rows and each has 2 entries in the column to be unnested
+    List<ResultRow> expectedResults = Collections.singletonList(
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "a",
+            "rows", 2L
+        )
+    );
+
+    Iterable<ResultRow> results = runQuery(query, TestIndex.getIncrementalTestIndex());
+    TestHelper.assertExpectedObjects(expectedResults, results, "groupBy-on-unnested-virtual-column");
+  }
+
+  @Test
+  public void testGroupByOnUnnestedNotFilterMatch()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            NotDimFilter.of(new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "a", null))
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    List<ResultRow> expectedResults = Arrays.asList(
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "b",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "e",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "h",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "m",
+            "rows", 6L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "n",
+            "rows", 2L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "p",
+            "rows", 6L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "preferred",
+            "rows", 26L
+        ),
+        makeRow(
+            query,
+            "2011-04-01",
+            "alias0", "t",
+            "rows", 4L
+        )
+    );
+
+    Iterable<ResultRow> results = runQuery(query, TestIndex.getIncrementalTestIndex());
+    TestHelper.assertExpectedObjects(expectedResults, results, "groupBy-on-unnested-virtual-column");
+  }
+
+  @Test
+  public void testGroupByOnUnnestedNotFilterMatchNonexistentValue()
+  {
+    cannotVectorize();
+
+    final DataSource unnestDataSource = UnnestDataSource.create(
+        new TableDataSource(QueryRunnerTestHelper.DATA_SOURCE),
+        new ExpressionVirtualColumn(
+            QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST,
+            "\"" + QueryRunnerTestHelper.PLACEMENTISH_DIMENSION + "\"",
+            null,
+            ExprMacroTable.nil()
+        ),
+        null
+    );
+
+    GroupByQuery query = makeQueryBuilder()
+        .setDataSource(unnestDataSource)
+        .setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD)
+        .setDimensions(
+            new DefaultDimensionSpec(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, "alias0")
+        )
+        .setDimFilter(
+            NotDimFilter.of(new EqualityFilter(QueryRunnerTestHelper.PLACEMENTISH_DIMENSION_UNNEST, ColumnType.STRING, "noexist", null))
+        )
+        .setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT)
+        .setGranularity(QueryRunnerTestHelper.ALL_GRAN)
+        .addOrderByColumn("alias0", OrderByColumnSpec.Direction.ASCENDING)
+        .build();
+
+    // Total rows should add up to 26 * 2 = 52

Review Comment:
   oops, i copied this whole test from another test and just added filters, will just remove



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org