You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "clintropolis (via GitHub)" <gi...@apache.org> on 2023/03/27 06:06:05 UTC

[GitHub] [druid] clintropolis commented on a diff in pull request #13977: smarter nested column index utilization

clintropolis commented on code in PR #13977:
URL: https://github.com/apache/druid/pull/13977#discussion_r1148822590


##########
benchmarks/src/test/java/org/apache/druid/benchmark/query/SqlNestedDataBenchmark.java:
##########
@@ -173,7 +173,31 @@ public String getFormatString()
       "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 1005.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0",
       // 28, 29
       "SELECT SUM(long1) FROM foo WHERE double3 < 2000.0 AND double3 > 1000.0",
-      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 2000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0"
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 2000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0",
+      // 30, 31
+      "SELECT SUM(long1) FROM foo WHERE double3 < 3000.0 AND double3 > 1000.0",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 3000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0",
+      // 32,33
+      "SELECT SUM(long1) FROM foo WHERE double3 < 5000.0 AND double3 > 1000.0",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 5000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0",
+      // 34,35 smaller cardinality like range filter
+      "SELECT SUM(long1) FROM foo WHERE string1 LIKE '1%'",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string1') LIKE '1%'",
+      // 36,37 smaller cardinality like predicate filter
+      "SELECT SUM(long1) FROM foo WHERE string1 LIKE '%1%'",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string1') LIKE '%1%'",
+      // 38-39 moderate cardinality like range
+      "SELECT SUM(long1) FROM foo WHERE string5 LIKE '1%'",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string5') LIKE '1%'",
+      // 40, 41 big cardinality lex range
+      "SELECT SUM(long1) FROM foo WHERE string5 > '1'",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string5') > '1'",
+      // 42, 43 big cardinality like predicate filter
+      "SELECT SUM(long1) FROM foo WHERE string5 LIKE '%1%'",
+      "SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string5') LIKE '%1%'",

Review Comment:
   `InDimFilter` typically only uses `Utf8ValueSetIndex` and `StringValueSetIndex` which cannot be impacted by the changes currently in this PR, _unless_ it is using an `ExtractionFn`. I need to figure out how to get something that would plan to something with an extractionFn in here to see how this PR would impact this filter.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org