You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/06/19 03:08:09 UTC

[GitHub] [druid] maytasm commented on a change in pull request #10053: fix topn on string columns with non-sorted or non-unique dictionaries

maytasm commented on a change in pull request #10053:
URL: https://github.com/apache/druid/pull/10053#discussion_r442605486



##########
File path: processing/src/main/java/org/apache/druid/query/topn/TopNQueryEngine.java
##########
@@ -126,14 +126,20 @@ private TopNMapFn getMapFn(
         // Once we have arbitrary dimension types following check should be replaced by checking
         // that the column is of type long and single-value.
         dimension.equals(ColumnHolder.TIME_COLUMN_NAME)
-        ) {
+    ) {
       // A special TimeExtractionTopNAlgorithm is required, since DimExtractionTopNAlgorithm
       // currently relies on the dimension cardinality to support lexicographic sorting
       topNAlgorithm = new TimeExtractionTopNAlgorithm(adapter, query);
     } else if (selector.isHasExtractionFn()) {
       topNAlgorithm = new HeapBasedTopNAlgorithm(adapter, query);
-    } else if (columnCapabilities == null || !(columnCapabilities.getType() == ValueType.STRING
-                                               && columnCapabilities.isDictionaryEncoded())) {
+    } else if (
+        columnCapabilities == null ||
+        !(columnCapabilities.getType() == ValueType.STRING &&
+          columnCapabilities.isDictionaryEncoded() &&
+          columnCapabilities.areDictionaryValuesSorted().isTrue() &&
+          columnCapabilities.areDictionaryValuesUnique().isTrue()
+        )
+    ) {
       // Use HeapBasedTopNAlgorithm for non-Strings and for non-dictionary-encoded Strings, and for things we don't know

Review comment:
       Can you update the comments and also mention why we can/want to use HeapBasedTopNAlgorithm for each of those conditions (not sorted, not unique, etc)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org