You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/06/19 18:34:31 UTC

[GitHub] [druid] gianm commented on a change in pull request #10053: fix topn on string columns with non-sorted or non-unique dictionaries

gianm commented on a change in pull request #10053:
URL: https://github.com/apache/druid/pull/10053#discussion_r442988947



##########
File path: processing/src/main/java/org/apache/druid/query/topn/TopNQueryEngine.java
##########
@@ -158,6 +156,40 @@ private TopNMapFn getMapFn(
     return new TopNMapFn(query, topNAlgorithm);
   }
 
+  /**
+   * {@link PooledTopNAlgorithm} (and {@link AggregateTopNMetricFirstAlgorithm} which utilizes the pooled
+   * algorithm) are optimized off-heap algorithms for aggregating dictionary encoded string columns. These algorithms
+   * rely on dictionary ids being unique so to aggregate on the dictionary ids directly and defer
+   * {@link org.apache.druid.segment.DimensionSelector#lookupName(int)} until as late as possible in query processing.
+   *
+   * When these conditions are not true, we have an on-heap fall-back algorithm, the {@link HeapBasedTopNAlgorithm}
+   * (and {@link TimeExtractionTopNAlgorithm} for a specialized form for long columns) which aggregates on values of
+   * selectors.
+   */
+  private static boolean requiresHeapAlgorithm(

Review comment:
       I think this method would be easier to read if it was called `canUsePooledAlgorithm` and the checks were all flipped. This is because IMO it makes sense to view the heap algorithm as the base case, and the pooled algorithm as a special case, meaning the logic should be "can we do the special case".




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org