You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/09/12 19:19:08 UTC

[GitHub] [lucene] shaie commented on a diff in pull request #11764: GITHUB#11742: MatchingFacetSetsCounts#getTopChildren now returns top children instead of all children

shaie commented on code in PR #11764:
URL: https://github.com/apache/lucene/pull/11764#discussion_r968832961


##########
lucene/facet/src/java/org/apache/lucene/facet/facetset/MatchingFacetSetsCounts.java:
##########
@@ -156,7 +157,43 @@ public FacetResult getAllChildren(String dim, String... path) throws IOException
   @Override
   public FacetResult getTopChildren(int topN, String dim, String... path) throws IOException {
     validateTopN(topN);
-    return getAllChildren(dim, path);
+
+    topN = Math.min(topN, counts.length);
+
+    PriorityQueue<Entry> pq =
+        new PriorityQueue<>(topN) {

Review Comment:
   Would you like to use the variant which takes a supplier of sentinel objects? It will allow you to avoid the `null` checks below and in general improve the performance of the algorithm below (unless we don't expect `topN` to be much smaller than `counts`. The usage pattern then becomes comparing the current count+label to the `top()` and if it's better you update the `top()` and call `updateTop()`.



##########
lucene/facet/src/test/org/apache/lucene/facet/facetset/TestExactFacetSetMatcher.java:
##########
@@ -46,6 +49,105 @@ public class TestExactFacetSetMatcher extends FacetTestCase {
   private static final int[] MANUFACTURER_ORDS = {FORD_ORD, TOYOTA_ORD, CHEVY_ORD, NISSAN_ORD};
   private static final int[] YEARS = {2010, 2011, 2012};
 
+  public void testTopChildren() throws Exception {

Review Comment:
   Since the change is only in `MatchingFacetSetsCounts` and has nothing to do with the matchers, I think the test should be in `TestMatchingFacetSetsCounts` and remove the duplication in the "Range" test class. WDYT?



##########
lucene/facet/src/java/org/apache/lucene/facet/facetset/MatchingFacetSetsCounts.java:
##########
@@ -156,7 +157,43 @@ public FacetResult getAllChildren(String dim, String... path) throws IOException
   @Override
   public FacetResult getTopChildren(int topN, String dim, String... path) throws IOException {
     validateTopN(topN);
-    return getAllChildren(dim, path);
+
+    topN = Math.min(topN, counts.length);
+
+    PriorityQueue<Entry> pq =
+        new PriorityQueue<>(topN) {
+          @Override
+          protected boolean lessThan(Entry a, Entry b) {
+            int cmp = Integer.compare(a.count, b.count);
+            if (cmp == 0) {
+              cmp = b.label.compareTo(a.label);
+            }
+            return cmp < 0;
+          }
+        };
+
+    int childCount = 0;
+    Entry reuse = null;
+    for (int i = 0; i < counts.length; i++) {
+      int count = counts[i];
+      if (count > 0) {
+        childCount++;
+        if (reuse == null) {
+          reuse = new Entry();
+        }
+        reuse.label = facetSetMatchers[i].label;
+        reuse.count = count;
+        reuse = pq.insertWithOverflow(reuse);
+      }
+    }
+
+    LabelAndValue[] labelValues = new LabelAndValue[topN];
+    for (int i = topN - 1; i >= 0; i--) {
+      Entry e = pq.pop();

Review Comment:
   If you accept my proposal above, you will need to break the loop when hitting a sentinel value



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org