Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/02/24 18:06:43 UTC

[GitHub] [lucene] rmuir commented on a change in pull request #710: LUCENE-10311: Make FixedBitSet#approximateCardinality faster (and actually approximate).

rmuir commented on a change in pull request #710:
URL: https://github.com/apache/lucene/pull/710#discussion_r814137771



##########
File path: lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java
##########
@@ -176,6 +176,30 @@ public int cardinality() {
     return (int) BitUtil.pop_array(bits, 0, numWords);
   }
 
+  @Override
+  public int approximateCardinality() {
+    // Naive sampling: compute the number of bits that are set on the first 16 longs every 1024
+    // longs and scale the result by 1024/16.
+    // This computes the pop count on ranges instead of single longs in order to take advantage of
+    // vectorization.
+
+    final int rangeLength = 16;
+    final int interval = 1024;
+
+    if (numWords < interval) {
+      return cardinality();
+    }
+
+    long popCount = 0;
+    int maxWord;
+    for (maxWord = 0; maxWord + interval < numWords; maxWord += interval) {
+      popCount += BitUtil.pop_array(bits, maxWord, rangeLength);

Review comment:
   This isn't really a review comment on this change, just a side note: I would be in favor of removing these `BitUtil` methods, as I think they are outdated and provide no value. I think it would be easier on our eyes to just see loops with `Long.bitCount`.

   The other constants/methods in the `BitUtil` class actually provide value. But let's not wrap what the JDK already provides efficiently for no reason.
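
   To make the suggestion concrete, here is a minimal sketch of what "loops with `Long.bitCount`" could look like for the sampling described in the diff comment. It is not the code from the PR: the class name `BitCountSketch`, both method names, and the final scaling step (`sampledBits * numWords / sampledWords`) are assumptions made for illustration. Only `rangeLength = 16`, `interval = 1024`, and the fallback to an exact count for small bit sets are taken from the hunk above.

```java
// Illustrative sketch only; names and the scaling step are assumptions, not Lucene code.
final class BitCountSketch {

  /** Counts the set bits in bits[offset, offset + length) with a plain loop. */
  static long popCount(long[] bits, int offset, int length) {
    long count = 0;
    for (int i = offset, end = offset + length; i < end; i++) {
      count += Long.bitCount(bits[i]); // JDK method, typically compiled to a popcount instruction
    }
    return count;
  }

  /** Estimates the cardinality by counting 16 longs out of every 1024 and scaling up. */
  static long sampledCardinality(long[] bits, int numWords) {
    final int rangeLength = 16;
    final int interval = 1024;

    // Small bit sets: an exact count is cheap enough.
    if (numWords < interval) {
      return popCount(bits, 0, numWords);
    }

    long sampledBits = 0;
    long sampledWords = 0;
    // Sample a contiguous range at the start of each interval so the inner loop
    // runs over consecutive longs.
    for (int start = 0; start + rangeLength <= numWords; start += interval) {
      sampledBits += popCount(bits, start, rangeLength);
      sampledWords += rangeLength;
    }

    // Assumed scaling: extrapolate the sampled density to all words.
    return sampledBits * numWords / sampledWords;
  }
}
```

   The explicit loop keeps the counting logic visible at the call site, which is the readability point made above; whether it vectorizes as well as the helper it replaces would need to be measured.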




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


