You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Claude Warren (Jira)" <ji...@apache.org> on 2022/06/15 08:32:00 UTC

[jira] [Updated] (COLLECTIONS-823) BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap

     [ https://issues.apache.org/jira/browse/COLLECTIONS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claude Warren updated COLLECTIONS-823:
--------------------------------------
    Description: 
 
 
 Member 
h3. !https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24!  *[aherbert|https://github.com/aherbert]*  [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r812499923]
This converts all the non zero indices to a bitmap long[] array. But to do so requires using the {{forEachIndex}} method with a conditional boolean check on each loop iteration. I wonder if this should be brought inline for efficiency:

{noformat}

     @Override
    public boolean forEachBitMap(LongPredicate consumer) {
        Objects.requireNonNull(consumer, "consumer");
        long[] result = new long[BitMap.numberOfBitMaps(shape.getNumberOfBits())];
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] != 0) {
                // Avoids a second check on the predicate result in forEachIndex
                BitMap.set(result, i);
            }
        }
        return BitMapProducer.fromBitMapArray(result).forEachBitMap(consumer);
    }

{noformat}


Or better yet, avoid the {{long[]}} array:


{noformat}
        int blocksm1 = BitMap.numberOfBitMaps(shape.getNumberOfBits()) - 1;
        int i = 0;
        long value;
        for (int j = 0; j < blocksm1; j++) {
            value = 0;
            for (int k = 0; k < Long.SIZE; k++) {
                if (counts[i++] != 0) {
                    value |= BitMap.getLongBit(k);
                }
            }
            if (!consumer.test(value)) {
                return false;
            }
        }
        // Final block
        value = 0;
        for (int k = 0; i < counts.length; k++) {
            if (counts[i++] != 0) {
                value |= BitMap.getLongBit(k);
            }
        }
        return consumer.test(value);
{noformat}


  was:
 
 
 Member 
h3. !https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24!  *[aherbert|https://github.com/aherbert]*  [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r812499923]
This converts all the non zero indices to a bitmap long[] array. But to do so requires using the {{forEachIndex}} method with a conditional boolean check on each loop iteration. I wonder if this should be brought inline for efficiency:

     @Override
    public boolean forEachBitMap(LongPredicate consumer) {
        Objects.requireNonNull(consumer, "consumer");
        long[] result = new long[BitMap.numberOfBitMaps(shape.getNumberOfBits())];
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] != 0) {
                // Avoids a second check on the predicate result in forEachIndex
                BitMap.set(result, i);
            }
        }
        return BitMapProducer.fromBitMapArray(result).forEachBitMap(consumer);
    }


Or better yet, avoid the {{long[]}} array:

        int blocksm1 = BitMap.numberOfBitMaps(shape.getNumberOfBits()) - 1;
        int i = 0;
        long value;
        for (int j = 0; j < blocksm1; j++) {
            value = 0;
            for (int k = 0; k < Long.SIZE; k++) {
                if (counts[i++] != 0) {
                    value |= BitMap.getLongBit(k);
                }
            }
            if (!consumer.test(value)) {
                return false;
            }
        }
        // Final block
        value = 0;
        for (int k = 0; i < counts.length; k++) {
            if (counts[i++] != 0) {
                value |= BitMap.getLongBit(k);
            }
        }
        return consumer.test(value);


> BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap
> ------------------------------------------------------------
>
>                 Key: COLLECTIONS-823
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-823
>             Project: Commons Collections
>          Issue Type: Improvement
>          Components: Collection
>    Affects Versions: 4.5
>            Reporter: Claude Warren
>            Priority: Minor
>              Labels: bloom-filter
>
>  
>  
>  Member 
> h3. !https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24!  *[aherbert|https://github.com/aherbert]*  [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r812499923]
> This converts all the non zero indices to a bitmap long[] array. But to do so requires using the {{forEachIndex}} method with a conditional boolean check on each loop iteration. I wonder if this should be brought inline for efficiency:
> {noformat}
>      @Override
>     public boolean forEachBitMap(LongPredicate consumer) {
>         Objects.requireNonNull(consumer, "consumer");
>         long[] result = new long[BitMap.numberOfBitMaps(shape.getNumberOfBits())];
>         for (int i = 0; i < counts.length; i++) {
>             if (counts[i] != 0) {
>                 // Avoids a second check on the predicate result in forEachIndex
>                 BitMap.set(result, i);
>             }
>         }
>         return BitMapProducer.fromBitMapArray(result).forEachBitMap(consumer);
>     }
> {noformat}
> Or better yet, avoid the {{long[]}} array:
> {noformat}
>         int blocksm1 = BitMap.numberOfBitMaps(shape.getNumberOfBits()) - 1;
>         int i = 0;
>         long value;
>         for (int j = 0; j < blocksm1; j++) {
>             value = 0;
>             for (int k = 0; k < Long.SIZE; k++) {
>                 if (counts[i++] != 0) {
>                     value |= BitMap.getLongBit(k);
>                 }
>             }
>             if (!consumer.test(value)) {
>                 return false;
>             }
>         }
>         // Final block
>         value = 0;
>         for (int k = 0; i < counts.length; k++) {
>             if (counts[i++] != 0) {
>                 value |= BitMap.getLongBit(k);
>             }
>         }
>         return consumer.test(value);
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)