You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/07/05 12:37:00 UTC
[jira] [Work logged] (COLLECTIONS-823) BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap
[ https://issues.apache.org/jira/browse/COLLECTIONS-823?focusedWorklogId=787825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-787825 ]
ASF GitHub Bot logged work on COLLECTIONS-823:
----------------------------------------------
Author: ASF GitHub Bot
Created on: 05/Jul/22 12:36
Start Date: 05/Jul/22 12:36
Worklog Time Spent: 10m
Work Description: aherbert merged PR #316:
URL: https://github.com/apache/commons-collections/pull/316
Issue Time Tracking
-------------------
Worklog Id: (was: 787825)
Remaining Estimate: 0h
Time Spent: 10m
> BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap
> ------------------------------------------------------------
>
> Key: COLLECTIONS-823
> URL: https://issues.apache.org/jira/browse/COLLECTIONS-823
> Project: Commons Collections
> Issue Type: Improvement
> Components: Collection
> Affects Versions: 4.5
> Reporter: Claude Warren
> Assignee: Claude Warren
> Priority: Minor
> Labels: bloom-filter
> Time Spent: 10m
> Remaining Estimate: 0h
>
>
>
> Member
> h3. !https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24! *[aherbert|https://github.com/aherbert]* [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r812499923]
> This converts all the non zero indices to a bitmap long[] array. But to do so requires using the {{forEachIndex}} method with a conditional boolean check on each loop iteration. I wonder if this should be brought inline for efficiency:
> {noformat}
> @Override
> public boolean forEachBitMap(LongPredicate consumer) {
> Objects.requireNonNull(consumer, "consumer");
> long[] result = new long[BitMap.numberOfBitMaps(shape.getNumberOfBits())];
> for (int i = 0; i < counts.length; i++) {
> if (counts[i] != 0) {
> // Avoids a second check on the predicate result in forEachIndex
> BitMap.set(result, i);
> }
> }
> return BitMapProducer.fromBitMapArray(result).forEachBitMap(consumer);
> }
> {noformat}
> Or better yet, avoid the {{long[]}} array:
> {noformat}
> int blocksm1 = BitMap.numberOfBitMaps(shape.getNumberOfBits()) - 1;
> int i = 0;
> long value;
> for (int j = 0; j < blocksm1; j++) {
> value = 0;
> for (int k = 0; k < Long.SIZE; k++) {
> if (counts[i++] != 0) {
> value |= BitMap.getLongBit(k);
> }
> }
> if (!consumer.test(value)) {
> return false;
> }
> }
> // Final block
> value = 0;
> for (int k = 0; i < counts.length; k++) {
> if (counts[i++] != 0) {
> value |= BitMap.getLongBit(k);
> }
> }
> return consumer.test(value);
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)