You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Claude Warren (Jira)" <ji...@apache.org> on 2022/06/15 10:19:00 UTC
[jira] [Updated] (COLLECTIONS-820) BloomFilter: support merge without checking
[ https://issues.apache.org/jira/browse/COLLECTIONS-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Claude Warren updated COLLECTIONS-820:
--------------------------------------
Summary: BloomFilter: support merge without checking (was: BloomFilter: renaming merge/mergeInPlace)
> BloomFilter: support merge without checking
> -------------------------------------------
>
> Key: COLLECTIONS-820
> URL: https://issues.apache.org/jira/browse/COLLECTIONS-820
> Project: Commons Collections
> Issue Type: Improvement
> Components: Collection
> Affects Versions: 4.5
> Reporter: Claude Warren
> Priority: Minor
> Labels: bloom-filter
>
> Support a merge without checking and then provide a method add with checking for a change in the filter. This will duplicate a lot of (simple) code. But it provides a nice method to check if a filter contains an object and do something if it did not, rather than calling contains before then doing a merge, or doing a merge and checking cardinality change afterwards.
> For a bitmap bloom filter this can be done branchless by accumulating all the zero bits that are set to 1 using:
> long[] flag = {0};
> int[] idx = new int[1];
> bitMapProducer.forEachBitMap(value -> {
> int i = idx[0]++;
> flag[0] |= (~bitMap[i] & value);
> bitMap[i] |= value;
> return true;
> });
> return flag[0] != 0;
> For a filter with an explicit cardinality you just check the cardinality before and after the merge.
> h3. !https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24! *[aherbert|https://github.com/aherbert]* [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r813419143]
>
> With the provision of the {{copy}} method the {{merge}} method seems redundant with {{{}mergeInPlace{}}}. It is equivalent to two lines of code:
>
> {{BloomFilter filter2 = filter.copy(); }}
> {{filter2.mergeInPlace(other);}}
>
> I think {{merge}} should be the collections equivalent of {{{}add{}}}. This acts in place. Very few collections operations return new collections unless they are views of the current collection, e.g. List.subList. Note that the CountingBloomFilter has {{add}} which acts in place. There is no {{add}} returning a new CountingBloomFilter and then an {{{}addInPlace{}}}. So there is some inconsistency here.
> IMO the more common operation is to merge a filter in place. So this should be the action of merge. Then the mergeInPlace can be dropped or replaced with {{copyAndMerge}} or {{{}mergeToCopy{}}}. You then also have the boolean result of merge to consider. Should the copy not be returned if the merge returns false? Since it is 2 lines of code I would not bother and drop this.
> The BloomFilter then has {{merge}} and CountingBloomFilter has {{{}add{}}}, both act in place (the common operation). The user can create a copy if desired for a merge if that is useful to them.
> Another comment on {{merge}} suggests changing it to support {{add}} and {{merge}} as both have their own use cases.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)