Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2022/11/08 19:32:53 UTC

[GitHub] [commons-collections] Claudenw commented on pull request #357: Added Hasher.isEmpty() and appropriate tests

Claudenw commented on PR #357:
URL: https://github.com/apache/commons-collections/pull/357#issuecomment-1307730779

   
   > I am curious as to why you would have an empty hasher. Is this not a programming error? Basically it means you are testing if `nothing` is present in a filter, or adding `nothing` to a filter.
   
   First I need to define a "reference" Bloom filter. Consider the properties of some object. If you can index and search a collection of Bloom filters, then you can take the properties of an object, create a Bloom filter from their values, and use that filter as a reference to the object. You can then search the collection by creating a Bloom filter that contains only the values you want to locate. This usage is only practical once an indexed collection is available. Such an indexed collection is called a multidimensional Bloom filter; strictly speaking, any collection of Bloom filters is a multidimensional Bloom filter.
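   
   To make the idea concrete, here is a minimal sketch of reference filters and a search over them. It is not the commons-collections API: the class `ReferenceFilterSketch`, the sizes, and the `String.hashCode`-based double hash are all assumptions chosen for illustration, and a linear scan stands in for a real index.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of "reference" Bloom filters; not the commons-collections API. */
final class ReferenceFilterSketch {
    static final int NUM_BITS = 256;   // assumed filter size
    static final int NUM_HASHES = 3;   // assumed number of hash functions

    /** Builds a filter from property values using a simple stand-in double hash. */
    static BitSet filterOf(String... values) {
        BitSet bits = new BitSet(NUM_BITS);
        for (String v : values) {
            int h1 = v.hashCode();
            int h2 = Integer.reverse(h1) | 1; // odd increment; illustrative only
            for (int i = 0; i < NUM_HASHES; i++) {
                bits.set(Math.floorMod(h1 + i * h2, NUM_BITS));
            }
        }
        return bits;
    }

    /** True when every bit set in target is also set in candidate. */
    static boolean contains(BitSet candidate, BitSet target) {
        BitSet missing = (BitSet) target.clone();
        missing.andNot(candidate);
        return missing.isEmpty();
    }

    /** Linear scan standing in for an indexed multidimensional Bloom filter. */
    static List<Integer> search(List<BitSet> filters, BitSet target) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < filters.size(); i++) {
            if (contains(filters.get(i), target)) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

   Searching with a filter built from only `size=large` returns every stored filter whose object had that property, possibly along with false positives. A filter built from no values has no bits set, so it matches every stored filter.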
   
   Consider searching a multidimensional Bloom filter in a client-server environment where:
   
   - The client and the server agree on the underlying hashing algorithm and on the use of the EnhancedDoubleHasher.
   - The Bloom filter being generated is not a standard "gateway" Bloom filter but a "reference" filter.
   - The client creates the hashers for the filter and places them into a collection, which it then passes to the server.
   - The server deserializes the hashers into a HasherCollection.
   - If the hasher is empty, the server can either return the entire collection or return an error.
   - Otherwise, the server performs the search, building the internal Bloom filters as required by the multidimensional Bloom filter algorithm.
   
   Notice that checking for an empty hasher lets the server short-circuit the expensive search.
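   
   The server-side flow described above could look roughly like the sketch below. Everything here is hypothetical: `EmptyHasherShortCircuit`, `HashPair` (a stand-in for a deserialized hasher), the filter sizes, and the `filtersBuilt` counter are assumptions for illustration, not the HasherCollection API.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical server-side search; names are illustrative, not the commons-collections API. */
final class EmptyHasherShortCircuit {
    static final int NUM_BITS = 256; // assumed shape agreed by client and server
    static final int NUM_HASHES = 3;

    /** Stand-in for a deserialized hasher: an (initial, increment) pair. */
    static final class HashPair {
        final long initial;
        final long increment;

        HashPair(long initial, long increment) {
            this.initial = initial;
            this.increment = increment;
        }
    }

    /** Counts filter builds so the short circuit is observable. */
    int filtersBuilt = 0;

    /** Builds the search filter from the hashers (the "expensive" step here). */
    BitSet build(List<HashPair> hashers) {
        filtersBuilt++;
        BitSet bits = new BitSet(NUM_BITS);
        for (HashPair h : hashers) {
            for (int i = 0; i < NUM_HASHES; i++) {
                bits.set((int) Math.floorMod(h.initial + i * h.increment, (long) NUM_BITS));
            }
        }
        return bits;
    }

    /** Returns the stored filters that contain every bit of the requested filter. */
    List<BitSet> search(List<BitSet> stored, List<HashPair> hashers) {
        if (hashers.isEmpty()) {
            // Nothing to match: return everything without building a filter.
            // A stricter server could throw an exception here instead.
            return new ArrayList<>(stored);
        }
        BitSet target = build(hashers);
        List<BitSet> hits = new ArrayList<>();
        for (BitSet candidate : stored) {
            BitSet missing = (BitSet) target.clone();
            missing.andNot(candidate);
            if (missing.isEmpty()) {
                hits.add(candidate);
            }
        }
        return hits;
    }
}
```

   Whether an empty request should mean "return everything" or "reject the request" is a policy choice for the server. The point of the sketch is only that the emptiness check happens before any filter is built.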

