You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/01/29 13:00:13 UTC
[jira] [Created] (CASSANDRA-6633) Investigate Bloom Filter
Improvements
Benedict created CASSANDRA-6633:
-----------------------------------
Summary: Investigate Bloom Filter Improvements
Key: CASSANDRA-6633
URL: https://issues.apache.org/jira/browse/CASSANDRA-6633
Project: Cassandra
Issue Type: Improvement
Reporter: Benedict
Investigate various possible improvements to our bloom filters:
1) Dynamic resizing would be useful. There are a few ways this could be achieved: with some modification, downsampling could be supported; partitioning the hash functions so that we may select the number of hashes/bits dynamically, by loading/unloading a given partition; and there are some related data structures, such as Quotient Filters that support resizing and merging.
2) Faster loading: should be possible to mmap the bloom filter representation on disk
3) Most ambitious of all, would be to try to reduce the memory requirement of bloom filters. Golomb Coded Sets [1] are a possibility, as are other compressed hash structures.
[1|http://algo2.iti.kit.edu/singler/publications/cacheefficientbloomfilters-wea2007.pdf]
[2|http://www.it-c.dk/people/pagh/papers/bloom.pdf]
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)