Posted to common-dev@hadoop.apache.org by "Yitong Zhou (JIRA)" <ji...@apache.org> on 2015/03/18 22:34:38 UTC
[jira] [Created] (HADOOP-11727) Make
org.hadoop.util.bloom.BloomFilter returns the expected false positive
probability
Yitong Zhou created HADOOP-11727:
------------------------------------
Summary: Make org.hadoop.util.bloom.BloomFilter returns the expected false positive probability
Key: HADOOP-11727
URL: https://issues.apache.org/jira/browse/HADOOP-11727
Project: Hadoop Common
Issue Type: Improvement
Reporter: Yitong Zhou
When using a Bloom filter, it is sometimes handy to know the current expected false positive rate, (bitSet's cardinality / vector size)^(# of hash functions), so that when the FP rate gets too high we can choose to rebuild the Bloom filter with a larger size.
The code would look like this:
{code}
/**
 * Returns the expected false positive probability of the current filter.
 *
 * @return The expected false positive probability
 */
public double expectedFalsePositiveProbability() {
  return Math.pow((double) bits.cardinality() / vectorSize, nbHash);
}
{code}
Does this sound like a reasonable minor function that could be added to the code base?
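To make the idea concrete outside of the Hadoop class, here is a minimal standalone sketch. It assumes a plain java.util.BitSet and takes vectorSize and nbHash as parameters; the class name FalsePositiveEstimate, the helper shouldRebuild, and the threshold value are hypothetical and not part of Hadoop. The estimate rests on the usual assumption that hash positions are independent and uniform, so a key that was never added hits a set bit with probability cardinality / vectorSize on each of its nbHash probes.

```java
import java.util.BitSet;

// Hypothetical helper for illustration only; not part of Hadoop.
class FalsePositiveEstimate {

  /** Estimates the false positive rate as (set bits / vectorSize)^nbHash. */
  static double expectedFalsePositiveProbability(BitSet bits, int vectorSize, int nbHash) {
    if (vectorSize <= 0 || nbHash <= 0) {
      throw new IllegalArgumentException("vectorSize and nbHash must be positive");
    }
    return Math.pow((double) bits.cardinality() / vectorSize, nbHash);
  }

  /** Returns true when the estimate exceeds a caller-chosen threshold (an assumption). */
  static boolean shouldRebuild(BitSet bits, int vectorSize, int nbHash, double maxFpRate) {
    return expectedFalsePositiveProbability(bits, vectorSize, nbHash) > maxFpRate;
  }

  public static void main(String[] args) {
    BitSet bits = new BitSet(8);
    bits.set(0, 4); // sets bits 0..3, so half of an 8-bit vector is filled

    // With 2 hash functions the estimate is (4/8)^2 = 0.25.
    System.out.println(expectedFalsePositiveProbability(bits, 8, 2));
    System.out.println(shouldRebuild(bits, 8, 2, 0.1));
  }
}
```

A caller could check shouldRebuild after each batch of insertions and, when it returns true, re-add the keys to a filter with a larger vector size. That requires the original keys to still be available, since a Bloom filter cannot enumerate what it holds.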