Posted to user@cassandra.apache.org by vineet daniel <vi...@gmail.com> on 2010/05/07 12:51:42 UTC

bloom filter

Hi

What is the benefit of creating a bloom filter when Cassandra writes data? How
does it help?


_______________________________________
Vineet Daniel
_______________________________________

Let your email find you....

Re: bloom filter

Posted by David Strauss <da...@fourkitchens.com>.
On 2010-05-07 11:03, vineet daniel wrote:
> 2. "It is also important for identifying which SSTable files to look inside
> even when a key is present." - David, can you please shed some more
> light on your point: what are the implications, and why do we need to
> identify the files?

A bloom filter is almost like a street sign that tells you the range of
addresses on a street block. Such a street sign doesn't guarantee the
whole range of addresses exists on the block, but it does mean you can
avoid driving down streets that don't contain the address you're looking
for.

When Cassandra is looking for a key, there could be several files that
potentially contain it. By looking at the bloom filter for each, it can
avoid looking inside the files that definitely do not have the desired data.

(My analogy breaks down a bit here because the street signs indicate
mutually exclusive ranges of addresses, while the bloom filters may
indicate the possible presence of a key in *several* files.)
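
To make the per-file check concrete, here is a minimal Python sketch. It is not Cassandra's implementation: the BloomFilter and SSTable classes, the filter size, the hash scheme, and the lookup helper are all hypothetical, chosen only to illustrate how a "definitely not here" answer lets a read skip a file.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Hypothetical stand-in for an on-disk file plus its in-memory filter."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = dict(rows)
        self.disk_reads = 0
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)

    def read(self, key):
        self.disk_reads += 1  # counts simulated disk accesses
        return self.rows.get(key)


def lookup(sstables, key):
    """Return values found for key, reading only files whose filter says maybe."""
    results = []
    for table in sstables:
        if not table.bloom.might_contain(key):
            continue  # definitely absent: skip this file entirely
        value = table.read(key)
        if value is not None:  # None here means the filter gave a false positive
            results.append((table.name, value))
    return results


tables = [
    SSTable("sstable-1", {"alice": "v1"}),
    SSTable("sstable-2", {"bob": "v1"}),
    SSTable("sstable-3", {"alice": "v2", "carol": "v1"}),
]
print(lookup(tables, "alice"))
```

In this sketch the key "alice" is present in two of the three files, so both must be read, while the file holding only "bob" is skipped unless its filter happens to give a false positive.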

-- 
David Strauss
   | david@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]


Re: bloom filter

Posted by vineet daniel <vi...@gmail.com>.
1. Peter said 'without going to disk', so that means bloom filters reside in
memory. Is that always the case, or only when a request is made to that
particular CF?
2. "It is also important for identifying which SSTable files to look inside
even when a key is present." - David, can you please shed some more light on
your point: what are the implications, and why do we need to identify the files?

Re: bloom filter

Posted by David Strauss <da...@fourkitchens.com>.
On 2010-05-07 10:55, Peter Schüller wrote:
>> What is the benefit of creating a bloom filter when Cassandra writes data? How
>> does it help?
> 
> It allows Cassandra to answer requests for non-existent keys without
> going to disk, except in cases where the bloom filter gives a false
> positive.
> 
> See:
> 
> http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-know-about.html

It is also important for identifying which SSTable files to look inside
even when a key is present.

Re: bloom filter

Posted by Peter Schüller <sc...@spotify.com>.
> What is the benefit of creating a bloom filter when Cassandra writes data? How
> does it help?

It allows Cassandra to answer requests for non-existent keys without
going to disk, except in cases where the bloom filter gives a false
positive.

See:

http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-know-about.html
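
How often such false positives occur can be estimated with the standard Bloom filter approximation p = (1 - e^(-k*n/m))^k, where m is the number of bits, n the number of keys, and k the number of hash functions. The Python sketch below evaluates it for a few sizes; the function names and the bits-per-key values are illustrative assumptions, not Cassandra's actual settings.

```python
import math


def false_positive_rate(num_bits, num_hashes, num_keys):
    """Approximate false-positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


def optimal_num_hashes(num_bits, num_keys):
    """Hash count that minimizes the rate above: k = (m/n) * ln 2, at least 1."""
    return max(1, round(num_bits / num_keys * math.log(2)))


num_keys = 1_000_000  # illustrative key count, not taken from the thread
for bits_per_key in (4, 8, 16):
    num_bits = bits_per_key * num_keys
    k = optimal_num_hashes(num_bits, num_keys)
    rate = false_positive_rate(num_bits, k, num_keys)
    print(f"{bits_per_key} bits/key, {k} hashes: p ~ {rate:.4f}")
```

The trade-off is memory against wasted disk reads: more bits per key make false positives rarer, at the cost of a larger in-memory filter.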

-- 
/ Peter Schuller aka scode

Re: bloom filter

Posted by David Strauss <da...@fourkitchens.com>.
On 2010-05-07 10:58, vineet daniel wrote:
> Is there any way to view the contents of this file?

Which file?

Re: bloom filter

Posted by vineet daniel <vi...@gmail.com>.
Thanks David and Peter.

Is there any way to view the contents of this file?

Re: bloom filter

Posted by David Strauss <da...@fourkitchens.com>.
On 2010-05-07 10:51, vineet daniel wrote:
> What is the benefit of creating a bloom filter when Cassandra writes data?
> How does it help?

http://wiki.apache.org/cassandra/ArchitectureOverview
