Posted to user@cassandra.apache.org by Martin Mačura <m....@gmail.com> on 2019/05/16 13:14:39 UTC

Re: Bloom filter false positives high

I've decreased bloom_filter_fp_chance from 0.01 to 0.001.  The
sstableupgrade took 3 days to complete.  This is the result:
node1
               Bloom filter false positives: 380965
               Bloom filter false ratio: 0.46560
               Bloom filter space used: 27.1 MiB
               Bloom filter off heap memory used: 27.09 MiB
node2
               Bloom filter false positives: 866636
               Bloom filter false ratio: 0.40865
               Bloom filter space used: 27.78 MiB
               Bloom filter off heap memory used: 27.77 MiB
node3
               Bloom filter false positives: 433296
               Bloom filter false ratio: 0.20359
               Bloom filter space used: 26.15 MiB
               Bloom filter off heap memory used: 26.15 MiB
node4
               Bloom filter false positives: 550721
               Bloom filter false ratio: 0.30233
               Bloom filter space used: 24.7 MiB
               Bloom filter off heap memory used: 24.7 MiB
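
A note for reading these figures: as far as I can tell from the TableMetrics
source linked further down the thread, the reported false ratio is false
positives divided by (false positives + true positives), not divided by all
filter lookups. Assuming that definition holds, here is a rough sketch of how
many true positives each node's numbers would imply. The function name is
made up for illustration and is not part of Cassandra or nodetool.

```python
def implied_true_positives(false_positives: int, false_ratio: float) -> int:
    """Estimate true positives, ASSUMING false_ratio = fp / (fp + tp)."""
    if not 0.0 < false_ratio <= 1.0:
        raise ValueError("false_ratio must be in (0, 1]")
    total_positives = false_positives / false_ratio
    return round(total_positives - false_positives)


# (false positives, false ratio) copied from the nodetool output above.
nodes = {
    "node1": (380965, 0.46560),
    "node2": (866636, 0.40865),
    "node3": (433296, 0.20359),
    "node4": (550721, 0.30233),
}

for name, (fp, ratio) in nodes.items():
    tp = implied_true_positives(fp, ratio)
    print(f"{name}: ~{tp} true positives vs {fp} false positives")
```

If the assumption is right, a ratio near 0.5 only says that false and true
positives are about equally frequent; it does not say that half of all
filter checks were wrong.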




Martin




On Wed, Apr 17, 2019 at 1:45 PM Stefan Miklosovic <
stefan.miklosovic@instaclustr.com> wrote:

> Lastly, I wonder whether that number is the same on every node you
> connect nodetool to. Do all nodes see a similar false positive
> ratio / count?
>
> On Wed, 17 Apr 2019 at 21:41, Stefan Miklosovic
> <st...@instaclustr.com> wrote:
> >
> > One thing comes to mind, but my reasoning is questionable as I am
> > not an expert in this.
> >
> > If you think about it, the whole point of a Bloom filter is to check
> > whether some record is in a particular SSTable. A false positive means
> > that the filter thought the record was there when in fact it was not,
> > so Cassandra did a lookup unnecessarily. Why does it think the record
> > is there in so many cases? Either you make a lot of identical requests
> > on the same partition key over time, querying the same data over and
> > over again (but wouldn't that data be cached?), or a lot of data was
> > written with the same partition key, so the filter thinks it is there
> > but the clustering column is different. As ts is of type timeuuid,
> > isn't it true that you are doing a lot of queries with some date? It
> > might be that the hash is computed only on partition keys and not on
> > clustering columns, so the filter gives you "yes", Cassandra goes
> > there, checks whether the clustering column equals what you queried,
> > and it's not there. But as I say, I might be wrong ...
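
To make the hypothesis above concrete, here is a toy sketch of a Bloom
filter that is keyed on the partition key only. Everything in it
(TinyBloomFilter, partition_key, read, the sample rows) is invented for
illustration; it is not Cassandra's implementation, and it does not settle
whether Cassandra would count the third case below as a false positive.

```python
import hashlib


class TinyBloomFilter:
    """Toy Bloom filter; sizes and hashing are illustrative only."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def partition_key(a: int, b: int, bucket: str) -> bytes:
    # Only the partition key columns are hashed, never the clustering columns.
    return f"{a}:{b}:{bucket}".encode()


# One pretend SSTable: partition key -> set of clustering values (ts).
pk = partition_key(1, 2, "2019-04-16")
sstable_rows = {pk: {"ts-001", "ts-002"}}

bloom = TinyBloomFilter(num_bits=1024, num_hashes=7)
for key in sstable_rows:
    bloom.add(key)


def read(key: bytes, ts: str) -> str:
    if not bloom.might_contain(key):
        return "skipped by filter"
    if ts in sstable_rows.get(key, set()):
        return "row found"
    return "filter passed, row not found"


print(read(pk, "ts-001"))  # partition and row both present
print(read(pk, "ts-999"))  # partition present, clustering value absent
```

The second read shows the case described above: the filter can only answer
for the partition, so a query for a clustering value that is not in this
SSTable still gets past it.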
> >
> > What's more, your read_repair_chance is 0.0, so it will never do a
> > repair after a successful read (e.g. you have RF 3 and CL QUORUM, so
> > one node is somehow behind). If you don't run repairs, maybe the data
> > is just somehow unsynchronized, but that is really just my guess.
> >
> > On Wed, 17 Apr 2019 at 21:39, Martin Mačura <m....@gmail.com> wrote:
> > >
> > > > We cannot run any repairs on these tables.  Whenever we tried it
> > > > (incremental or full or partitioner range), it caused a node to run
> > > > out of disk space during anticompaction.  We'll try again once
> > > > Cassandra 4.0 is released.
> > >
> > > On Wed, Apr 17, 2019 at 1:07 PM Stefan Miklosovic <
> stefan.miklosovic@instaclustr.com> wrote:
> > >>
> > >> If you invoke nodetool, it gets the false positives number from
> > >> this metric:
> > >>
> > >> https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/metrics/TableMetrics.java#L564-L578
> > >>
> > >> You get high false positives, so this accumulates them:
> > >>
> > >> https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/metrics/TableMetrics.java#L572
> > >>
> > >> If you follow that, the number is computed here:
> > >>
> > >> https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/io/sstable/BloomFilterTracker.java#L44-L55
> > >>
> > >> For that number to be so high, the difference has to be large,
> > >> so lastFalsePositiveCount is IMHO significantly lower.
> > >>
> > >> False positives are only ever incremented in BigTableReader, where it
> > >> gets complicated very quickly, and to be honest I am not sure why it
> > >> is called.
> > >>
> > >> Is all fine with the db as such? Do you run repairs? Does that number
> > >> increase or decrease over time? Does repair or compaction have any
> > >> effect on it?
> > >>
> > >> On Wed, 17 Apr 2019 at 20:48, Martin Mačura <m....@gmail.com>
> wrote:
> > >> >
> > >> > Both tables use the default bloom_filter_fp_chance of 0.01 ...
> > >> >
> > >> > CREATE TABLE ... (
> > >> >    a int,
> > >> >    b int,
> > >> >    bucket timestamp,
> > >> >    ts timeuuid,
> > >> >    c int,
> > >> > ...
> > >> >    PRIMARY KEY ((a, b, bucket), ts, c)
> > >> > ) WITH CLUSTERING ORDER BY (ts DESC, monitor ASC)
> > >> >    AND bloom_filter_fp_chance = 0.01
> > >> >    AND compaction = {'class':
> > >> > 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
> > >> > 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
> > >> > 'tombstone_threshold': '0.9', 'unchecked_tombstone_compaction':
> > >> > 'false'}
> > >> >    AND dclocal_read_repair_chance = 0.0
> > >> >    AND default_time_to_live = 63072000
> > >> >    AND gc_grace_seconds = 10800
> > >> > ...
> > >> >    AND read_repair_chance = 0.0
> > >> >    AND speculative_retry = 'NONE';
> > >> >
> > >> >
> > >> > CREATE TABLE ... (
> > >> >    c int,
> > >> >    b int,
> > >> >    bucket timestamp,
> > >> >    ts timeuuid,
> > >> > ...
> > >> >    PRIMARY KEY ((c, b, bucket), ts)
> > >> > ) WITH CLUSTERING ORDER BY (ts DESC)
> > >> >    AND bloom_filter_fp_chance = 0.01
> > >> >    AND compaction = {'class':
> > >> > 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
> > >> > 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
> > >> > 'tombstone_threshold': '0.9', 'unchecked_tombstone_compaction':
> > >> > 'false'}
> > >> >    AND dclocal_read_repair_chance = 0.0
> > >> >    AND default_time_to_live = 63072000
> > >> >    AND gc_grace_seconds = 10800
> > >> > ...
> > >> >    AND read_repair_chance = 0.0
> > >> >    AND speculative_retry = 'NONE';
> > >> >
> > >> > On Wed, Apr 17, 2019 at 12:25 PM Stefan Miklosovic <
> stefan.miklosovic@instaclustr.com> wrote:
> > >> >>
> > >> >> What is your bloom_filter_fp_chance for each table? I guess it is
> > >> >> bigger for the first one. The bigger that number is (between 0
> > >> >> and 1), the less memory the filter uses (17 MiB against 54.9 MiB),
> > >> >> which means the more false positives you will get.
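
The trade-off described above can be sketched with the textbook sizing
formula for an optimally configured Bloom filter, bits per key =
-ln(p) / (ln 2)^2. This is the general formula only; Cassandra sizes each
SSTable's filter with its own calculations, so the real figures reported by
nodetool may differ. The function names below are mine.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Textbook optimum: m/n = -ln(p) / (ln 2)^2 (not Cassandra's exact sizing)."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def filter_size_mib(num_keys: int, fp_chance: float) -> float:
    """Approximate filter size in MiB for num_keys keys at the given fp chance."""
    return num_keys * bits_per_key(fp_chance) / 8 / (1024 ** 2)


for p in (0.1, 0.01, 0.001):
    print(f"fp_chance={p}: {bits_per_key(p):.2f} bits per key")
```

Under this formula, going from 0.01 to 0.001 costs roughly 1.5 times the
memory per key (about 9.6 versus about 14.4 bits), and a larger
bloom_filter_fp_chance always means a smaller filter.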
> > >> >>
> > >> >> On Wed, 17 Apr 2019 at 19:59, Martin Mačura <m....@gmail.com>
> wrote:
> > >> >> >
> > >> >> > Hi,
> > >> >> > I have a table with poor bloom filter false ratio:
> > >> >> >                SSTable count: 1223
> > >> >> >                Space used (live): 726.58 GiB
> > >> >> >                Number of partitions (estimate): 8592749
> > >> >> >                Bloom filter false positives: 35796352
> > >> >> >                Bloom filter false ratio: 0.68472
> > >> >> >                Bloom filter space used: 17.82 MiB
> > >> >> >                Compacted partition maximum bytes: 386857368
> > >> >> >
> > >> >> > It's a time series, TWCS compaction, window size 1 day, data
> > >> >> > partitioned in daily buckets, TTL 2 years.
> > >> >> >
> > >> >> > I have another table with a similar schema, but it is not
> > >> >> > affected for some reason:
> > >> >> >                SSTable count: 1114
> > >> >> >                Space used (live): 329.87 GiB
> > >> >> >                Number of partitions (estimate): 25460768
> > >> >> >                Bloom filter false positives: 156942
> > >> >> >                Bloom filter false ratio: 0.00010
> > >> >> >                Bloom filter space used: 54.9 MiB
> > >> >> >                Compacted partition maximum bytes: 20924300
> > >> >> >
> > >> >> > Thanks for any advice,
> > >> >> >
> > >> >> > Martin
> > >> >>
> > >> >>
> ---------------------------------------------------------------------
> > >> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> > >> >> For additional commands, e-mail: user-help@cassandra.apache.org
> > >> >>