Posted to user@cassandra.apache.org by Matthias Pfau <ma...@tutao.de.INVALID> on 2019/09/10 12:22:18 UTC

Drastic increase of bloom filter size after upgrading from 2.2.14 to 3.11.4

Hi there,
we just finished upgrading the sstables on a single node after upgrading from 2.2.14 to 3.11.4. Since then, we have noticed a drastic increase in off-heap memory consumption, caused by larger bloom filters.

According to the cfstats output, "Bloom filter off heap memory used" increased by a factor between 7 and 8. That means that while bloom filters took 1 GB of off-heap storage with 2.2.14, they occupy around 7.5 GB with 3.11.4.

Has anyone observed something similar? Have there been major changes to the bloom filter implementation between those versions?

Best,
Matthias

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Re: Drastic increase of bloom filter size after upgrading from 2.2.14 to 3.11.4

Posted by Matthias Pfau <ma...@tutao.de.INVALID>.
Just a short follow-up on this:

After running upgradesstables for a CF, the off-heap memory used by its bloom filters increased by a factor between 6 and 12 in our case. This is a Cassandra bug: the bloom filters are evidently sized before the sstable is split across multiple data directories.
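
To illustrate why sizing before the split would inflate memory use, here is a minimal sketch based on the textbook bloom filter formula m = -n * ln(p) / (ln 2)^2. It is not Cassandra's actual sizing code, and the key count and the number of output sstables are hypothetical values chosen only for the arithmetic.

```python
import math


def bloom_filter_bits(num_keys: int, fp_chance: float) -> int:
    """Textbook optimal bit count: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-num_keys * math.log(fp_chance) / (math.log(2) ** 2))


def total_filter_bits(total_keys: int, num_outputs: int, fp_chance: float,
                      size_for_all_keys: bool) -> int:
    """Total bits across all output sstables of one rewritten sstable.

    size_for_all_keys=True models the suspected bug: every output sstable
    gets a filter sized for the key count of the whole input sstable.
    """
    keys_per_output = total_keys // num_outputs
    sized_for = total_keys if size_for_all_keys else keys_per_output
    return num_outputs * bloom_filter_bits(sized_for, fp_chance)


FP_CHANCE = 0.01           # bloom_filter_fp_chance reported in this thread
TOTAL_KEYS = 800_000_000   # hypothetical key count of one large sstable
NUM_OUTPUTS = 8            # hypothetical number of output sstables (data dirs)

correct = total_filter_bits(TOTAL_KEYS, NUM_OUTPUTS, FP_CHANCE, False)
oversized = total_filter_bits(TOTAL_KEYS, NUM_OUTPUTS, FP_CHANCE, True)
inflation = oversized / correct

print(f"correctly sized: {correct / 8 / 2**20:.0f} MiB")
print(f"sized pre-split: {oversized / 8 / 2**20:.0f} MiB")
print(f"inflation factor: {inflation:.1f}x")
```

Under these assumptions the inflation factor is roughly the number of output sstables, which would be consistent with a factor between 6 and 12 if the node has that many data directories. Because the formula is linear in the key count, splitting alone should leave the total filter size unchanged when each output is sized for its own keys.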

If you delete those bloom filter files and restart Cassandra, they are re-created. You can also run a user defined compaction on the affected sstable to rewrite its bloom filter file.

This is exactly how we upgraded:
1. Determine which CFs have bigger bloom filters (cfstats).
2. Run upgradesstables individually for those CFs, then run user defined compactions on the sstables of each CF to reduce the bloom filter size.
We were only able to upgrade using this incremental approach.
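
The first step could be scripted roughly as follows. This is a sketch that assumes cfstats prints one "Keyspace", "Table" and "Bloom filter off heap memory used" line per table in the layout shown in the sample; the sample text, the keyspace and table names, and the helper name are made up for illustration, so check them against the real output of your version.

```python
import re

# Hypothetical excerpt in the assumed cfstats layout (names and numbers invented).
SAMPLE_CFSTATS = """\
Keyspace : mail
    Table: messages
        Bloom filter false ratio: 0.00000
        Bloom filter off heap memory used: 4026531840
    Table: folders
        Bloom filter off heap memory used: 1048576
Keyspace : system
    Table: local
        Bloom filter off heap memory used: 16
"""


def bloom_filter_usage(cfstats_text: str) -> dict:
    """Map 'keyspace.table' to bloom filter off-heap bytes."""
    usage = {}
    keyspace = table = None
    for raw in cfstats_text.splitlines():
        line = raw.strip()
        if m := re.match(r"Keyspace\s*:\s*(\S+)", line):
            keyspace, table = m.group(1), None
        elif m := re.match(r"Table(?: \(index\))?\s*:\s*(\S+)", line):
            table = m.group(1)
        elif m := re.match(r"Bloom filter off heap memory used\s*:\s*(\d+)", line):
            if keyspace and table:
                usage[f"{keyspace}.{table}"] = int(m.group(1))
    return usage


usage = bloom_filter_usage(SAMPLE_CFSTATS)
# Largest bloom filters first: these CFs are the candidates for step 2.
ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
for name, size in ranked:
    print(f"{name}: {size / 2**20:.1f} MiB")
```

Running it on cfstats output captured before and after the upgrade gives a per-CF comparison, so the incremental upgrade can start with the tables that grew the most.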

Best,
Matthias


Sep 10, 2019, 18:44 by matthias.pfau@tutao.de.INVALID:

> A few more details:
>
> 1. bloom_filter_fp_chance is set to 0.01
>
> 2. I reviewed CASSANDRA-8413 (https://github.com/apache/cassandra/commit/23fd75f27c40462636f09920719b5dcbef5b8f36) and this should not have led to much larger bloom filters.
>
> 3. Large sstables (a few above 1 TB) were split into much smaller ones (256 vnodes) during upgradesstables. Could this lead to the described problem of far too large bloom filters?
>
> Best,
> Matthias




Re: Drastic increase of bloom filter size after upgrading from 2.2.14 to 3.11.4

Posted by Matthias Pfau <ma...@tutao.de.INVALID>.
A few more details:

1. bloom_filter_fp_chance is set to 0.01

2. I reviewed CASSANDRA-8413 (https://github.com/apache/cassandra/commit/23fd75f27c40462636f09920719b5dcbef5b8f36) and this should not have led to much larger bloom filters.

3. Large sstables (a few above 1 TB) were split into much smaller ones (256 vnodes) during upgradesstables. Could this lead to the described problem of far too large bloom filters?

Best,
Matthias

