Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2010/10/27 20:51:20 UTC

[jira] Updated: (CASSANDRA-1555) Considerations for larger bloom filters

     [ https://issues.apache.org/jira/browse/CASSANDRA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1555:
--------------------------------

    Description: 
To (optimally) support SSTables larger than 143 million keys, we need to support bloom filters larger than 2^31 bits, which java.util.BitSet can't handle directly.

A few options:
* Switch to a BitSet class which supports 2^31 * 64 bits (Lucene's OpenBitSet)
* Partition the java.util.BitSet behind our current BloomFilter
** Straightforward bit partitioning: bit N is in bitset N // 2^31
** Separate equally sized complete bloom filters for member ranges, which can be used independently or OR'd together under memory pressure.

All of these options require new approaches to serialization.

  was:
To (optimally) support SSTables larger than 143 million keys, we need to support bloom filters larger than 2 GB, which java.util.BitSet can't handle directly.

A few options:
* Switch to a BitSet class which supports 2^63 bits (Lucene's OpenBitSet)
* Partition the java.util.BitSet behind our current BloomFilter
** Straightforward bit partitioning: bit N is in bitset N // 2^31
** Separate equally sized complete bloom filters for member ranges, which can be used independently or OR'd together under memory pressure.

All of these options require new approaches to serialization.


Well that's embarrassing.
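
The corrected figures can be checked with a little index arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and the 15 bits per key figure is an assumption inferred from 2^31 / 143 million, not something stated in the issue. It shows that an int-indexed bit set tops out at 2^31 bits (256 MiB, not 2 GB), and that a bit set backed by a long[] is bounded by roughly 2^31 words of 64 bits each rather than 2^63 bits.

```java
final class BloomCapacity {
    // Bits addressable with a non-negative int index, as in java.util.BitSet.
    static final long INT_INDEXED_BITS = 1L << 31;

    // Upper bound for a long[]-backed bit set: a Java array holds at most
    // about 2^31 elements, and each long word carries 64 bits.
    static final long LONG_WORD_BITS = (1L << 31) * 64;

    // Assumption: about 15 bits per key, inferred from 2^31 / 143 million.
    static final int ASSUMED_BITS_PER_KEY = 15;

    // Number of keys a filter of the given size can hold at a fixed bits-per-key budget.
    static long maxKeys(long bits, int bitsPerKey) {
        return bits / bitsPerKey;
    }

    public static void main(String[] args) {
        System.out.println(maxKeys(INT_INDEXED_BITS, ASSUMED_BITS_PER_KEY)); // 143165576
        System.out.println(INT_INDEXED_BITS / 8 / (1 << 20));                // 256 (MiB)
        System.out.println(LONG_WORD_BITS == (1L << 37));                    // true
    }
}
```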

> Considerations for larger bloom filters
> ---------------------------------------
>
>                 Key: CASSANDRA-1555
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1555
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>
> To (optimally) support SSTables larger than 143 million keys, we need to support bloom filters larger than 2^31 bits, which java.util.BitSet can't handle directly.
> A few options:
> * Switch to a BitSet class which supports 2^31 * 64 bits (Lucene's OpenBitSet)
> * Partition the java.util.BitSet behind our current BloomFilter
> ** Straightforward bit partitioning: bit N is in bitset N // 2^31
> ** Separate equally sized complete bloom filters for member ranges, which can be used independently or OR'd together under memory pressure.
> All of these options require new approaches to serialization.
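
The "straightforward bit partitioning" option could look roughly like the following. This is a minimal sketch, not the implementation proposed in the issue: PartitionedBitSet and all of its members are hypothetical names, the partition size is made configurable (as a power of two) only so the idea can be exercised with small sizes, and serialization is left out entirely.

```java
import java.util.BitSet;

/** Hypothetical sketch: a long-indexed bit set spread over several java.util.BitSet partitions. */
final class PartitionedBitSet {
    private final int shift;           // log2 of the number of bits per partition
    private final long mask;           // extracts the offset within a partition
    private final long size;           // total number of addressable bits
    private final BitSet[] partitions;

    /** Uses 2^31 bits per partition, so bit N lives in partition N // 2^31. */
    PartitionedBitSet(long size) {
        this(size, 31);
    }

    PartitionedBitSet(long size, int shift) {
        if (size <= 0 || shift < 1 || shift > 31) {
            throw new IllegalArgumentException("bad size or shift");
        }
        this.shift = shift;
        this.mask = (1L << shift) - 1;
        this.size = size;
        long count = ((size - 1) >>> shift) + 1; // ceil(size / 2^shift)
        if (count > Integer.MAX_VALUE - 8) {
            throw new IllegalArgumentException("too many partitions");
        }
        partitions = new BitSet[(int) count];
        for (int i = 0; i < partitions.length; i++) {
            partitions[i] = new BitSet(); // each partition grows on demand
        }
    }

    void set(long index) {
        check(index);
        partitions[(int) (index >>> shift)].set((int) (index & mask));
    }

    boolean get(long index) {
        check(index);
        return partitions[(int) (index >>> shift)].get((int) (index & mask));
    }

    int partitionCount() {
        return partitions.length;
    }

    private void check(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("bit index: " + index);
        }
    }

    public static void main(String[] args) {
        PartitionedBitSet bits = new PartitionedBitSet(3L << 31);
        long n = (1L << 31) + 5;  // falls in partition 1, offset 5
        bits.set(n);
        System.out.println(bits.get(n));            // true
        System.out.println(bits.get(5));            // false
        System.out.println(bits.partitionCount());  // 3
    }
}
```

Because the offset within a partition is at most 2^31 - 1, it always fits in the non-negative int index that java.util.BitSet expects.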
