Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2010/07/01 22:05:50 UTC

[jira] Updated: (CASSANDRA-1207) Don't write BloomFilters for skinny rows

     [ https://issues.apache.org/jira/browse/CASSANDRA-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1207:
--------------------------------

    Fix Version/s: 0.8
                       (was: 0.7)

> because typically you will have "original data" CFs whose columns are either accessed by name (you want a BF, index is unnecessary)
Below a certain row size (a threshold I think we still need to find), you don't want the bloom filter here either, since the disk/OS is likely to bring the entire row into memory anyway. Optimizing the deserialization of columns to skip values would push that threshold even higher.

----

I'm removing this one from 0.7, since we are planning to refactor the file format in 0.8 anyway.

> Don't write BloomFilters for skinny rows
> ----------------------------------------
>
>                 Key: CASSANDRA-1207
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1207
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: 0001-Return-alwaysMatchingBloomFilter-for-0-length-filter.patch, 0002-Conditionally-write-the-row-bloom-filter.patch
>
>
> All rows currently contain a serialized BloomFilter, regardless of size. For smaller rows, it is much more efficient in space and CPU time to not write a BloomFilter, and to eagerly perform lookups against the existing columns.
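
A minimal sketch of the idea described above, not the code from the attached patches. All names here (SkinnyRowFilterSketch, MIN_COLUMNS_FOR_FILTER, ColumnFilter, SimpleBloomFilter) are hypothetical, and the cutoff of 16 columns is an arbitrary placeholder, since the ticket leaves the real threshold open. The sketch assumes that a row below the threshold gets an always-matching filter (standing in for a zero-length serialized filter), so every lookup falls through to an eager scan of the row's columns.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class SkinnyRowFilterSketch {
    // Hypothetical cutoff: rows with fewer columns than this get no real filter.
    static final int MIN_COLUMNS_FOR_FILTER = 16;

    interface ColumnFilter {
        boolean mightContain(String columnName);
    }

    // Stand-in for a zero-length serialized filter: it never rules a column out.
    static final ColumnFilter ALWAYS_MATCHING = columnName -> true;

    // Toy bloom filter, only here to make the sketch self-contained.
    static final class SimpleBloomFilter implements ColumnFilter {
        private final BitSet bits;
        private final int size;
        private final int hashCount;

        SimpleBloomFilter(int size, int hashCount) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashCount = hashCount;
        }

        void add(String columnName) {
            for (int i = 0; i < hashCount; i++) {
                bits.set(index(columnName, i));
            }
        }

        @Override
        public boolean mightContain(String columnName) {
            for (int i = 0; i < hashCount; i++) {
                if (!bits.get(index(columnName, i))) {
                    return false;
                }
            }
            return true;
        }

        // Double hashing: combine two base hashes to derive the i-th bit index.
        private int index(String columnName, int i) {
            int h1 = columnName.hashCode();
            int h2 = Arrays.hashCode(columnName.getBytes(StandardCharsets.UTF_8));
            return Math.floorMod(h1 + i * h2, size);
        }
    }

    // Write path: build a real filter only when the row is wide enough.
    static ColumnFilter buildFilter(List<String> columnNames) {
        if (columnNames.size() < MIN_COLUMNS_FOR_FILTER) {
            return ALWAYS_MATCHING;
        }
        SimpleBloomFilter filter = new SimpleBloomFilter(columnNames.size() * 10, 3);
        for (String name : columnNames) {
            filter.add(name);
        }
        return filter;
    }

    // Read path: consult the filter first, then check the columns themselves.
    static boolean containsColumn(List<String> columnNames, ColumnFilter filter, String name) {
        if (!filter.mightContain(name)) {
            return false; // definitely absent, skip the scan
        }
        return columnNames.contains(name); // eager lookup against existing columns
    }
}
```

With the always-matching filter, a skinny row costs no filter bytes on disk and no hashing on read; the price is a linear scan over a handful of columns, which is cheap when the whole row is already in memory.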
