Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2012/05/25 00:13:33 UTC

[jira] [Comment Edited] (CASSANDRA-2897) Secondary indexes without read-before-write

    [ https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282683#comment-13282683 ] 

Jonathan Ellis edited comment on CASSANDRA-2897 at 5/24/12 10:03 PM:
---------------------------------------------------------------------

bq. this doesn't work for us, since (unlike Bigtable) we don't make an effort to preserve all older versions of a column on disk

We can fix this without having to go full-on Bigtable with value retention. "All" we need to do is have the memtable update code special-case replacements in the CF map to issue an index delete against the replaced value. Messy, but not as messy as having to maintain two KEYS index implementations.

So, we can add that as step 2.5 to my list above and we should be good.
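
A minimal sketch of the special case described above, assuming a toy in-memory model rather than Cassandra's actual memtable code. The `Memtable` class, its `rows` and `index` dictionaries, and the `apply` and `_index_delete` methods are all hypothetical names; timestamps, tombstones and concurrency are ignored.

```python
from collections import defaultdict


class Memtable:
    """Toy memtable with a single indexed column (hypothetical, not Cassandra code)."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = defaultdict(dict)   # row key -> {column name: value}
        self.index = defaultdict(set)   # indexed value -> set of row keys

    def apply(self, key, column, value):
        """Write a column; if it replaces an indexed value, delete the old index entry."""
        row = self.rows[key]
        if column == self.indexed_column:
            old = row.get(column)
            # Special case: the write replaces a value already held in this memtable,
            # so the replaced value is known without a read from disk.
            if old is not None and old != value:
                self._index_delete(old, key)
            self.index[value].add(key)
        row[column] = value

    def _index_delete(self, value, key):
        """Remove one (value, key) entry from the index, dropping empty buckets."""
        keys = self.index.get(value)
        if keys is None:
            return
        keys.discard(key)
        if not keys:
            del self.index[value]


m = Memtable("state")
m.apply("k1", "state", "TX")
m.apply("k1", "state", "CA")   # replacement: the stale "TX" entry is deleted
```

In this sketch only a value that is still in the memtable can be replaced this way; a value that lives only on disk is not seen by `apply`, and its stale index entry is left for the read path to handle.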

                
      was (Author: jbellis):
    Adding code to the memtable update to issue an index delete when an overwrite happens would be messy, but not as messy as having to maintain two KEYS index implementations.

So, we can add that as step 2.5 to my list above and we should be good.

                  
> Secondary indexes without read-before-write
> -------------------------------------------
>
>                 Key: CASSANDRA-2897
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Sylvain Lebresne
>            Priority: Minor
>              Labels: secondary_index
>
> Currently, secondary index updates require a read-before-write to maintain index consistency. Keeping the index consistent at all times is not necessary, however. We could let the (secondary) index become inconsistent on writes and repair it on reads. This would be easy because, on reads, we make sure to request the indexed columns anyway, so we can just skip the rows that are not needed and repair the index at the same time.
> This does trade work on writes for work on reads. However, read-before-write is sufficiently costly that it will likely be a win overall.
> There are (at least) two small technical difficulties here though:
> # If we repair on read, this will be racy with writes, so we'll probably have to synchronize there.
> # We probably shouldn't rely only on reads to repair the index; we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low impact though.
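
The repair-on-read idea in the description can be sketched as follows, assuming the same kind of toy model: `rows` maps row keys to column dictionaries and `index` maps an indexed value to a set of row keys. The function name `indexed_read` and both data structures are hypothetical. The sketch is single-threaded, so it sidesteps the race with writes noted in the first difficulty.

```python
def indexed_read(rows, index, column, value):
    """Return keys whose row really has column == value, repairing stale index entries."""
    matches = []
    # Iterate over a sorted copy so the index set can be modified during the loop.
    for key in sorted(index.get(value, ())):
        row = rows.get(key, {})
        if row.get(column) == value:
            matches.append(key)
        else:
            # Stale entry: the row no longer holds this value, so skip it
            # and remove the entry from the index as part of the read.
            index[value].discard(key)
    return matches


rows = {"k1": {"state": "CA"}, "k2": {"state": "TX"}}
index = {"TX": {"k1", "k2"}, "CA": {"k1"}}   # ("TX", "k1") is stale
result = indexed_read(rows, index, "state", "TX")
```

The indexed column is fetched for every candidate row anyway, so the staleness check costs no extra read; the only added work is the index delete for entries that fail it.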

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira