Posted to commits@cassandra.apache.org by "Sandeep Tata (JIRA)" <ji...@apache.org> on 2009/04/24 00:10:30 UTC

[jira] Created: (CASSANDRA-98) Reads (get_column) miss data or return stale values if a memtable is being flushed

Reads (get_column) miss data or return stale values if a memtable is being flushed
----------------------------------------------------------------------------------

                 Key: CASSANDRA-98
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-98
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: trunk
         Environment: all
            Reporter: Sandeep Tata
            Assignee: Sandeep Tata
             Fix For: trunk


Reads can return missing values (null/exception) or find stale copies of a column if the read happens while a memtable is being flushed to an SSTable.

A get_column call may not find the data in the current memtable. It then looks in the "historical" memtable, but if that CF has already been flushed, it has already been cleared from the historical memtable, even though the new SSTable is not ready yet. As a result, the read looks for the column in older SSTables and finds a stale value (if one exists) or returns null.
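
The interleaving described above can be replayed deterministically in a small single-threaded sketch. This is not the Cassandra code: the class, the field names, and the "cf:column" key format are all hypothetical, and the three stores are modelled as plain maps only to show the order in which the read path consults them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the read path; none of these names come from Cassandra.
public class FlushRaceSketch {
    final Map<String, String> currentMemtable = new HashMap<>();
    final Map<String, String> historicalMemtable = new HashMap<>();
    final List<Map<String, String>> sstables = new ArrayList<>(); // newest last

    // Read order: current memtable, historical memtable, then SSTables newest first.
    String getColumn(String key) {
        if (currentMemtable.containsKey(key)) return currentMemtable.get(key);
        if (historicalMemtable.containsKey(key)) return historicalMemtable.get(key);
        for (int i = sstables.size() - 1; i >= 0; i--) {
            if (sstables.get(i).containsKey(key)) return sstables.get(i).get(key);
        }
        return null;
    }

    // Returns { value read during the flush, value read after the flush }.
    static String[] simulate(boolean olderSSTableExists) {
        FlushRaceSketch store = new FlushRaceSketch();
        if (olderSSTableExists) {
            Map<String, String> older = new HashMap<>();
            older.put("cf1:name", "old");
            store.sstables.add(older);
        }
        // The latest writes sit in the historical memtable, waiting to be flushed.
        store.historicalMemtable.put("cf1:name", "new");
        store.historicalMemtable.put("cf2:age", "42");

        // Flush step 1: cf1 is written out and cleared from the memtable right away,
        // but the SSTable being built is not yet visible to readers.
        Map<String, String> pending = new HashMap<>();
        pending.put("cf1:name", store.historicalMemtable.remove("cf1:name"));

        // A read that lands in this window misses the new value.
        String duringFlush = store.getColumn("cf1:name");

        // Flush step 2: the remaining CF is written and the SSTable becomes readable.
        pending.put("cf2:age", store.historicalMemtable.remove("cf2:age"));
        store.sstables.add(pending);
        String afterFlush = store.getColumn("cf1:name");

        return new String[] { duringFlush, afterFlush };
    }

    public static void main(String[] args) {
        String[] stale = simulate(true);
        String[] missing = simulate(false);
        System.out.println(stale[0] + " then " + stale[1]);     // old then new
        System.out.println(missing[0] + " then " + missing[1]); // null then new
    }
}
```

In the sketch, the read in the middle of the flush returns "old" when an older SSTable holds the column and null when none does; the read after the flush returns "new" in both cases.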

It can be tricky to reproduce this problem, but the reason is pretty easy to see.

While subsequent reads might return the correct value (from disk), this behavior makes things very difficult for apps that expect to "read your writes", at least in the absence of failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-98) Reads (get_column) miss data or return stale values if a memtable is being flushed

Posted by "Sandeep Tata (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-98?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Tata updated CASSANDRA-98:
----------------------------------

    Attachment: CASSANDRA-98.patch

Simple solution: Don't call columnFamily.clear() until the entire "historical" memtable has been flushed. This way, you're not stuck in a state where you can't find the data in the memtables and the SSTable is not ready yet.

The impact of this change is that memory cannot be freed incrementally while a flush is in progress; it is released only once the whole "historical" memtable has been flushed. This is an insignificant penalty.
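
As a rough illustration of the ordering this change relies on, here is a minimal single-threaded sketch. It does not reproduce the attached patch: the class, the field names, and the "cf:column" key format are hypothetical, and the memtables and SSTables are modelled as plain maps. The only point is that the historical memtable stays readable until the new SSTable is visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a flush that defers clearing; not the actual patch.
public class DeferredClearSketch {
    final Map<String, String> currentMemtable = new HashMap<>();
    final Map<String, String> historicalMemtable = new HashMap<>();
    final List<Map<String, String>> sstables = new ArrayList<>(); // newest last

    // Read order: current memtable, historical memtable, then SSTables newest first.
    String getColumn(String key) {
        if (currentMemtable.containsKey(key)) return currentMemtable.get(key);
        if (historicalMemtable.containsKey(key)) return historicalMemtable.get(key);
        for (int i = sstables.size() - 1; i >= 0; i--) {
            if (sstables.get(i).containsKey(key)) return sstables.get(i).get(key);
        }
        return null;
    }

    // Returns { value read during the flush, value read after the flush }.
    static String[] simulate() {
        DeferredClearSketch store = new DeferredClearSketch();
        Map<String, String> older = new HashMap<>();
        older.put("cf1:name", "old");
        store.sstables.add(older);
        store.historicalMemtable.put("cf1:name", "new");
        store.historicalMemtable.put("cf2:age", "42");

        // Flush step 1: cf1 is copied to the SSTable being built, but NOT cleared.
        Map<String, String> pending = new HashMap<>();
        pending.put("cf1:name", store.historicalMemtable.get("cf1:name"));

        // A read in this window is still served by the historical memtable.
        String duringFlush = store.getColumn("cf1:name");

        // Flush step 2: copy the rest, make the SSTable readable, and only then
        // release the whole historical memtable in one go.
        pending.put("cf2:age", store.historicalMemtable.get("cf2:age"));
        store.sstables.add(pending);
        store.historicalMemtable.clear();
        String afterFlush = store.getColumn("cf1:name");

        return new String[] { duringFlush, afterFlush };
    }

    public static void main(String[] args) {
        String[] result = simulate();
        System.out.println(result[0] + " then " + result[1]); // new then new
    }
}
```

The cost shows up in the sketch as well: both column families stay in memory until the final clear() call.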

