Posted to commits@cassandra.apache.org by "Yang Yang (JIRA)" <ji...@apache.org> on 2011/09/15 10:40:09 UTC

[jira] [Issue Comment Edited] (CASSANDRA-3085) Race condition in sstable reference counting

    [ https://issues.apache.org/jira/browse/CASSANDRA-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105212#comment-13105212 ] 

Yang Yang edited comment on CASSANDRA-3085 at 9/15/11 8:39 AM:
---------------------------------------------------------------

Thanks Jonathan.

But I still can't see why the old code would cause errors. Could you please check whether the following reasoning makes sense?



If you look at the +1 and -1 operations done by the read paths and the compaction path, either a read path or the compaction can be seen as the following sequence:

+1

//access the SSTableReader

-1


where, for the compaction, the "+1" happens at creation of the SSTableReader, and for read paths the "+1" happens at acquireReference().

Since every path (compaction or reader) does exactly one +1 and one -1, the ref count at any moment equals the number of code paths that are still live, i.e. that have done their +1 but not yet their -1.

If the file has been removed, the ref count must have reached 0, so the number of live paths at that moment must be 0. If no further paths run, it's all good. If some path does run later, it would access a file that has already been removed, and we would have a problem. But this is impossible: suppose the reader's +1 comes after compaction.release(). Since compaction.release() comes after the view change in DataTracker.replace(), the reader's +1 would also come after DataTracker.replace(). That cannot happen, because after the replace the reader can no longer see that SSTableReader in its view.
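
The counting model described above can be written down as a small sketch. This is only an illustration of the argument, not Cassandra code: CountedReader and RefCountModel are hypothetical names, and the initial count of 1 stands for the "+1" that the comment attributes to creation of the SSTableReader.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for SSTableReader; names are illustrative only.
class CountedReader {
    // Starts at 1: the "+1" taken when the reader is created (compaction's share).
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean fileDeleted = false;

    // Read path "+1".
    void acquireReference() {
        refs.incrementAndGet();
    }

    // "-1" for any path; the path that drops the count to 0 removes the file.
    void releaseReference() {
        if (refs.decrementAndGet() == 0) {
            fileDeleted = true; // stands in for deleting the file on disk
        }
    }

    int refCount() {
        return refs.get();
    }

    boolean isFileDeleted() {
        return fileDeleted;
    }
}

class RefCountModel {
    // Creation (+1), one read (+1 ... -1), then the compaction release (-1).
    static int[] trace() {
        CountedReader r = new CountedReader();
        int afterCreate = r.refCount();      // 1: only the creation reference
        r.acquireReference();
        int duringRead = r.refCount();       // 2: creation reference + one reader
        r.releaseReference();
        int afterRead = r.refCount();        // 1: reader finished
        r.releaseReference();
        int afterCompaction = r.refCount();  // 0: file is removed at this point
        return new int[] {afterCreate, duringRead, afterRead, afterCompaction};
    }
}
```

In this sequential trace the count always matches the number of live paths, which is the invariant the argument relies on. The sketch says nothing about what happens when a reader picks up its view before the replace and does its +1 afterwards.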



> Race condition in sstable reference counting
> --------------------------------------------
>
>                 Key: CASSANDRA-3085
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3085
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Critical
>             Fix For: 1.0.0
>
>         Attachments: 3085-v2.txt, 3085.txt
>
>
> DataTracker gives us an atomic View of memtable/sstables, but acquiring references is not atomic.  So it is possible to acquire references to an SSTableReader object that is no longer valid, as in this example:
> View V contains sstables {A, B}.  We attempt a read in thread T using this View.
> Meanwhile, A and B are compacted to {C}, yielding View W.  No references exist to A or B so they are cleaned up.
> Back in thread T we acquire references to A and B.  This does not cause an error, but it will when we attempt to read from them next.
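
The interleaving in the issue description can be replayed in a single thread so that the outcome is deterministic. This is a minimal sketch under stated assumptions, not the attached patch: Sstable and StaleViewRace are hypothetical names, and tryAcquire() is an assumed guarded variant that refuses a reference once the count has reached 0.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for SSTableReader, for illustration only.
class Sstable {
    final String name;
    private final AtomicInteger refs = new AtomicInteger(1); // +1 held since creation
    private volatile boolean deleted = false;

    Sstable(String name) {
        this.name = name;
    }

    // Unconditional +1: succeeds even after the file is gone.
    void acquire() {
        refs.incrementAndGet();
    }

    // Guarded +1 (an assumed alternative): refuses once the count has hit 0.
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    void release() {
        if (refs.decrementAndGet() == 0) {
            deleted = true; // stands in for file cleanup
        }
    }

    boolean isDeleted() {
        return deleted;
    }
}

class StaleViewRace {
    // Returns true if thread T ends up holding a reference to a deleted sstable.
    static boolean replay(boolean guarded) {
        Sstable a = new Sstable("A");
        Sstable b = new Sstable("B");
        List<Sstable> viewV = Arrays.asList(a, b); // T captures View V = {A, B}

        // Compaction publishes View W = {C} and drops the last references to A and B.
        a.release();
        b.release();

        // Back in T: acquire references using the stale View V.
        boolean holdsDeleted = false;
        for (Sstable s : viewV) {
            boolean acquired;
            if (guarded) {
                acquired = s.tryAcquire();
            } else {
                s.acquire();
                acquired = true;
            }
            if (acquired && s.isDeleted()) {
                holdsDeleted = true;
            }
        }
        return holdsDeleted;
    }
}
```

With the unconditional acquire, replay(false) reports that T holds references to cleaned-up sstables and no error is raised at acquire time, matching the description. With the guarded variant, replay(true) reports no such reference; a caller in that case would presumably fetch the current View and retry.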
