Posted to commits@cassandra.apache.org by "Yuki Morishita (JIRA)" <ji...@apache.org> on 2014/02/01 00:46:09 UTC

[jira] [Commented] (CASSANDRA-6503) sstables from stalled repair sessions become live after a reboot and can resurrect deleted data

    [ https://issues.apache.org/jira/browse/CASSANDRA-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888321#comment-13888321 ] 

Yuki Morishita commented on CASSANDRA-6503:
-------------------------------------------

[~jasobrown] The Complete message exchange was actually fragile before, and could leave a streaming session in the WAIT_COMPLETE state on one side.

Only the pattern below worked, and it worked because we were sending Complete as we received the FileMessage, in the same thread.

{code}
(A) ---> File     ---> (B) ...1
(A) <--- Complete <--- (B) ...2
(A) ---> Complete ---> (B) ...3
{code}

But now finalizing all received files has moved to another thread. So the Complete from A (3) can arrive first, and B terminates its session without sending Complete back, leaving A in WAIT_COMPLETE. Thus we needed to make sure the Complete message is always sent.
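
To illustrate the ordering problem, here is a minimal toy model of side B only. It is a sketch under assumptions, not the actual streaming code: the class, the method names, and the `always_send_complete` flag are all hypothetical, and the messages B sends are just recorded in a list. It only shows why B has to reply with Complete regardless of which event happens first.

```python
from enum import Enum


class State(Enum):
    STREAMING = "STREAMING"
    WAIT_COMPLETE = "WAIT_COMPLETE"
    COMPLETE = "COMPLETE"


class ReceiverSession:
    """Toy model of side B. All names are hypothetical, not Cassandra's API."""

    def __init__(self, always_send_complete):
        self.always_send_complete = always_send_complete
        self.state = State.STREAMING
        self.sent = []  # messages B has sent to A

    def _send_complete(self):
        self.sent.append("Complete")

    def on_peer_complete(self):
        """A's Complete arrives at B."""
        if self.state is State.WAIT_COMPLETE:
            # B already sent its own Complete, so both sides are done.
            self.state = State.COMPLETE
        else:
            # B is still finalizing files; remember that A is done.
            self.state = State.WAIT_COMPLETE

    def on_files_finalized(self):
        """B finishes finalizing the received files."""
        if self.state is State.WAIT_COMPLETE:
            # A's Complete came first. If B closes silently here,
            # A never leaves WAIT_COMPLETE.
            if self.always_send_complete and "Complete" not in self.sent:
                self._send_complete()
            self.state = State.COMPLETE
        else:
            # Normal order: tell A we are done, then wait for its Complete.
            self._send_complete()
            self.state = State.WAIT_COMPLETE


def run(events, always_send_complete):
    """Apply the named events to a fresh session; return (state, sent)."""
    b = ReceiverSession(always_send_complete)
    for event in events:
        getattr(b, event)()
    return b.state, b.sent


old_order = ["on_files_finalized", "on_peer_complete"]  # steps 2 then 3
new_order = ["on_peer_complete", "on_files_finalized"]  # step 3 arrives first

for events, flag in [(old_order, False), (new_order, False), (new_order, True)]:
    state, sent = run(events, flag)
    print(events, "always_send_complete =", flag, "-> B sent", sent)
```

In this model B reaches COMPLETE in all three runs, but in the second run it has sent nothing, which corresponds to A being left in WAIT_COMPLETE. With the flag set, B sends Complete in either order.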


> sstables from stalled repair sessions become live after a reboot and can resurrect deleted data
> -----------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6503
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6503
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeremiah Jordan
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 1.2.14, 2.0.5
>
>         Attachments: 6503-2.0-followup.txt, 6503_2.0-v2.diff, 6503_2.0-v3.diff, 6503_c1.2-v1.patch
>
>
> The sstables streamed in during a repair session don't become active until the session finishes.  If something causes the repair session to hang, those sstables will stay on disk until the next reboot, and become active then.  If you don't reboot for 3 months, this can cause data to resurrect, as GC grace has expired, so tombstones for the data in those sstables may have already been collected.


