Posted to commits@cassandra.apache.org by "Brandon Williams (JIRA)" <ji...@apache.org> on 2011/04/18 20:59:06 UTC

[jira] [Created] (CASSANDRA-2496) Gossip should handle 'dead' states

Gossip should handle 'dead' states
----------------------------------

                 Key: CASSANDRA-2496
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Brandon Williams


For background, see CASSANDRA-2371

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment: 0008-only-handleStateRemoving-if-the-node-is-a-member.patch
                0007-Always-update-epstate-timestamps-when-the-node-is-al.patch

0007 handles problems when a node has been down longer than aVeryLongTime, and ensures that we advertise the new token states long enough.

0008 makes sure that SS only gets involved with removal if the token is a member.

> Gossip should handle 'dead' states
> ----------------------------------
>
>                 Key: CASSANDRA-2496
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>         Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 0005-drain-self-if-removetoken-d-elsewhere.patch.txt, 0006-acknowledge-unexpected-repl-fins.patch.txt, 0007-Always-update-epstate-timestamps-when-the-node-is-al.patch, 0008-only-handleStateRemoving-if-the-node-is-a-member.patch
>
>
> For background, see CASSANDRA-2371


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068725#comment-13068725 ] 

paul cannon commented on CASSANDRA-2496:
----------------------------------------

Ok, nodes do indeed infinitely retry the replication confirmation in some cases, but it appears it's not just when the former removal coordinator has restarted in the interim; it seems to be when the removetoken is reissued to another, new removal coordinator. In this case, I get this traceback every 10 seconds:

{noformat}
ERROR [MiscStage:9] 2011-07-20 23:42:06,599 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[MiscStage:9,5,main]
java.lang.AssertionError
        at org.apache.cassandra.service.StorageService.confirmReplication(StorageService.java:2088)
        at org.apache.cassandra.streaming.ReplicationFinishedVerbHandler.doVerb(ReplicationFinishedVerbHandler.java:38)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{noformat}

I'll look into this.

Second, it seems that moving/joining nodes can take over the removed token fine, once the removetoken is complete. I haven't tried having a node take over the removed token while the removal is ongoing; I assume we can just document that that probably isn't a great idea?


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment:     (was: 0001-Rework-token-removal-process.patch)


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

paul cannon updated CASSANDRA-2496:
-----------------------------------

    Attachment: 0005-drain-self-if-removetoken-d-elsewhere.patch.txt

0005-drain-self-if-removetoken-d-elsewhere.patch.txt: when node X was partitioned and removetoken'd but then shows up again, it should shut itself down rather than become a zombie.


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment: 0001-Rework-token-removal-process.txt


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

paul cannon updated CASSANDRA-2496:
-----------------------------------

    Attachment: 0006-acknowledge-unexpected-repl-fins.patch.txt

0006-acknowledge-unexpected-repl-fins.patch.txt (updated): also log at info when acknowledging the unexpected messages


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Fix Version/s: 0.8.3


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

paul cannon updated CASSANDRA-2496:
-----------------------------------

    Attachment: 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt

0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt: use REMOVED_TOKEN instead of STATUS_LEFT (would probably be ok either way, but otherwise, the REMOVED_TOKEN state would not be used). Seems this is more the way it was intended.


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2496:
--------------------------------------

    Reviewer: thepaul


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment:     (was: 0001-Rework-token-removal-process.txt)


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065531#comment-13065531 ] 

paul cannon commented on CASSANDRA-2496:
----------------------------------------

I'll see what I can do to test the "infinitely loop retrying the confirmation" and "bootstrapping/moving nodes can take over these dead tokens" situations.


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

paul cannon updated CASSANDRA-2496:
-----------------------------------

    Attachment:     (was: 0006-acknowledge-unexpected-repl-fins.patch.txt)


[jira] [Assigned] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams reassigned CASSANDRA-2496:
-------------------------------------------

    Assignee: Brandon Williams


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment:     (was: 0002-add-2115-back.patch)


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

paul cannon updated CASSANDRA-2496:
-----------------------------------

    Attachment: 0003-update-gossip-related-comments.patch.txt

These small patches build on the others.

0003-update-gossip-related-comments.patch.txt: updates gossip-related comments derp derp.


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment:     (was: 0001-Rework-token-removal-process.txt)


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042254#comment-13042254 ] 

Brandon Williams commented on CASSANDRA-2496:
---------------------------------------------

Some explanation of what changed and why it was necessary:

Consider nodes A through D. D is partitioned, and C is dead and needs to be removed. A removetoken will be issued to A for this.

In the current way we do things, A will modify its own state by appending information to its status indicating that it will be removing C. B will see this, re-replicate as needed, then report to A that it is done. The problem, however, is that since A modified its own state, A is also free to wipe that state out, either by restarting or by simply removing another token, because there's only space for one. If A reboots and then D's partition heals, D will never know C was removed. Worse, it will still have state for C that neither A nor B do, and so it will repopulate the ring with C again.

This patch changes this by instead having A sleep for RING_DELAY to make sure the generation for C is stable, and then it modifies C's state to indicate it is being removed, just as if C itself had done this. It also appends some extra state to indicate that A will be the removal coordinator. The other nodes see this, re-replicate and report back to A, which then modifies C's state once more to indicate it is completely removed. At this point, it doesn't matter if A dies completely and D's partition heals, since the state is stored in C's gossip information. If A reboots, it will be able to get the correct state information from B, or any other node.

If A fails while the other nodes are re-replicating, a new removetoken can be started elsewhere. If other replicas being down prevents removetoken from completing, a removetoken force will remove the node, and repair can then be run to restore the replica count.
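
The scenario above can be modelled with a small toy gossip simulation. This is only an illustrative sketch in Python, not Cassandra's Java implementation: the EndpointState fields, the status labels ("normal", "removing", "removed") and the Node methods are all hypothetical names, and the RING_DELAY wait is left out. It assumes gossip keeps whichever copy of an endpoint's state has the higher (generation, version) pair, and shows that once the removal is recorded in C's entry, D still learns about it after A has restarted.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EndpointState:
    generation: int                    # bumped when the endpoint itself restarts
    version: int                       # bumped on each change within a generation
    status: str                        # hypothetical labels: normal / removing / removed
    coordinator: Optional[str] = None  # who coordinates the removal, if anyone


def newer(a: EndpointState, b: EndpointState) -> EndpointState:
    """Keep whichever copy has the higher (generation, version) pair."""
    return a if (a.generation, a.version) >= (b.generation, b.version) else b


class Node:
    def __init__(self, name: str, ring: List[str]) -> None:
        self.name = name
        self.view: Dict[str, EndpointState] = {
            ep: EndpointState(1, 0, "normal") for ep in ring
        }

    def set_state(self, endpoint: str, status: str,
                  coordinator: Optional[str] = None) -> None:
        """Modify the stored state of any endpoint, not only our own."""
        old = self.view[endpoint]
        self.view[endpoint] = replace(
            old, version=old.version + 1, status=status, coordinator=coordinator
        )

    def gossip_with(self, other: "Node") -> None:
        """Exchange views; both sides keep the newer copy per endpoint."""
        for ep in set(self.view) | set(other.view):
            mine, theirs = self.view.get(ep), other.view.get(ep)
            if mine is None:
                best = theirs
            elif theirs is None:
                best = mine
            else:
                best = newer(mine, theirs)
            self.view[ep] = other.view[ep] = best

    def restart(self) -> None:
        """Lose all gossip state and come back knowing only ourselves."""
        own = self.view[self.name]
        self.view = {self.name: EndpointState(own.generation + 1, 0, "normal")}


def run_scenario() -> Tuple[str, str]:
    ring = ["A", "B", "C", "D"]
    a, b, d = Node("A", ring), Node("B", ring), Node("D", ring)

    # A coordinates the removal by writing into C's state, as if C had done it.
    a.set_state("C", "removing", coordinator="A")
    a.gossip_with(b)   # B sees this (and, in reality, re-replicates)
    a.set_state("C", "removed", coordinator="A")
    a.gossip_with(b)

    a.restart()        # A forgets everything it knew
    d.gossip_with(b)   # D's partition heals; it only talks to B
    a.gossip_with(b)   # the restarted A recovers the state from B as well
    return d.view["C"].status, a.view["C"].status
```

Calling run_scenario() returns the status that D and the restarted A end up holding for C. In this model both learn the removal from B alone, because the information lives in C's entry rather than in A's.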


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment: 0001-Rework-token-removal-process.txt


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069202#comment-13069202 ] 

paul cannon commented on CASSANDRA-2496:
----------------------------------------

ok, +1 with these patches.


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment: 0002-add-2115-back.txt
                0001-Rework-token-removal-process.txt


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070710#comment-13070710 ] 

Hudson commented on CASSANDRA-2496:
-----------------------------------

Integrated in Cassandra-0.8 #238 (See [https://builds.apache.org/job/Cassandra-0.8/238/])
    Gossip handles dead states, token removal actually works, gossip states
are held for aVeryLongTime.
Patch by brandonwilliams and Paul Cannon, reviewed by Paul Cannon for
CASSANDRA-2496.

brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1150847
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/gms/Gossiper.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/gms/VersionedValue.java
* /cassandra/branches/cassandra-0.8/NEWS.txt
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/gms/HeartBeatState.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/gms/ApplicationState.java



[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

paul cannon updated CASSANDRA-2496:
-----------------------------------

    Attachment: 0006-acknowledge-unexpected-repl-fins.patch.txt

0006-acknowledge-unexpected-repl-fins.patch.txt: don't assert and drop the message when we see an unexpected REPLICATION_FINISHED. Ack it instead, so the sender doesn't continually retry.
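
A minimal sketch of the behavioural change in this patch, written in Python rather than Cassandra's Java and using hypothetical names throughout (RemovalCoordinator, on_replication_finished, confirm_until_acked, and the "ACK" return value). It assumes the sender keeps retrying until it receives an acknowledgement, so a coordinator that does not recognise a confirmation logs at info and acknowledges it instead of asserting.

```python
import logging

logger = logging.getLogger("removal_sketch")


class RemovalCoordinator:
    def __init__(self):
        # Replicas whose confirmation we still expect; empty after a restart.
        self.pending = set()

    def start_removal(self, replicas):
        self.pending = set(replicas)

    def on_replication_finished(self, sender):
        """Handle a REPLICATION_FINISHED message and always acknowledge it."""
        if sender in self.pending:
            self.pending.discard(sender)
        else:
            # Previously this case asserted and the message was dropped,
            # which left the sender retrying indefinitely.
            logger.info(
                "unexpected REPLICATION_FINISHED from %s; acknowledging", sender
            )
        return "ACK"


def confirm_until_acked(coordinator, sender, max_attempts=5):
    """Return the number of attempts needed, or None if never acknowledged."""
    for attempt in range(1, max_attempts + 1):
        if coordinator.on_replication_finished(sender) == "ACK":
            return attempt
    return None
```

With a coordinator that has no record of the removal (for example, a freshly constructed RemovalCoordinator), confirm_until_acked stops after the first attempt rather than looping.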


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "paul cannon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070663#comment-13070663 ] 

paul cannon commented on CASSANDRA-2496:
----------------------------------------

+1


[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2496:
----------------------------------------

    Attachment: 0002-add-2115-back.patch
                0001-Rework-token-removal-process.patch

The first patch allows gossip to track dead states and completely changes how removetoken works, since removetoken is the obstacle to keeping dead state around and is currently broken in many scenarios. The second patch adds the previously reverted CASSANDRA-2115 back now that removetoken is more resilient.


[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065409#comment-13065409 ] 

Brandon Williams commented on CASSANDRA-2496:
---------------------------------------------

I see two more things to be done with this patch.  First, when re-replicating nodes report back to the removal coordinator, if the coordinator has restarted it won't understand them, and they will infinitely loop retrying the confirmation.  Second, since we're holding dead states, we need to make sure that bootstrapping/moving nodes can take over these dead tokens.
