You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "techlabs (JIRA)" <ji...@apache.org> on 2011/03/23 23:05:05 UTC

[jira] [Created] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Removed/Dead Node keeps reappearing
-----------------------------------

                 Key: CASSANDRA-2371
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
             Project: Cassandra
          Issue Type: Bug
          Components: Tools
    Affects Versions: 0.7.4, 0.7.3
         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
            Reporter: techlabs
            Priority: Minor


The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.

Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 

INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
 INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
 INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
 INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
 INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
 INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
 INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63


 INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
 INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
 INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
 INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
 INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
 INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
 INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
 INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
 INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
 INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
 WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
 INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
 INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
 INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
 INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
 INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Jeremy Hanna (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018057#comment-13018057 ] 

Jeremy Hanna commented on CASSANDRA-2371:
-----------------------------------------

I applied the patch and within a few days even with no restarting of any of the Cassandra nodes, the removed token came back. Just FYI.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2371:
----------------------------------------

    Attachment: 2371.txt

My guess is what is happening is that after aVeryLongTime when the state is evicted, another gossip round occurs and the state is repopulated.  Patch to re-quarantine on eviction to avoid this.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Eric Evans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Evans updated CASSANDRA-2371:
----------------------------------

    Attachment: v1-0001-CASSANDRA-2371-baseline-driver-versions-at-1.0.0.txt

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt, v1-0001-CASSANDRA-2371-baseline-driver-versions-at-1.0.0.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "techlabs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010519#comment-13010519 ] 

techlabs commented on CASSANDRA-2371:
-------------------------------------

This looks like it's changing the source code. Can we deploy this on a live cluster? 

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "techlabs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014736#comment-13014736 ] 

techlabs commented on CASSANDRA-2371:
-------------------------------------

Actually the issue was resolved when we did a restart of the cluster, and ran the nodetool removetoken command when only a portion of the nodes had started up. The main reason I didn't apply the patch yet, is merely because I need to learn how to patch the code and integrate the new build first... Thank you!

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Eric Evans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Evans updated CASSANDRA-2371:
----------------------------------

    Attachment:     (was: v1-0001-CASSANDRA-2371-baseline-driver-versions-at-1.0.0.txt)

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016627#comment-13016627 ] 

Brandon Williams commented on CASSANDRA-2371:
---------------------------------------------

There is a second problem here.  We don't populate the gossiper's application state to LEFT for removetoken, only for decommission.  The problem with adding it, however, is that SS.handleStateLeft gets called every gossip round and runs through the hint removal process.  One option may be to check if the node is locally persisted and if not, just ignore the message since we never knew about it anyway.  Another is to just not remove hints when we see the LEFT state, because they'll expire anyway and large unaccessed rows aren't a problem, so this seems like a throwback from the <=0.6 days.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014389#comment-13014389 ] 

Jonathan Ellis edited comment on CASSANDRA-2371 at 4/1/11 4:19 AM:
-------------------------------------------------------------------

Did you get a chance to try a patched build?

      was (Author: jbellis):
    Did you git a chance to try a patched build?
  
> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2371:
--------------------------------------

    Fix Version/s: 0.7.5
         Assignee: Brandon Williams

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams resolved CASSANDRA-2371.
-----------------------------------------

    Resolution: Fixed

Reverted CASSANDRA-2115 and created CASSANDRA-2496 to fix this correctly.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014389#comment-13014389 ] 

Jonathan Ellis commented on CASSANDRA-2371:
-------------------------------------------

Did you git a chance to try a patched build?

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018412#comment-13018412 ] 

Jonathan Ellis commented on CASSANDRA-2371:
-------------------------------------------

bq. The problem with adding it, however, is that SS.handleStateLeft gets called every gossip round and runs through the hint removal process

But it only runs hint removal once per node, right?  Or is "onChange" not an accurate method name?

bq. One option may be to check if the node is locally persisted and if not, just ignore the message since we never knew about it anyway.

Locally persisted... with a token in SystemTable?  Ignore the ... hint message?

bq. Another is to just not remove hints when we see the LEFT state

We should issue a delete to the hints row, but we should not force a major compaction. The rule of thumb is, avoiding inflicting a performance hit on the cluster trumps immediate disk space cleanup.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020136#comment-13020136 ] 

Jonathan Ellis commented on CASSANDRA-2371:
-------------------------------------------

Sounds reasonable.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020080#comment-13020080 ] 

Brandon Williams commented on CASSANDRA-2371:
---------------------------------------------

It looks like the bad news here is that gossip makes many assumptions about a node being alive when it hears about it, and has no provisions for keeping removed node state.  This is bad, but highly exacerbated by keeping the removed node state for 3 days instead of 30s.  I think the best solution right now is to revert CASSANDRA-2115 until we can overhaul gossip to handle this.  The bug will still exist as it always has, but the window to trigger it is far shorter.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021204#comment-13021204 ] 

Hudson commented on CASSANDRA-2371:
-----------------------------------

Integrated in Cassandra-0.7 #440 (See [https://hudson.apache.org/hudson/job/Cassandra-0.7/440/])
    Revert "Keep endpoint state until aVeryLongTime when not a fat client"

This reverts commit db9164ffd96ebc7752fc5789c90c7211ba323ad2.

For CASSANDRA-2371.


> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back into the ring, even after the EC2 instance is no longer in existence. Originally I tried to add a new node 10.214.103.224 with the same token, but there were some complications with that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html 
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira