Posted to commits@cassandra.apache.org by "Ryan McGuire (JIRA)" <ji...@apache.org> on 2013/12/18 19:16:09 UTC

[jira] [Comment Edited] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0

    [ https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851971#comment-13851971 ] 

Ryan McGuire edited comment on CASSANDRA-6053 at 12/18/13 6:15 PM:
-------------------------------------------------------------------

A first attempt appears to work correctly on cassandra-2.0 HEAD and 1.2.9:

{code}
12:53 PM:~$ ccm create -v git:cassandra-1.2.9 t
Fetching Cassandra updates...
Current cluster is now: t
12:53 PM:~$ ccm populate -n 5
12:54 PM:~$ ccm start
12:54 PM:~$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency/95th/99th,elapsed_time
24994,2499,2499,9.5,55.2,179.0,10
103123,7812,7812,2.8,27.2,134.7,20
236358,13323,13323,1.7,15.4,134.7,30
329477,9311,9311,1.7,9.8,109.8,40
405667,7619,7619,1.8,9.2,6591.9,50
558989,15332,15332,1.5,6.6,6591.1,60
^C12:55 PM:~$ ccm node1 cqlsh
Connected to t at 127.0.0.1:9160.
[cqlsh 3.1.7 | Cassandra 1.2.9-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.36.0]
Use HELP for help.
cqlsh> select peer from system.peers;

 peer
-----------
 127.0.0.3
 127.0.0.2
 127.0.0.5
 127.0.0.4

cqlsh>
12:55 PM:~$ ccm node2 decommission
12:57 PM:~$ ccm node1 cqlsh
Connected to t at 127.0.0.1:9160.
[cqlsh 3.1.7 | Cassandra 1.2.9-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.36.0]
Use HELP for help.
cqlsh> select peer from system.peers;

 peer
-----------
 127.0.0.3
 127.0.0.5
 127.0.0.4

cqlsh>
12:58 PM:~$
{code}

All nodes show the same peers table.
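
To cross-check that every remaining node agrees without opening cqlsh on each one, something like the following python-driver script could be used. This is only a sketch: it assumes the ccm loopback cluster above (127.0.0.1-127.0.0.5, node2 decommissioned) with the native transport enabled, and it pins each query with WhiteListRoundRobinPolicy so each node's own local system.peers is what gets read.

{code}
# Sketch: compare system.peers across the remaining nodes of the ccm cluster.
# Assumes loopback addresses 127.0.0.1-.5, node2 (127.0.0.2) decommissioned,
# and the native transport enabled on each node.
from cassandra.cluster import Cluster
from cassandra.policies import WhiteListRoundRobinPolicy

remaining = ['127.0.0.1', '127.0.0.3', '127.0.0.4', '127.0.0.5']

views = {}
for ip in remaining:
    # Whitelist a single host so the query is coordinated by that node and we
    # read *its* local system.peers, not some other node's.
    cluster = Cluster([ip], load_balancing_policy=WhiteListRoundRobinPolicy([ip]))
    session = cluster.connect()
    views[ip] = sorted(str(r.peer) for r in session.execute("SELECT peer FROM system.peers"))
    cluster.shutdown()

for ip, peers in sorted(views.items()):
    print("%s %s" % (ip, peers))

# Each node should list exactly the other three and never 127.0.0.2.
assert all(peers == sorted(set(remaining) - {ip}) for ip, peers in views.items())
{code}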


> system.peers table not updated after decommissioning nodes in C* 2.0
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-6053
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6053
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Datastax AMI running EC2 m1.xlarge instances
>            Reporter: Guyon Moree
>            Assignee: Brandon Williams
>         Attachments: peers
>
>
> After decommissioning my cluster from 20 to 9 nodes using opscenter, I found all but one of the nodes had incorrect system.peers tables.
> This became a problem (afaik) when using the python-driver, since it queries the peers table to set up its connection pool, resulting in very slow startup times because of timeouts.
> The output of nodetool didn't seem to be affected. After removing the incorrect entries from the peers tables, the connection issues seem to have disappeared for us. 
> I'd like some feedback on whether this was the right way to handle the issue, or whether I'm still left with a broken cluster.
> Attached is the output of nodetool status, which shows the correct 9 nodes. Below that is the output of the system.peers tables on the individual nodes.
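
For reference, the manual cleanup described in the report (removing the stale rows from each node's system.peers) would look roughly like this with the python-driver. The addresses are placeholders, and this is only a sketch of the workaround the reporter applied, not a statement that hand-editing system.peers is the supported fix:

{code}
# Sketch of the manual workaround: delete rows for decommissioned hosts from
# system.peers on every remaining node. system.peers is node-local, so each
# node has to be cleaned individually. Addresses below are placeholders.
from cassandra.cluster import Cluster
from cassandra.policies import WhiteListRoundRobinPolicy

live_nodes  = ['10.0.0.1', '10.0.0.2']     # the 9 nodes nodetool status still shows
stale_peers = ['10.0.0.21', '10.0.0.22']   # decommissioned nodes lingering in system.peers

for ip in live_nodes:
    # Pin the session to one node so the DELETE runs against that node's
    # local system.peers table.
    cluster = Cluster([ip], load_balancing_policy=WhiteListRoundRobinPolicy([ip]))
    session = cluster.connect()
    for peer in stale_peers:
        session.execute("DELETE FROM system.peers WHERE peer = %s", [peer])
    cluster.shutdown()
{code}

Pinning with WhiteListRoundRobinPolicy also keeps the driver from trying to open connection pools to the stale hosts it discovers while the cleanup runs, which is the slow-startup symptom described above.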


