Posted to user@cassandra.apache.org by Tim Dunphy <bl...@gmail.com> on 2014/05/25 15:20:35 UTC

decommissioning a node

Hey all,

I'm attempting to decommission a node I want to remove.

First I get a status of the ring

[root@beta-new:~] #nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns   Host ID                               Rack
UN  10.10.1.94  197.37 KB  256     49.4%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
UN  10.10.1.18  216.95 KB  256     50.6%  f2a48fc7-a362-43f5-9061-4bb3739fdeaf  rack


I see that the node I want to remove is up. Though I believe UN means up, I
don't know what it stands for.


[root@beta-new:~] #nodetool -host  10.10.1.18 decommission

Failed to connect to ' 10.10.1.18 : Connection timed out

The connection to the node I want to decommission times out. :(

I'm running this command from the seed node, and while I do see port 7199
active and listening there, I do NOT see this port active and listening on
the node that I want to decommission.


Seed node:

[root@beta-new:~] #lsof -i :7199

COMMAND   PID USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME

java    15331 root   51u  IPv4 566368606      0t0  TCP *:7199 (LISTEN)


[root@beta:/etc/alternatives/cassandrahome] #lsof -i :7199

[root@beta:/etc/alternatives/cassandrahome] #


However cassandra does seem to be running on the node I want to
decommission in addition to it being shown as UN by nodetool status:


[root@beta:/etc/alternatives/cassandrahome] #netstat -tulpn | grep -i listen | grep java
tcp        0      0 0.0.0.0:46755             0.0.0.0:*       LISTEN      23039/java
tcp        0      0 10.10.1.18:9160           0.0.0.0:*       LISTEN      23039/java
tcp        0      0 0.0.0.0:42990             0.0.0.0:*       LISTEN      23039/java
tcp        0      0 10.10.1.18:8081           0.0.0.0:*       LISTEN      23039/java
tcp        0      0 10.10.1.18:9042           0.0.0.0:*       LISTEN      23039/java
tcp        0      0 10.10.1.18:7000           0.0.0.0:*       LISTEN      23039/java
tcp        0      0 0.0.0.0:7198              0.0.0.0:*       LISTEN      23039/java


So why do you think my seed is listening on port 7199 but the node I want
to get rid of is not? And how can I accomplish my goal of deleting the
unwanted node?
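
For anyone repeating this check, the listening-port comparison can be scripted instead of read by eye. The sketch below assumes netstat-style lines with the local address in column 4 and the state in column 6; `is_listening` is a hypothetical helper name, and the sample lines are copied from the netstat output in this message. In that sample 7198 is listening while 7199 is not, which may or may not mean JMX was configured on a different port; that is only a guess.

```shell
#!/usr/bin/env bash
# is_listening PORT: read "netstat -tln"-style lines on stdin and succeed
# if any LISTEN socket has a local address ending in ":PORT".
# (Hypothetical helper, not part of Cassandra or nodetool.)
is_listening() {
  local port="$1"
  awk -v port="$port" '
    # Column 4 is the local address (e.g. 0.0.0.0:7198); column 6 is the state.
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Sample lines taken from the netstat output in this message.
sample='tcp        0      0 10.10.1.18:7000           0.0.0.0:*     LISTEN      23039/java
tcp        0      0 0.0.0.0:7198              0.0.0.0:*     LISTEN      23039/java'

for p in 7199 7198; do
  if printf '%s\n' "$sample" | is_listening "$p"; then
    echo "port $p: listening"
  else
    echo "port $p: not listening"
  fi
done
```

On a live host the same function could be fed directly, for example `netstat -tln | is_listening 7199`.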


Thanks

Tim



-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

Re: decommissioning a node

Posted by Tim Dunphy <bl...@gmail.com>.
>
> In case the above doesn't work, another thing to be aware of is that JMX
> uses 2 different ports. The initial connection to 7199 causes a second port
> to be opened, which is normally assigned randomly to an available and
> otherwise unused port above 1024. If your server has a
> firewall/ACL/Security Group configured, it probably doesn't expose the port.
>
> Assuming you are on Java 7 (u4 or later), you can add the following to
> your environment:
> JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.rmi.port=7200"
>
> to force a fixed port (7200 in this case) for the second one, which is
> more firewall friendly. You'll still need to be sure access to port 7200 is
> allowed, but at least it's only a single port.
>
> Hope that helps,
>

Hi Steven,

Thank you for that. I'll give this a try. Right now I am down to one
Cassandra node, with one node reported as down in nodetool status. Later
this week I'll probably try to revive another Cassandra node so that I can
get rid of the lingering dead one.

But in the meantime, as an FYI, I am currently on Java 8, so this should
hopefully work.

java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0-b132)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b70, mixed mode)




Thanks
Tim


On Tue, May 27, 2014 at 12:04 PM, Steven A Robenalt
<sr...@stanford.edu>wrote:

> Hi Tim,
>
> In case the above doesn't work, another thing to be aware of is that JMX
> uses 2 different ports. The initial connection to 7199 causes a second port
> to be opened, which is normally assigned randomly to an available and
> otherwise unused port above 1024. If your server has a
> firewall/ACL/Security Group configured, it probably doesn't expose the port.
>
> Assuming you are on Java 7 (u4 or later), you can add the following to
> your environment:
> JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.rmi.port=7200"
>
> to force a fixed port (7200 in this case) for the second one, which is
> more firewall friendly. You'll still need to be sure access to port 7200 is
> allowed, but at least it's only a single port.
>
> Hope that helps,
>
> Steve

Re: decommissioning a node

Posted by Steven A Robenalt <sr...@stanford.edu>.
Hi Tim,

In case the above doesn't work, another thing to be aware of is that JMX
uses 2 different ports. The initial connection to 7199 causes a second port
to be opened, which is normally assigned randomly to an available and
otherwise unused port above 1024. If your server has a
firewall/ACL/Security Group configured, it probably doesn't expose the port.

Assuming you are on Java 7 (u4 or later), you can add the following to your
environment:
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.rmi.port=7200"

to force a fixed port (7200 in this case) for the second one, which is more
firewall friendly. You'll still need to be sure access to port 7200 is
allowed, but at least it's only a single port.
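
A minimal sketch of the two-port setup described above, written as a standalone script rather than as an edit to the real cassandra-env.sh. `RMI_PORT` and `port_reachable` are hypothetical names introduced here for illustration; the two `-D` properties are the ones already mentioned in this thread. The reachability helper relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, both of which are assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch only: RMI_PORT is a hypothetical variable name, not part of the
# stock cassandra-env.sh. In the real file only the rmi.port line would be new.
JMX_PORT="${JMX_PORT:-7199}"   # port nodetool connects to first
RMI_PORT="${RMI_PORT:-7200}"   # fixed port for the second JMX/RMI connection

JVM_OPTS="${JVM_OPTS:-}"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.rmi.port=$RMI_PORT"

# Print one option per line to confirm what the JVM would be started with
# (word splitting of $JVM_OPTS is intentional here).
printf '%s\n' $JVM_OPTS

# port_reachable HOST PORT: succeed if a TCP connection can be opened
# within 3 seconds. Assumes bash was built with /dev/tcp support.
port_reachable() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
# Example, run from the seed node: both ports have to pass the firewall.
#   port_reachable 10.10.1.18 7199 && port_reachable 10.10.1.18 7200 && echo ok
```

Checking both ports from the machine that runs nodetool is the point of the helper: a successful connection to 7199 alone does not show that the second port is open.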

Hope that helps,

Steve



On Sun, May 25, 2014 at 7:48 AM, Tim Dunphy <bl...@gmail.com> wrote:

> ok I copied the cassandra.env from the host that had cassandra listening
> on port 7199 to the node that wasn't.
>
> That got it listening on the JMX port:
>
> [root@beta:~] #lsof -i :7199
> COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
> java    9197 root   45u  IPv4 6411278      0t0  TCP *:7199 (LISTEN)
>
> But even tho I can telnet to that port from the seed node:
>
> [root@beta-new:~] #telnet  10.10.1.18 7199
> Trying 166.78.27.18...
> Connected to 166.78.27.18.
> Escape character is '^]'.
>
>
> I still get connection refused when trying to decommission the node:
>
> [root@beta-new:~] #nodetool -host 10.10.1.18 decommission
> Failed to connect to '166.78.27.18:7199': Connection refused
>
> To Colin, thanks for the information!
>
> nodetool decomission <host-id-of-node-to-decomission>
>
> UN means UP, NORMAL
>
> Oh and thanks I tried that and it seems to be working!
>
> [root@beta-new:~] #nodetool decommission
> f2a48fc7-a362-43f5-9061-4bb3739fdeaf
> Decommission will decommission the node you are connected to and does not
> take arguments!
>
> Sorry guys, wrote this email in a hurry as we're checking out of a hotel
> room currently. :) I'll let you know if this does work.


-- 
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063

srobenal@stanford.edu
http://highwire.stanford.edu

Re: decommissioning a node

Posted by Tim Dunphy <bl...@gmail.com>.
OK, I copied cassandra-env.sh from the host that had Cassandra listening on
port 7199 to the node that wasn't.

That got it listening on the JMX port:

[root@beta:~] #lsof -i :7199
COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
java    9197 root   45u  IPv4 6411278      0t0  TCP *:7199 (LISTEN)

But even though I can telnet to that port from the seed node:

[root@beta-new:~] #telnet  10.10.1.18 7199
Trying 166.78.27.18...
Connected to 166.78.27.18.
Escape character is '^]'.


I still get connection refused when trying to decommission the node:

[root@beta-new:~] #nodetool -host 10.10.1.18 decommission
Failed to connect to '166.78.27.18:7199': Connection refused

To Colin, thanks for the information!

> nodetool decommission <host-id-of-node-to-decommission>
>
> UN means UP, NORMAL

Oh and thanks I tried that and it seems to be working!

[root@beta-new:~] #nodetool decommission f2a48fc7-a362-43f5-9061-4bb3739fdeaf
Decommission will decommission the node you are connected to and does not take arguments!

Sorry guys, wrote this email in a hurry as we're checking out of a hotel
room currently. :) I'll let you know if this does work.
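
As the error above says, decommission acts on the node that nodetool is connected to and takes no Host ID. For a node that is already down, nodetool has a separate `removenode` command that does take a Host ID. The sketch below is one way to pick between the two from `nodetool status` output; `suggest_removal` is a hypothetical helper, it assumes one unwrapped line per node, and the sample marks 10.10.1.18 as DN purely for illustration (in the status output posted in this thread it was UN).

```shell
#!/usr/bin/env bash
# suggest_removal ADDRESS: read "nodetool status" output on stdin (one node
# per line, not wrapped) and print the command that would apply to that node.
# (Hypothetical helper; it only prints a suggestion and runs nothing.)
suggest_removal() {
  local addr="$1"
  awk -v addr="$addr" '
    $2 == addr {
      # Find the Host ID by shape rather than by column position.
      id = ""
      for (i = 3; i <= NF; i++) {
        if ($i ~ /^[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+$/) id = $i
      }
      if (substr($1, 1, 1) == "U") {
        # Node is up: decommission runs against the node itself, no arguments.
        print "nodetool -h " addr " decommission"
      } else {
        # Node is down: remove it by Host ID from any live node.
        print "nodetool removenode " id
      }
    }
  '
}

# Illustrative sample: the second node is shown as DN on purpose.
status='UN  10.10.1.94  197.37 KB  256  49.4%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
DN  10.10.1.18  216.95 KB  256  50.6%  f2a48fc7-a362-43f5-9061-4bb3739fdeaf  rack'

printf '%s\n' "$status" | suggest_removal 10.10.1.94
printf '%s\n' "$status" | suggest_removal 10.10.1.18
```

The helper prints the command instead of running it so that the choice can be reviewed first.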



On Sun, May 25, 2014 at 10:19 AM, Colin Clark <co...@clark.ws> wrote:

> Try this:
>
> nodetool decommission <host-id-of-node-to-decommission>
>
> UN means UP, NORMAL
>
> --
> Colin
> +1 320 221 9531

Re: decommissioning a node

Posted by Colin Clark <co...@clark.ws>.
Try this:

nodetool decommission <host-id-of-node-to-decommission>

UN means Up, Normal.

--
Colin
+1 320 221 9531



On Sun, May 25, 2014 at 9:09 AM, Tim Dunphy <bl...@gmail.com> wrote:

> Also for information that may help diagnose this issue I am running
> cassandra 2.0.7
>
> I am also using these java options:
>
> [root@beta:/etc/alternatives/cassandrahome] #grep -i jvm_opts
> conf/cassandra-env.sh  | grep -v '#'
> JVM_OPTS="$JVM_OPTS -ea"
>     JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
> JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled"
> JVM_OPTS="$JVM_OPTS -XX:+UseThreadPriorities"
> JVM_OPTS="$JVM_OPTS -XX:ThreadPriorityPolicy=42"
> JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
> JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
> JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
> JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
>     JVM_OPTS="$JVM_OPTS
> -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
> JVM_OPTS="$JVM_OPTS -Xss256k"
> JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
>     JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
> JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
> JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
> JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
> JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
> JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS"
>
>
> Still need to figure out why the node I want to decommission isn't
> listening on port 7199 and how I can actually decommission it.
>
> Thanks
> Tim

Re: decommissioning a node

Posted by Tim Dunphy <bl...@gmail.com>.
Also, in case it helps diagnose this issue, I am running Cassandra 2.0.7.

I am also using these Java options:

[root@beta:/etc/alternatives/cassandrahome] #grep -i jvm_opts conf/cassandra-env.sh  | grep -v '#'
JVM_OPTS="$JVM_OPTS -ea"
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled"
JVM_OPTS="$JVM_OPTS -XX:+UseThreadPriorities"
JVM_OPTS="$JVM_OPTS -XX:ThreadPriorityPolicy=42"
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
    JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
JVM_OPTS="$JVM_OPTS -Xss256k"
JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
    JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS"


Still need to figure out why the node I want to decommission isn't
listening on port 7199 and how I can actually decommission it.
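
The grep above only matches lines containing jvm_opts, so the line that assigns `$JMX_PORT` itself is not shown. A rough way to read that value is sketched below; `jmx_port_from_env` is a hypothetical helper, and it assumes the file assigns the port as `JMX_PORT="NNNN"` on a line of its own. The demo file and its 7198 value are made up to mirror the listening port seen in the netstat output, not taken from the real configuration.

```shell
#!/usr/bin/env bash
# jmx_port_from_env FILE: print the value assigned to JMX_PORT in a
# cassandra-env.sh style file (last uncommented assignment wins).
# (Hypothetical helper, assuming a plain JMX_PORT="NNNN" assignment.)
jmx_port_from_env() {
  sed -n 's/^[[:space:]]*JMX_PORT="\{0,1\}\([0-9][0-9]*\)"\{0,1\}.*/\1/p' "$1" | tail -n 1
}

# Demo against a throwaway file; the 7198 value is an assumption chosen to
# match the listening port seen in the netstat output, not a known setting.
demo="$(mktemp)"
printf '%s\n' '# JMX_PORT="7199"' 'JMX_PORT="7198"' > "$demo"
port="$(jmx_port_from_env "$demo")"
rm -f "$demo"

# nodetool accepts -p to connect to a non-default JMX port.
echo "nodetool -h 10.10.1.18 -p $port status"
```

On the node itself the helper would be pointed at conf/cassandra-env.sh, and the printed port compared with what lsof or netstat reports.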

Thanks
Tim


On Sun, May 25, 2014 at 9:20 AM, Tim Dunphy <bl...@gmail.com> wrote:

>
> Hey all,
>
> I'm attempting to decommission a node I want to remove.
>
> First I get a status of the ring
>
> [root@beta-new:~] #nodetool status
>
> Datacenter: datacenter1
>
> =======================
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens  Owns   Host ID
>           Rack
>
> UN  10.10.1.94  197.37 KB  256     49.4%
> fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
>
> UN  10.10.1.18    216.95 KB  256     50.6%
> f2a48fc7-a362-43f5-9061-4bb3739fdeaf  rack
>
>
> I see that the node I want to remove is UP. Tho I believe UN means up I
> don't know what it stands for.
>
>
> [root@beta-new:~] #nodetool -host  10.10.1.18 decommission
>
> Failed to connect to ' 10.10.1.18 : Connection timed out
>
> The connection to the node I want to decommission times out. :(
>
> I’m running this node from the seed node, and while I do see port 7199
> active and listening there, I do NOT see this port active and listening on
> the node that I want to decommission.
>
>
> Seed node:
>
> [root@beta-new:~] #lsof -i :7199
>
> COMMAND   PID USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
>
> java    15331 root   51u  IPv4 566368606      0t0  TCP *:7199 (LISTEN)
>
>
> [root@beta:/etc/alternatives/cassandrahome] #lsof -i :7199
>
> [root@beta:/etc/alternatives/cassandrahome] #
>
>
> However cassandra does seem to be running on the node I want to
> decommission in addition to it being shown as UN by nodetool status:
>
>
> [root@beta:/etc/alternatives/cassandrahome] #netstat -tulpn | grep -i
> listen | grep java
>
> tcp        0      0 0.0.0.0:46755               0.0.0.0:*
>   LISTEN      23039/java
>
> tcp        0      0 10.10.1.18:9160           0.0.0.0:*
> LISTEN      23039/java
>
> tcp        0      0 0.0.0.0:42990               0.0.0.0:*
>   LISTEN      23039/java
>
> tcp        0      0 10.10.1.18:8081           0.0.0.0:*
> LISTEN      23039/java
>
> tcp        0      0 10.10.1.18:9042           0.0.0.0:*
> LISTEN      23039/java
>
> tcp        0      0 10.10.1.18:7000           0.0.0.0:*
> LISTEN      23039/java
>
> tcp        0      0 0.0.0.0:7198                0.0.0.0:*
>   LISTEN      23039/java
>
>
> So why do you think my seed is listening on port 7199 but the node I want
> to get rid of is not? And how can I accomplish my goal of deleting the
> unwanted node?
>
>
> Thanks
>
> Tim
>
>
>
> --
> GPG me!!
>
> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>
>

