Posted to user@cassandra.apache.org by 土卜皿 <pe...@gmail.com> on 2016/01/26 05:19:35 UTC

About cassandra's reblance when adding one or more nodes into the existed cluster?

Hi, all
   After adding one node to the existing cluster, I wanted to use the
"nodetool move" command:

../cassandra-2.1.11/bin/nodetool -h 192.168.56.110 move -2696407920004217295

I was hoping to move token -2696407920004217295 (currently on 192.168.56.110)
to 192.168.56.112, but I got the following error:

[root@test-1 pengcz]# ../cassandra-2.1.11/bin/nodetool -h 192.168.56.110 move -2696407920004217295

error: target token -2696407920004217295 is already owned by another node.
-- StackTrace --
java.io.IOException: target token -2696407920004217295 is already owned by another node.
        at org.apache.cassandra.service.StorageService.move(StorageService.java:3479)

What should I do about this? Thanks in advance!


Dillon

Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
>
> I am so sorry for this issue! I should not use the command "nodetool move"
> because I set "num_tokens: 256" in every node's cassandra.yaml.
>
I was guessing this was the issue.


> So I restarted it and the join continued! I don't know why there is the
> difference between the two nodes?
>
My guess is that the join did not actually continue. Once you bootstrap a
node, the system keyspace is filled with some information. If the bootstrap
fails, you need to wipe the data directory. I advise you to simply run
"rm -rf /path_to_cassandra/data/*".

If you don't remove the system keyspace, the node will behave as if it is
already part of the ring, so it won't stream anything: it won't bootstrap,
it will just start. That would explain the difference, imho.

If you wipe only the system keyspace (not your data), it will work, but you
will end up streaming the same data again and will need to compact it,
which adds useless work.

So I would go for a clean state and start the process again.

> WARN  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Stream failed
> ERROR 00:57:42 Exception encountered during startup
> java.lang.RuntimeException: Error during boostrap: Stream failed
>
Take care not to restart any other node during the bootstrap. If you did
not, you might want to try bootstrapping the node in debug or trace mode
to see which stream is failing and why. You might also want to change the
value of streaming_socket_timeout_in_ms (see
https://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__streaming_socket_timeout_in_ms
).
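
As a starting point for finding which stream is failing, here is a minimal Python sketch that groups stream log lines by stream ID and lists the peers whose sessions completed. It assumes log lines shaped exactly like the excerpts quoted in this thread; a real system.log may carry extra fields (thread name, date, class), so the pattern would need adjusting. The function name and the sample constant are illustrative only.

```python
import re
from collections import defaultdict

# Pattern for lines shaped like the excerpts in this thread (an assumption):
#   INFO  00:57:42 [Stream #<id>] Session with /<peer> is complete
#   WARN  00:57:42 [Stream #<id>] Stream failed
STREAM_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\S+\s+\[Stream #(?P<sid>[0-9a-f-]+)\]\s+(?P<msg>.*)$"
)
SESSION_DONE = re.compile(r"^Session with /(?P<peer>\S+) is complete$")


def summarize_streams(lines):
    """Return {stream_id: {"completed": [peers], "failed": bool}}."""
    summary = defaultdict(lambda: {"completed": [], "failed": False})
    for raw in lines:
        match = STREAM_LINE.match(raw.strip())
        if match is None:
            continue  # not a stream-related line
        entry = summary[match.group("sid")]
        message = match.group("msg")
        done = SESSION_DONE.match(message)
        if done:
            entry["completed"].append(done.group("peer"))
        elif message == "Stream failed":
            entry["failed"] = True
    return dict(summary)


# Sample taken from the log excerpt posted in this thread.
SAMPLE_LOG = """\
INFO  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Session with /192.21.0.135 is complete
INFO  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Session with /192.21.0.131 is complete
WARN  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Stream failed
ERROR 00:57:42 Exception encountered during startup
"""

if __name__ == "__main__":
    for stream_id, info in summarize_streams(SAMPLE_LOG.splitlines()).items():
        state = "failed" if info["failed"] else "ok"
        print(stream_id, state, info["completed"])
```

For a failed stream, the peers that never show up as complete are probably the first ones to check in debug or trace mode.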

C*heers,

-----------------
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com


Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

Posted by 土卜皿 <pe...@gmail.com>.
Hi Alain and Romain,

I am so sorry for this issue! I should not have used the "nodetool move"
command, because I set "num_tokens: 256" in every node's cassandra.yaml.

However, I have new questions after adding two nodes to the cluster:

node1: 192.21.0.184
node2: 192.21.0.185

I started the two nodes one by one. The first node, 192.21.0.184, finished
joining immediately, but the second one, 192.21.0.185, has been joining for
several hours and has still not finished. Under 192.168.0.184:

[root@report-01 cassandra]# bin/nodetool compactionstats
pending tasks: 0

Under 192.168.0.185:

[root@report-02 cassandra]# bin/nodetool compactionstats
pending tasks: 11
compaction type      keyspace       table   completed          total    unit   progress
    Compaction   testforuser   users1028     9439074      545972293   bytes      1.73%
    Compaction   user_center       users     7566752   263673724274   bytes      0.00%
Active compaction remaining time :   4h22m27s

And:

[root@report-01 cassandra]# bin/nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns    Host ID                                   Rack
UN  192.21.0.135  120.83 GB  512     ?       11e1e80f-9c5f-4f7c-81f2-42d3b704d8e3  RAC1
UN  192.21.0.133  129.11 GB  512     ?       3e662ccb-fa2b-427b-9ca1-c2d3468bfbc9  RAC1
UN  192.21.0.131  152.25 GB  512     ?       60f763f3-09bc-4d6f-9301-494c93857fc1  RAC1
UJ  192.21.0.185  117.94 GB  256     ?       84c0dd16-6491-4bfb-b288-d4e410cd8c2a  RAC1
UN  192.21.0.184  649.03 MB  256     ?       4041c232-c110-4315-89a1-23ca53b851c2  RAC1
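
To watch for nodes that are still joining, the status output can be checked with a short script. This is a minimal Python sketch, assuming each node is printed on a single line that starts with the two-letter status/state code (for example UN or UJ), as "nodetool status" prints it. The function names and the sample constant are illustrative, not part of any Cassandra tooling.

```python
def parse_status(text):
    """Extract node rows from `nodetool status` text as a list of dicts."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        # Node rows begin with a two-letter code: Up/Down, then
        # Normal/Leaving/Joining/Moving. Everything else is header text.
        if len(parts) < 2 or len(parts[0]) != 2:
            continue
        status, state = parts[0][0], parts[0][1]
        if status not in "UD" or state not in "NLJM":
            continue
        nodes.append({"address": parts[1], "status": status, "state": state})
    return nodes


def joining_nodes(text):
    """Return the addresses of nodes whose state is J (joining)."""
    return [node["address"] for node in parse_status(text) if node["state"] == "J"]


# Sample taken from the status output posted in this thread.
SAMPLE_STATUS = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns    Host ID                                   Rack
UN  192.21.0.135  120.83 GB  512     ?       11e1e80f-9c5f-4f7c-81f2-42d3b704d8e3  RAC1
UN  192.21.0.133  129.11 GB  512     ?       3e662ccb-fa2b-427b-9ca1-c2d3468bfbc9  RAC1
UN  192.21.0.131  152.25 GB  512     ?       60f763f3-09bc-4d6f-9301-494c93857fc1  RAC1
UJ  192.21.0.185  117.94 GB  256     ?       84c0dd16-6491-4bfb-b288-d4e410cd8c2a  RAC1
UN  192.21.0.184  649.03 MB  256     ?       4041c232-c110-4315-89a1-23ca53b851c2  RAC1
"""

if __name__ == "__main__":
    print(joining_nodes(SAMPLE_STATUS))
```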

Node2's bootstrap was interrupted several times by this error:

INFO  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Session with /192.21.0.135 is complete
INFO  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Session with /192.21.0.131 is complete
WARN  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Stream failed
ERROR 00:57:42 Exception encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed

So I restarted it and the join continued! I don't understand why the two
nodes behave so differently. Thank you in advance!

Dillon


Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

Posted by Romain Hardouin <ro...@yahoo.fr>.

Hi Dillon,

CMIIW I suspect that you use vnodes and you want to "move one of the 256
tokens to another node". If yes, that's not possible.
"nodetool move" is not allowed with vnodes:
https://github.com/apache/cassandra/blob/cassandra-2.1.11/src/java/org/apache/cassandra/service/StorageService.java#L3488

*But* if you try "nodetool move" with a token that is already owned by a
node, that check is done *before* the vnodes check:
https://github.com/apache/cassandra/blob/cassandra-2.1.11/src/java/org/apache/cassandra/service/StorageService.java#L3479

If you use a single token, it seems you are trying to replace one node with
another...
Maybe you could explain the problem that leads you to do a nodetool move?
(along with the nodetool ring output, as Alain suggested)

Best,
Romain
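
The order of the two checks described in this message can be illustrated with a simplified Python model. This is not Cassandra's code: the function, the exception class and the ring structure are hypothetical, and the second error message is a paraphrase. Only the first message is taken from the error quoted in this thread.

```python
class MoveError(Exception):
    """Raised when the modelled move request is rejected."""


def check_move(target_token, ring, local_node):
    """Model the two checks in the order described in this thread.

    `ring` maps a node address to the set of tokens it owns
    (a hypothetical structure used only for this illustration).
    """
    # Check 1: the target token must not already be owned by any node.
    owned_tokens = set().union(*ring.values())
    if target_token in owned_tokens:
        raise MoveError(
            f"target token {target_token} is already owned by another node."
        )
    # Check 2: a node holding more than one token (vnodes) cannot be moved.
    if len(ring[local_node]) > 1:
        raise MoveError("this node has more than one token and cannot be moved")
    return True


if __name__ == "__main__":
    # Two vnode-style nodes, each owning several tokens (made-up values).
    ring = {"192.168.56.110": {-30, -20, -10}, "192.168.56.112": {10, 20}}
    for token in (-20, 99):
        try:
            check_move(token, ring, "192.168.56.110")
        except MoveError as error:
            print(token, "->", error)
```

With vnodes enabled, asking for a token that exists anywhere in the ring trips the first check, which would explain why the reported error mentions token ownership rather than vnodes.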

Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Dillon,

I will assume you're using Murmur3 and that you *don't* use vnodes (i.e. the
"num_tokens" option is commented out in cassandra.yaml), since with vnodes
enabled this operation is useless (and maybe harmful, not sure about that).
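
A quick way to check which case applies is to look for an active "num_tokens" line in cassandra.yaml. The sketch below is a minimal Python example using only the standard library. It assumes that a commented-out or missing "num_tokens" means a single token per node, and that any value above 1 means vnodes. The function name is illustrative.

```python
import re

# An active top-level setting such as "num_tokens: 256". Commented lines
# start with "#" and therefore do not match.
NUM_TOKENS = re.compile(r"^num_tokens:\s*(\d+)\s*(?:#.*)?$")


def vnodes_enabled(yaml_text):
    """Return True if the config text sets num_tokens to more than 1."""
    for line in yaml_text.splitlines():
        match = NUM_TOKENS.match(line)
        if match:
            return int(match.group(1)) > 1
    # Assumption: no active num_tokens line means a single token per node.
    return False


if __name__ == "__main__":
    print(vnodes_enabled("num_tokens: 256\n"))
    print(vnodes_enabled("# num_tokens: 256\ninitial_token: 0\n"))
```

If this reports that vnodes are enabled, "nodetool move" is not the right tool, as discussed elsewhere in this thread.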

Can you give us the output of:

$ nodetool ring

It looks like you're trying to take a token that is already in use, as the
error describes.

> After I added one node into the existed cluster, I want to use "nodetool
> move" command:


Why not start the node with the right token, by setting "initial_token" to
this token? A move is quite a heavy operation, plus you will have to run
"nodetool cleanup" on all the nodes whose range has been reduced (the
impacted node + replicas), which is also long and potentially heavy too.

-----------------
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com
