Posted to user@cassandra.apache.org by Pranay Agarwal <ag...@gmail.com> on 2015/03/20 01:32:53 UTC

Adding new node to Cassandra cluster is too slow

Hi,

I have a 14-node Cassandra cluster; each node has around 50 GB of data. I
added 3 new nodes to the cluster and I can see their status as *UJ*. They
have been in that state for almost a day now, and their data size does not
seem to be changing. There is almost no CPU or disk usage on them either.

Adding new nodes can't be this slow, or there would be no benefit in scaling
up or down in real time as the request load changes.


-Pranay

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Also, how long does it typically take for a node to join? I have 1 TB of
data in total in a 15-node Cassandra cluster.

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Thanks Rahul, you are right. Until a node completely joins the ring, there
is no data dependency on it.


On Fri, Mar 20, 2015 at 4:46 AM, Rahul Neelakantan <ra...@rahul.be> wrote:

> You won't lose data unless you have run nodetool cleanup on the existing
> nodes.
>
> Rahul

Re: Adding new node to Cassandra cluster is too slow

Posted by Rahul Neelakantan <ra...@rahul.be>.
You won't lose data unless you have run nodetool cleanup on the existing nodes.

Rahul

> On Mar 19, 2015, at 9:16 PM, Pranay Agarwal <ag...@gmail.com> wrote:
> 
> Also, the new nodes (3 of them, in UJ state) are showing some data (~10 GB). Is there any chance of data loss if I stop Cassandra on them?

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Also, the new nodes (3 of them, in *UJ state*) are showing some data
(~10 GB). Is there any chance of data loss if I stop Cassandra on them?

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Thanks Rob.

Anyway, ideally, for a new node to join with its ~50 GB share of data, it
should be done in a couple of minutes, or an hour tops, right?

On Fri, Mar 20, 2015 at 6:07 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Mar 20, 2015 at 4:08 PM, Pranay Agarwal <ag...@gmail.com>
> wrote:
>
>> Also, given that the new nodes receive some data (~10 GB) and then get
>> stuck, the configuration and process I am using must be correct, and
>> there is no immediate fix besides upgrading the version?
>>
>
> That is correct; it sounds like the config is correct and then streaming
> fails.
>
> =Rob
>
>
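
As a rough sanity check on that expectation, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: the 50 GB figure comes from the thread, while the throughput values are assumed for illustration and the helper name estimate_stream_minutes is made up here. Real bootstraps stream from several nodes at once and are also limited by disk, compaction, and any streaming throttle configured in cassandra.yaml, so actual times can differ considerably.

```python
def estimate_stream_minutes(data_gb: float, throughput_mbit_per_s: float) -> float:
    """Return the minutes needed to move data_gb gigabytes at the given rate.

    Uses decimal units: 1 GB = 8000 megabits.
    """
    if throughput_mbit_per_s <= 0:
        raise ValueError("throughput must be positive")
    megabits = data_gb * 8000
    return megabits / throughput_mbit_per_s / 60


if __name__ == "__main__":
    # 50 GB is the per-node share mentioned in the thread; the rates below
    # are assumed values for illustration, not measurements.
    for rate in (100, 200, 1000):
        print(f"{rate} Mbit/s: {estimate_stream_minutes(50, rate):.1f} min")
```

At an assumed 200 Mbit/s, 50 GB works out to roughly half an hour, which is consistent with the "an hour tops" intuition. A node that sits in UJ for a day with almost no CPU or disk activity is more likely stuck than slow.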

Re: Adding new node to Cassandra cluster is too slow

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Mar 20, 2015 at 4:08 PM, Pranay Agarwal <ag...@gmail.com>
wrote:

> Also, given that the new nodes receive some data (~10 GB) and then get
> stuck, the configuration and process I am using must be correct, and
> there is no immediate fix besides upgrading the version?
>

That is correct; it sounds like the config is correct and then streaming
fails.

=Rob

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Thanks, will do that.

Also, given that the new nodes receive some data (~10 GB) and then get
stuck, the configuration and process I am using must be correct, and there
is no immediate fix besides upgrading the version?

On Fri, Mar 20, 2015 at 4:02 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Mar 20, 2015 at 3:57 PM, Pranay Agarwal <ag...@gmail.com>
> wrote:
>
>> I guess I now have to decide whether it's better to upgrade to 2.1.6+ or
>> downgrade to a stable release, and find a safe way to do that.
>>
>
> You can't downgrade across major versions; you'd have to read everything
> out of the "new" cluster and write it to a different "old" cluster.
>
> If I were you, I would likely just prioritize upgrading to 2.1.3, and then
> immediately 2.1.4 when it comes out, etc.
>
> =Rob
>
>

Re: Adding new node to Cassandra cluster is too slow

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Mar 20, 2015 at 3:57 PM, Pranay Agarwal <ag...@gmail.com>
wrote:

> I guess I now have to decide whether it's better to upgrade to 2.1.6+ or
> downgrade to a stable release, and find a safe way to do that.
>

You can't downgrade across major versions; you'd have to read everything
out of the "new" cluster and write it to a different "old" cluster.

If I were you, I would likely just prioritize upgrading to 2.1.3, and then
immediately 2.1.4 when it comes out, etc.

=Rob

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Thanks a lot Rob.

I guess I now have to decide whether it's better to upgrade to 2.1.6+ or
downgrade to a stable release, and find a safe way to do that.


On Fri, Mar 20, 2015 at 3:35 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Mar 19, 2015 at 6:02 PM, Pranay Agarwal <ag...@gmail.com>
> wrote:
>
>> What do you mean by point 3? Also, does doing them one at a time mean
>> waiting until nodetool status shows the new node as UN rather than UJ?
>>
>
> Point 3 means: look at nodetool status/nodetool ring/nodetool info/etc. on
> other cluster nodes and make sure the node you just stopped isn't in their
> list of, for example, UJ hosts.
>
> One at a time means, if you can afford it, I would wait for the node to be
> UN. If not, wait a few minutes between each join.
>
> Also, 2.1.0 is super broken, read this and consider using 1.2.x.
>
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>
> =Rob
>
>

Re: Adding new node to Cassandra cluster is too slow

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 19, 2015 at 6:02 PM, Pranay Agarwal <ag...@gmail.com>
wrote:

> What do you mean by point 3? Also, does doing them one at a time mean
> waiting until nodetool status shows the new node as UN rather than UJ?
>

Point 3 means: look at nodetool status/nodetool ring/nodetool info/etc. on
other cluster nodes and make sure the node you just stopped isn't in their
list of, for example, UJ hosts.

One at a time means, if you can afford it, I would wait for the node to be
UN. If not, wait a few minutes between each join.

Also, 2.1.0 is super broken, read this and consider using 1.2.x.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob
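
The check described as point 3 can be sketched roughly as follows. This is a minimal illustration, not an official tool: it assumes the usual text layout of `nodetool status`, where each node line starts with a two-letter code (Up/Down followed by Normal/Leaving/Joining/Moving) and then the node address. The function names and the sample output, including its addresses and host IDs, are made up for the example.

```python
import re
from typing import Dict, List

# Two-letter code at the start of each node line in `nodetool status`:
# U/D for Up/Down, then N/L/J/M for Normal/Leaving/Joining/Moving.
_STATE_RE = re.compile(r"^[UD][NLJM]$")


def parse_node_states(status_text: str) -> Dict[str, str]:
    """Map each node address to its two-letter state code."""
    states = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and _STATE_RE.match(fields[0]):
            states[fields[1]] = fields[0]
    return states


def still_visible(status_text: str, stopped_nodes: List[str]) -> List[str]:
    """Return the stopped nodes that this view of the ring still lists."""
    states = parse_node_states(status_text)
    return [addr for addr in stopped_nodes if addr in states]


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID  Rack
UN  10.0.0.1   50.1 GB   256     ?     aaaa     rack1
UJ  10.0.0.15  10.2 GB   256     ?     bbbb     rack1
"""

if __name__ == "__main__":
    print(still_visible(SAMPLE, ["10.0.0.15", "10.0.0.16"]))
```

In practice the text would come from running `nodetool status` on each existing node (for example via `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`), and an empty result from every node suggests the stopped nodes are gone from gossip.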

Re: Adding new node to Cassandra cluster is too slow

Posted by Pranay Agarwal <ag...@gmail.com>.
Thanks Rob, you are right. I am using ReleaseVersion: 2.1.0.

What do you mean by point 3? Also, does doing them one at a time mean
waiting until nodetool status shows the new node as UN rather than UJ?

On Thu, Mar 19, 2015 at 5:44 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Mar 19, 2015 at 5:32 PM, Pranay Agarwal <ag...@gmail.com>
> wrote:
>
>> I have a 14-node Cassandra cluster; each node has around 50 GB of data. I
>> added 3 new nodes to the cluster and I can see their status as *UJ*. They
>> have been in that state for almost a day now, and their data size does
>> not seem to be changing. There is almost no CPU or disk usage on them
>> either.
>>
>
> It is not supported to add multiple nodes to a cluster simultaneously
> until 2.1.1 [1]. Usually what happens is one or more of the bootstraps
> fails and hangs forever. This seems to be what has happened to you.
>
> To resolve:
>
> 1) stop each of the bootstrapping nodes
> 2) wipe their data directories completely
> 3) verify that they do not show up in gossip on the other nodes
> 4) bootstrap them again, one at a time
>
> =Rob
> [1] https://issues.apache.org/jira/browse/CASSANDRA-7069
>
>
>

Re: Adding new node to Cassandra cluster is too slow

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 19, 2015 at 5:32 PM, Pranay Agarwal <ag...@gmail.com>
wrote:

> I have a 14-node Cassandra cluster; each node has around 50 GB of data. I
> added 3 new nodes to the cluster and I can see their status as *UJ*. They
> have been in that state for almost a day now, and their data size does not
> seem to be changing. There is almost no CPU or disk usage on them either.
>

It is not supported to add multiple nodes to a cluster simultaneously until
2.1.1 [1]. Usually what happens is one or more of the bootstraps fails and
hangs forever. This seems to be what has happened to you.

To resolve:

1) stop each of the bootstrapping nodes
2) wipe their data directories completely
3) verify that they do not show up in gossip on the other nodes
4) bootstrap them again, one at a time

=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-7069
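
Step 4 ("one at a time") can be sketched as a simple loop that starts a node and waits until it reports UN before starting the next one. This is only an outline under stated assumptions: start_node and get_state are hypothetical callables that you would supply yourself (for example, one that starts the Cassandra service on a host and one that reads the node's two-letter state from `nodetool status`), and the polling interval and limit are arbitrary defaults.

```python
import time
from typing import Callable, Iterable, List


def bootstrap_one_at_a_time(
    nodes: Iterable[str],
    start_node: Callable[[str], None],
    get_state: Callable[[str], str],
    poll_seconds: float = 60.0,
    max_polls: int = 1440,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start each node only after the previous one reports UN.

    Returns the nodes that joined, in order. Raises TimeoutError if a node
    has not reached UN after max_polls checks.
    """
    joined: List[str] = []
    for node in nodes:
        start_node(node)  # hypothetical hook: start Cassandra on this node
        for _ in range(max_polls):
            if get_state(node) == "UN":  # hypothetical hook: read its state
                joined.append(node)
                break
            sleep(poll_seconds)
        else:
            # The inner loop finished without a break: the node never joined.
            raise TimeoutError(
                f"{node} did not reach UN; joined so far: {joined}"
            )
    return joined
```

Raising on a timeout rather than moving on to the next node is a deliberate choice here: a hung bootstrap should be stopped, wiped, and verified gone from gossip (steps 1 to 3) before any further node is added.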