Posted to user@cassandra.apache.org by Vlad <qa...@yahoo.com> on 2017/04/08 14:55:58 UTC

Multiple nodes decommission

Hi,
How should multiple nodes be decommissioned with "nodetool decommission": one by one or in parallel?

Thanks.

Re: Multiple nodes decommission

Posted by Vlad <qa...@yahoo.com>.
> There's a system property (actually 2)

Which ones?

On Wednesday, April 19, 2017 9:17 AM, Jeff Jirsa <jj...@apache.org> wrote:

On 2017-04-12 11:30 (-0700), Vlad <qa...@yahoo.com> wrote: 
> Interesting, there is no such explicit warning for v3: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html
> It says:
>    - Start the bootstrap node.
>    - Verify that the node is fully bootstrapped and all other nodes are up (UN).
> 
> Does that mean we should start them one by one? Maybe one of the developers can clarify this?

You should treat range movements (bootstrap/decom/etc.) in 3.0 the same way you treated them in 2.0/2.1/2.2. As far as I know, there's nothing special that makes them any safer than they were in 2.x.

The warnings and restrictions are there because simultaneous range movements PROBABLY violate your assumed consistency guarantees if you're using vnodes. If you're using a single token per node, this can be avoided.

If you really know what you're doing, you can tell Cassandra to let you do simultaneous range movements anyway. There's a system property (actually 2) that will let you tell Cassandra you know the tradeoffs, and then you can bootstrap/decom/etc. more than one node at a time. Generally, it's one of those things where if you have to ask about it, you should probably just stick to the default one-at-a-time guidelines (which isn't meant to sound condescending, but it's an area where you can definitely violate consistency and maybe even lose data if you're not sure).

- Jeff


   

Re: Multiple nodes decommission

Posted by Jeff Jirsa <jj...@apache.org>.

On 2017-04-12 11:30 (-0700), Vlad <qa...@yahoo.com> wrote: 
> Interesting, there is no such explicit warning for v3: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html
> It says:
>    - Start the bootstrap node.
>    - Verify that the node is fully bootstrapped and all other nodes are up (UN).
> 
> Does that mean we should start them one by one? Maybe one of the developers can clarify this?

You should treat range movements (bootstrap/decom/etc.) in 3.0 the same way you treated them in 2.0/2.1/2.2. As far as I know, there's nothing special that makes them any safer than they were in 2.x.

The warnings and restrictions are there because simultaneous range movements PROBABLY violate your assumed consistency guarantees if you're using vnodes. If you're using a single token per node, this can be avoided.

If you really know what you're doing, you can tell Cassandra to let you do simultaneous range movements anyway. There's a system property (actually 2) that will let you tell Cassandra you know the tradeoffs, and then you can bootstrap/decom/etc. more than one node at a time. Generally, it's one of those things where if you have to ask about it, you should probably just stick to the default one-at-a-time guidelines (which isn't meant to sound condescending, but it's an area where you can definitely violate consistency and maybe even lose data if you're not sure).

- Jeff
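
A minimal sketch of the default one-at-a-time approach described above, in Python. It assumes nodetool is on the PATH, that each node accepts remote JMX connections via "-h", and that "nodetool decommission" returns only once the node has finished streaming. The function names and the polling interval are made up for illustration; the status flags (UN, UL, UJ, UM) are the ones printed by "nodetool status".

```python
import subprocess
import time
from typing import Callable, List, Sequence


def run_nodetool(args: Sequence[str]) -> str:
    """Run nodetool with the given arguments and return its stdout."""
    result = subprocess.run(
        ["nodetool", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def nodes_in_motion(status_output: str) -> List[str]:
    """Return addresses that `nodetool status` flags as Leaving (L),
    Joining (J) or Moving (M), for example lines starting with UL or UJ."""
    busy = []
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        flag = fields[0]
        if len(flag) == 2 and flag[0] in "UD" and flag[1] in "LJM":
            busy.append(fields[1])
    return busy


def decommission_one_by_one(
    hosts: Sequence[str],
    run: Callable[[Sequence[str]], str] = run_nodetool,
    sleep: Callable[[float], None] = time.sleep,
    poll_seconds: float = 30.0,
) -> None:
    """Decommission hosts sequentially, never starting a new range
    movement while another one is still visible in the ring."""
    for host in hosts:
        # Wait until no node is leaving, joining or moving.
        while nodes_in_motion(run(["-h", host, "status"])):
            sleep(poll_seconds)
        # Assumption: this call blocks until the decommission completes.
        run(["-h", host, "decommission"])
```

Calling decommission_one_by_one(["10.0.0.1", "10.0.0.2"]) with placeholder addresses would remove the two nodes in order. The runner and sleep function are parameters only so the loop can be exercised without a cluster.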

Re: Multiple nodes decommission

Posted by Vlad <qa...@yahoo.com>.
Interesting, there is no such explicit warning for v3: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html
It says:
   - Start the bootstrap node.
   - Verify that the node is fully bootstrapped and all other nodes are up (UN).

Does that mean we should start them one by one? Maybe one of the developers can clarify this?

    On Wednesday, April 12, 2017 9:16 PM, Jacob Shadix <ja...@gmail.com> wrote:
 

It's still not recommended to start them at the same time. The following documentation (for version 2.1) suggests staggering the starts by 2 minutes, along with additional steps:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_node_to_cluster_t.html

-- Jacob Shadix 


Re: Multiple nodes decommission

Posted by Jacob Shadix <ja...@gmail.com>.
It's still not recommended to start them at the same time. The following
documentation (for version 2.1) suggests staggering the starts by 2 minutes,
along with additional steps:

https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_node_to_cluster_t.html

-- Jacob Shadix

On Wed, Apr 12, 2017 at 1:48 PM, Vlad <qa...@yahoo.com> wrote:

> But it seems OK to add multiple nodes at once, right?

Re: Multiple nodes decommission

Posted by Vlad <qa...@yahoo.com>.
But it seems OK to add multiple nodes at once, right?
 

    On Tuesday, April 11, 2017 8:38 PM, Jacob Shadix <ja...@gmail.com> wrote:
 

 Right! Another reason why I just stick with sequential decommissions. Maybe someone here could shed some light on what happens under the covers if parallel decommissions are kicked off.
-- Jacob Shadix 


Re: Multiple nodes decommission

Posted by Jacob Shadix <ja...@gmail.com>.
Right! Another reason why I just stick with sequential decommissions. Maybe
someone here could shed some light on what happens under the covers if
parallel decommissions are kicked off.

-- Jacob Shadix

On Tue, Apr 11, 2017 at 12:55 PM, benjamin roth <br...@gmail.com> wrote:

> I did not test it, but I'd bet that parallel decommissions will lead to
> inconsistencies.
> Each decommission results in range movements and range reassignments, which
> become effective after a successful decommission.
> If you start several decommissions at once, I guess the calculated
> reassignments are invalid for at least one node after the first node
> has finished the decommission process.
>
> I hope someone will correct me if I am wrong.

Re: Multiple nodes decommission

Posted by Jens Rantil <je...@tink.se>.
AFAIK, the fastest way to add multiple nodes is to make sure your clients
are only reading from and writing to your current datacenter, create a new
datacenter with replication factor 0, add nodes to the new datacenter, increase
the replication factor of the new datacenter, run `nodetool rebuild` on all
nodes in the new datacenter, point your clients to the new DC and finally
decommission the old one. I've done that multiple times and it's been much
faster than adding a few nodes. Obviously, this depends on how much data
you have...

/J
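
The sequence above can be sketched as an ordered plan, here generated by a small Python helper. This is only an outline under assumptions: the keyspace uses NetworkTopologyStrategy, the data is streamed into the new datacenter with `nodetool rebuild -- <source DC>`, and the keyspace, datacenter and host names are placeholders. Lines starting with "#" are manual steps rather than commands.

```python
from typing import Dict, List, Sequence


def replication_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build an ALTER KEYSPACE statement for NetworkTopologyStrategy.
    Datacenters with a replica count of 0 are left out of the map."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items() if rf > 0]
    options = ", ".join(parts)
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, options)


def migration_plan(
    keyspace: str, old_dc: str, new_dc: str, rf: int, new_hosts: Sequence[str]
) -> List[str]:
    """Return the steps for moving a keyspace from old_dc to new_dc."""
    plan = [
        f"# keep clients on {old_dc} only; {new_dc} starts with no replicas",
        f"# start the new nodes in {new_dc}",
        # Start replicating the keyspace to the new datacenter.
        replication_cql(keyspace, {old_dc: rf, new_dc: rf}),
    ]
    # Stream existing data to every new node from the old datacenter.
    plan += [f"nodetool -h {host} rebuild -- {old_dc}" for host in new_hosts]
    plan += [
        f"# point clients to {new_dc}",
        # Stop replicating to the old datacenter before removing its nodes.
        replication_cql(keyspace, {old_dc: 0, new_dc: rf}),
        f"# decommission the nodes in {old_dc} one at a time",
    ]
    return plan
```

Printing "\n".join(migration_plan("ks", "dc_old", "dc_new", 3, ["10.0.1.1", "10.0.1.2"])) gives a checklist that can be reviewed before anything is run against a cluster.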

On Sat, Apr 15, 2017 at 10:19 AM, Vlad <qa...@yahoo.com> wrote:

> > range reassignments, which become effective after a successful
> > decommission.
>
> But while leaving, nodes announce themselves as "leaving". Do other
> leaving nodes take this into account and avoid streaming data to them?
> (This applies to joining as well.) I hope so ))
>
> I guess the problem with adding/removing nodes sequentially is data
> overstreaming and uneven load distribution. I mean, if we have three racks,
> it's better to add/remove three nodes at a time (one in each rack) and to
> avoid, for example, a state with four nodes.
>
> Any thoughts?


-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: Multiple nodes decommission

Posted by Vlad <qa...@yahoo.com>.
> range reassignments, which become effective after a successful decommission.

But while leaving, nodes announce themselves as "leaving". Do other leaving nodes take this into account and avoid streaming data to them? (This applies to joining as well.) I hope so ))

I guess the problem with adding/removing nodes sequentially is data overstreaming and uneven load distribution. I mean, if we have three racks, it's better to add/remove three nodes at a time (one in each rack) and to avoid, for example, a state with four nodes.

Any thoughts?
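
The rack imbalance mentioned above can be put into numbers with a short Python calculation. It is a simplified model, not a measurement: it assumes NetworkTopologyStrategy with a replication factor equal to the number of racks (so every rack holds one full copy of the data) and an even token distribution inside each rack. The function name and rack names are made up for illustration.

```python
from typing import Dict


def per_node_share(nodes_per_rack: Dict[str, int]) -> Dict[str, float]:
    """Fraction of the full data set held by each node of a rack,
    assuming every rack stores exactly one complete replica."""
    return {rack: 1.0 / count for rack, count in nodes_per_rack.items()}


# Four nodes over three racks: the two nodes sharing a rack hold half
# as much data as the nodes that have a rack to themselves.
four_nodes = per_node_share({"rack1": 2, "rack2": 1, "rack3": 1})

# Six nodes, two per rack: every node holds the same share.
six_nodes = per_node_share({"rack1": 2, "rack2": 2, "rack3": 2})
```

Under these assumptions the four-node layout gives shares of 0.5, 1.0 and 1.0, while the six-node layout gives 0.5 everywhere, which is the uneven intermediate state the message wants to avoid.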
 

    On Tuesday, April 11, 2017 7:55 PM, benjamin roth <br...@gmail.com> wrote:
 

I did not test it, but I'd bet that parallel decommissions will lead to inconsistencies.
Each decommission results in range movements and range reassignments, which become effective after a successful decommission.
If you start several decommissions at once, I guess the calculated reassignments are invalid for at least one node after the first node has finished the decommission process.

I hope someone will correct me if I am wrong.

Re: Multiple nodes decommission

Posted by benjamin roth <br...@gmail.com>.
I did not test it, but I'd bet that parallel decommissions will lead to
inconsistencies.
Each decommission results in range movements and range reassignments, which
become effective after a successful decommission.
If you start several decommissions at once, I guess the calculated
reassignments are invalid for at least one node after the first node
has finished the decommission process.

I hope someone will correct me if I am wrong.
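
To make the concern concrete, here is a toy model in Python. It is NOT Cassandra's algorithm: it assumes one token per node, a replication factor of 1, and that a leaving node hands its range to the next node clockwise. Whether Cassandra itself takes other leaving nodes into account is exactly the open question here; the sketch only shows why plans computed independently could point at a node that is also on its way out.

```python
from typing import Dict, List, Sequence, Set


def successor(ring: List[str], node: str, excluded: Set[str]) -> str:
    """First node clockwise after `node` that is not in `excluded`."""
    start = ring.index(node)
    for step in range(1, len(ring)):
        candidate = ring[(start + step) % len(ring)]
        if candidate not in excluded:
            return candidate
    raise ValueError("no node left to take over the range")


def independent_plans(ring: List[str], leaving: Sequence[str]) -> Dict[str, str]:
    """Each leaving node plans as if it were the only one leaving."""
    return {node: successor(ring, node, set()) for node in leaving}


def coordinated_plans(ring: List[str], leaving: Sequence[str]) -> Dict[str, str]:
    """Each leaving node skips every other node that is also leaving."""
    gone = set(leaving)
    return {node: successor(ring, node, gone) for node in leaving}


ring = ["A", "B", "C", "D"]
naive = independent_plans(ring, ["A", "B"])   # A hands its range to B
aware = coordinated_plans(ring, ["A", "B"])   # A and B both hand off to C
```

In the naive plan A would stream to B even though B is leaving too, so the reassignment computed for A is stale once B is gone. In the coordinated plan both ranges end up on C.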

2017-04-11 18:43 GMT+02:00 Jacob Shadix <ja...@gmail.com>:

> Are you using vnodes? I typically do one-by-one as the decommission will
> create additional load/network activity streaming data to the other nodes
> as the token ranges are reassigned.
>
> -- Jacob Shadix

Re: Multiple nodes decommission

Posted by Jacob Shadix <ja...@gmail.com>.
Are you using vnodes? I typically do one-by-one as the decommission will
create additional load/network activity streaming data to the other nodes
as the token ranges are reassigned.

-- Jacob Shadix

On Sat, Apr 8, 2017 at 10:55 AM, Vlad <qa...@yahoo.com> wrote:

> Hi,
>
> How should multiple nodes be decommissioned with "nodetool decommission":
> one by one or in parallel?
>
> Thanks.
>