Posted to user@cassandra.apache.org by Fd Habash <fm...@gmail.com> on 2019/05/01 20:12:01 UTC

RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node:Consistency Guarantees

I probably needed to be clearer in my inquiry.

I’m investigating a situation where our diagnostic data is telling us that C* has lost some of the application data. Specifically, getsstables for the data returns zero on all nodes in all racks.

The Last Pickle article below and Jeff Jirsa have described a situation where bootstrapping a node to extend the cluster can lose data if the new node bootstraps from a stale SECONDARY replica (a node that was offline longer than the hinted handoff window). This was fixed in CASSANDRA-2434. http://thelastpickle.com/blog/2017/05/23/auto-bootstrapping-part1.html

The article and the Jira above describe bootstrapping when extending a cluster.

I understand that replacing a dead node does not involve range movement, but will the above Jira fix prevent the bootstrap process from using a secondary replica when replacing a dead node?

Thanks 

----------------
Thank you

From: Fred Habash
Sent: Wednesday, May 1, 2019 6:50 AM
To: user@cassandra.apache.org
Subject: Re: Bootstrapping to Replace a Dead Node vs. Adding a New Node:Consistency Guarantees

Thank you. 

Range movement is one reason this is enforced when adding a new node. But what about forcing a consistent bootstrap, i.e. bootstrapping from the primary owner of the range and not from a secondary replica?

How is consistent bootstrap enforced when replacing a dead node?

————-
Thank you. 

On Apr 30, 2019, at 7:40 PM, Alok Dwivedi <al...@instaclustr.com> wrote:
When a new node joins the ring, it needs to own new token ranges. These should be unique to the new node (and ideally evenly distributed), and we don’t want to end up in a situation where two nodes joining simultaneously can own the same range. Cassandra has a 2 minute wait rule for gossip state to propagate before a node is added, but this on its own does not guarantee that token ranges can’t overlap. See this ticket for more details: https://issues.apache.org/jira/browse/CASSANDRA-7069. To overcome this issue, the approach was to allow only one node to join at a time.

When you replace a dead node, the new token range selection does not apply, as the replacing node just takes over the token ranges of the dead node. I think that’s why the one-node-at-a-time restriction does not apply in this case.
 
 
Thanks 
Alok Dwivedi
Senior Consultant
https://www.instaclustr.com/platform/
 
 
 
 
 
From: Fd Habash <fm...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, 1 May 2019 at 06:18
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees 
 
After reviewing the documentation and based on my testing with C* 2.2.8, I was not able to extend the cluster by adding multiple nodes simultaneously. I got this error message:
 
Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true
 
I understand this is to force a node to bootstrap from the former owner of the range when adding a node as part of extending the cluster.
 
However, I was able to bootstrap multiple nodes to replace dead nodes. C* did not complain about it.
 
Is consistent range movement, and the guarantee it offers to bootstrap from the primary range owner, not applicable when bootstrapping to replace dead nodes?
 
----------------
Thank you

Re: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

Posted by Alok Dwivedi <al...@instaclustr.com>.

CASSANDRA-2434 ensures that when we add a new node, it streams data from the replica that will give up ownership of the range once the data has been completely streamed. This is explained in detail in the blog post you shared. It ensures that you continue to get the same consistency as before the new node was added. So if new node D now owns data for a token range that was originally owned by replicas A, B & C, then this fix ensures that if D streams from A, then A no longer owns that token range once D has fully joined the cluster. It avoids the previous issue where D could stream from A, but B is the one that later gives up its ownership of the range to D. If A never had the data, you have effectively lost what you had in B, because B no longer owns that token range. Hence CASSANDRA-2434 helps with consistency by ensuring that the node used for streaming (A) is the one that no longer owns the data, so the new node (D) together with the remaining replicas (B & C) should give you the same consistency as you had before D joined the cluster.
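
To make the A/B/C/D scenario above concrete, here is a minimal sketch in Python. It is only a toy model of one token range with three replicas, not Cassandra code: the row name "k1" and the functions bootstrap and quorum_read_can_miss are hypothetical names made up for illustration. It assumes a write that reached B and C but not A, and compares streaming from a replica that stays with streaming from the replica that gives up the range.

```python
# Toy model of one token range with RF=3. Node names follow the message above;
# the row "k1" and all function names are hypothetical, not Cassandra APIs.

def bootstrap(replicas, source, leaving, new_node="D"):
    """Contents of each replica after new_node streams the range from
    `source` and `leaving` gives up ownership of the range."""
    after = {node: set(rows) for node, rows in replicas.items() if node != leaving}
    after[new_node] = set(replicas[source])
    return after


def quorum_read_can_miss(replicas, row):
    """True if a quorum of replicas exists in which no replica holds `row`."""
    quorum = len(replicas) // 2 + 1
    non_holders = sum(1 for rows in replicas.values() if row not in rows)
    return non_holders >= quorum


# "k1" was written at QUORUM and reached B and C; A was down and missed it.
before = {"A": set(), "B": {"k1"}, "C": {"k1"}}

# Pre-fix risk: D streams from A, but B is the replica that gives up the range.
inconsistent = bootstrap(before, source="A", leaving="B")

# Consistent range movement: the streaming source is the replica that leaves.
consistent = bootstrap(before, source="B", leaving="B")

print(quorum_read_can_miss(inconsistent, "k1"))  # True
print(quorum_read_can_miss(consistent, "k1"))    # False
```

In the first case only C still holds the row, so a quorum read served by A and D can miss it. In the second case two of the three owners hold the row, so every quorum includes at least one of them.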

Replacing a dead node is different in the sense that the node from which the replacing node streams data will also continue to own that data. Say you had nodes A, B and C, C is dead, and you replace C with D. D can stream from either A or B, but whichever it chooses will also continue to own that token range, i.e. after D replaces C we have A, B & D instead of A, B & C (as C is dead).

My understanding is that the restriction of a single node at a time was applied at cluster expansion time to avoid clashes in token selection, which only matters when extending the cluster by adding a new node (not when replacing a dead node). This is what CASSANDRA-7069 addresses.

I think in your case, when replacing more than one node, doing it serially won’t in theory overcome the issue I guess you are highlighting here: if I have to stream from A or B, how do I cover the case where A holds some of the correct data while B holds some other correct data? I think streaming will use one source, so whether you do it serially or several at a time you have that risk (IMO). If I were you, I would do it one node at a time to avoid overloading the cluster, and then run a repair to sync any data that might have been missed because the source chosen during streaming didn’t have it. Then I would repeat the same steps for the next dead node to be replaced.
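
The replace-then-repair sequence suggested above can be sketched in the same toy style. This is a simplified model under stated assumptions: replace_dead_node and repair are hypothetical helper names, and repair is modelled as a plain union of rows, whereas real Cassandra repair reconciles data by timestamp.

```python
# Toy model: C is dead, A and B are the surviving replicas of one token range.
# Function names and rows are hypothetical; repair is simplified to a set union.

def replace_dead_node(live_replicas, source, new_node="D"):
    """The new node streams the range from a single live source replica."""
    after = {node: set(rows) for node, rows in live_replicas.items()}
    after[new_node] = set(live_replicas[source])
    return after


def repair(replicas):
    """Simplified repair: every replica ends up with the union of all rows."""
    merged = set().union(*replicas.values())
    return {node: set(merged) for node in replicas}


# A and B each hold a row the other one is missing.
live = {"A": {"k1"}, "B": {"k2"}}

after_replace = replace_dead_node(live, source="A")
after_repair = repair(after_replace)

print(sorted(after_replace["D"]))  # ['k1']
print(sorted(after_repair["D"]))   # ['k1', 'k2']
```

Whichever single source D streams from, it inherits that source's gaps. Only the repair step brings A, B and D back in sync, which is why running it before moving on to the next dead node is the safer order.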


Thanks
Alok Dwivedi
Senior Consultant
https://www.instaclustr.com/platform/





From: Fd Habash <fm...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Thursday, 2 May 2019 at 08:26
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: RE: Bootstrapping to Replace a Dead Node vs. Adding a NewNode:Consistency Guarantees

Appreciate your response.

As for extending the cluster while keeping the default cassandra.consistent.rangemovement=true, C* won’t allow me to bootstrap multiple nodes anyway.

But the question I’m still posing, and have not gotten an answer to, is this: if the CASSANDRA-2434 fix disallows bootstrapping multiple nodes to extend the cluster (which I was able to confirm in my lab cluster), why did it allow bootstrapping multiple nodes when replacing dead nodes (no range calculation)?

This fix forces a node to bootstrap from the former owner. Is this also the case when bootstrapping to replace a dead node?


----------------
Thank you

From: ZAIDI, ASAD A
Sent: Wednesday, May 1, 2019 5:13 PM
To: user@cassandra.apache.org
Subject: RE: Bootstrapping to Replace a Dead Node vs. Adding a NewNode:Consistency Guarantees


The article you mentioned clearly says “For new users to Cassandra, the safest way to add multiple nodes into a cluster is to add them one at a time. Stay tuned as I will be following up with another post on bootstrapping.”

When extending a cluster it is indeed recommended to go slowly and serially. Optionally you can use cassandra.consistent.rangemovement=false, but you may end up with over-streamed data. Since you’re using a release much newer than the one in which the fix was introduced, I assume you won’t see the behavior described for the versions the fix addresses. If the data is not consistent after adding a node, your query consistency level should still be able to return consistent data, provided you can tolerate a bit of latency until your repair is complete. If you follow the recommendation and add one node at a time, you’ll avoid all these nuances.




RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

Posted by Fd Habash <fm...@gmail.com>.

Appreciate your response.

As for extending the cluster while keeping the default cassandra.consistent.rangemovement=true, C* won’t allow me to bootstrap multiple nodes anyway.

But the question I’m still posing, and have not gotten an answer to, is this: if the CASSANDRA-2434 fix disallows bootstrapping multiple nodes to extend the cluster (which I was able to confirm in my lab cluster), why did it allow bootstrapping multiple nodes when replacing dead nodes (no range calculation)?

This fix forces a node to bootstrap from the former owner. Is this also the case when bootstrapping to replace a dead node?


----------------
Thank you


RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node:Consistency Guarantees

Posted by "ZAIDI, ASAD A" <az...@att.com>.

The article you mentioned clearly says “For new users to Cassandra, the safest way to add multiple nodes into a cluster is to add them one at a time. Stay tuned as I will be following up with another post on bootstrapping.”

When extending a cluster it is indeed recommended to go slowly and serially. Optionally you can use cassandra.consistent.rangemovement=false, but you may end up with over-streamed data. Since you’re using a release much newer than the one in which the fix was introduced, I assume you won’t see the behavior described for the versions the fix addresses. If the data is not consistent after adding a node, your query consistency level should still be able to return consistent data, provided you can tolerate a bit of latency until your repair is complete. If you follow the recommendation and add one node at a time, you’ll avoid all these nuances.
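
The point about the query consistency level rests on quorum arithmetic, which a short Python sketch can illustrate. The function names are hypothetical and the sketch assumes a single datacenter with replication factor 3, which the thread does not state. It only shows when the set of replicas that acknowledged a write must overlap the set consulted by a read.

```python
# Quorum arithmetic sketch. Function names are hypothetical; RF=3 is an
# assumption for illustration, not a value taken from the thread.

def quorum(replication_factor):
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, write_replicas, read_replicas):
    """True if every read set must share a replica with every write set."""
    return write_replicas + read_replicas > replication_factor


rf = 3
print(quorum(rf))                                        # 2
print(read_overlaps_write(rf, quorum(rf), quorum(rf)))   # True
print(read_overlaps_write(rf, 1, 1))                     # False
```

With QUORUM writes and QUORUM reads at RF=3, the overlap holds. With ONE for both it does not, so in that case a read before repair completes may not see the latest data.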
