Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2015/10/17 19:45:05 UTC

Would we have data corruption if we bootstrapped 10 nodes at once?

We just migrated from a 30 node cluster to a 45 node cluster. (so 15 new
nodes)

By default we have auto_bootstrap = false

so we just push our config to the cluster, the cassandra daemons restart,
and they're not cluster members; each thinks it's the only node in the cluster.
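For reference, that deploy amounts to shipping a cassandra.yaml along these lines (a sketch; the cluster name and seed IPs are hypothetical, and note that auto_bootstrap actually defaults to true when the setting is absent):

```yaml
# cassandra.yaml (fragment) -- cluster name and seed IPs are hypothetical
cluster_name: 'Spinn3r Cluster'    # hypothetical
auto_bootstrap: false              # join the ring WITHOUT streaming data from replicas
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.1,10.0.0.2"   # hypothetical seed nodes
```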

Anyway: while I was about halfway done adding the 15 nodes, about 7 of them
had joined the cluster and 8 had not yet joined.

We are only doing 1 at a time because apparently bootstrapping more than 1
is unsafe.

I did a rolling restart whereby I went through and restarted all the
cassandra boxes.

Somehow the new nodes auto-bootstrapped themselves EVEN though
auto_bootstrap=false.

We don't have any errors.  Everything seems functional.  I'm just worried
about data loss.

Thoughts?

Kevin

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Branton Davis <br...@spanning.com>.
On Tue, Oct 20, 2015 at 3:31 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Oct 20, 2015 at 9:13 AM, Branton Davis <branton.davis@spanning.com
> > wrote:
>
>>
>> Just to clarify, I was thinking about a scenario/disaster where we lost
>> the entire cluster and had to rebuild from backups.  I assumed we would
>> start each node with the backed up data and commit log directories already
>> there and with auto_bootstrap=false, and I also hoped that we could do all
>> nodes at once, since they each already had their data.  Is that wrong?  If
>> so, how would you handle such a situation?
>>
>
> "The one case I can imagine is when you are starting a cluster which is
> not receiving any write traffic and does contain snapshots. "
>
> The case you describe is in that class of cases.
>
> =Rob

Thanks for confirming!

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Oct 20, 2015 at 9:13 AM, Branton Davis <br...@spanning.com>
wrote:

>
> Just to clarify, I was thinking about a scenario/disaster where we lost
> the entire cluster and had to rebuild from backups.  I assumed we would
> start each node with the backed up data and commit log directories already
> there and with auto_bootstrap=false, and I also hoped that we could do all
> nodes at once, since they each already had their data.  Is that wrong?  If
> so, how would you handle such a situation?
>

"The one case I can imagine is when you are starting a cluster which is not
receiving any write traffic and does contain snapshots. "

The case you describe is in that class of cases.

=Rob




Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Branton Davis <br...@spanning.com>.
On Mon, Oct 19, 2015 at 5:42 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Mon, Oct 19, 2015 at 9:20 AM, Branton Davis <branton.davis@spanning.com
> > wrote:
>
>> Is that also true if you're standing up multiple nodes from backups that
>> already have data?  Could you not stand up more than one at a time since
>> they already have the data?
>>
>
> An operator probably almost never wants to add multiple
> not-previously-joined nodes to an active cluster via auto_bootstrap:false.
>
> The one case I can imagine is when you are starting a cluster which is not
> receiving any write traffic and does contain snapshots.
>
> =Rob
>

Just to clarify, I was thinking about a scenario/disaster where we lost the
entire cluster and had to rebuild from backups.  I assumed we would start
each node with the backed up data and commit log directories already there
and with auto_bootstrap=false, and I also hoped that we could do all nodes
at once, since they each already had their data.  Is that wrong?  If so,
how would you handle such a situation?

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 19, 2015 at 9:20 AM, Branton Davis <br...@spanning.com>
wrote:

> Is that also true if you're standing up multiple nodes from backups that
> already have data?  Could you not stand up more than one at a time since
> they already have the data?
>

An operator probably almost never wants to add multiple
not-previously-joined nodes to an active cluster via auto_bootstrap:false.

The one case I can imagine is when you are starting a cluster which is not
receiving any write traffic and does contain snapshots.

=Rob

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Branton Davis <br...@spanning.com>.
Is that also true if you're standing up multiple nodes from backups that
already have data?  Could you not stand up more than one at a time since
they already have the data?

On Mon, Oct 19, 2015 at 10:48 AM, Eric Stevens <mi...@gmail.com> wrote:

> It seems to me that as long as cleanup hasn't happened, if you
> *decommission* the newly joined nodes, they'll stream whatever writes
> they took back to the original replicas.  Presumably that should be pretty
> quick as they won't have nearly as much data as the original nodes (as they
> only hold data written while they were online).  Then as long as cleanup
> hasn't happened, your cluster should have returned to a consistent view of
> the data.  You can now bootstrap the new nodes again.
>
> If you have done a cleanup, then the data is probably irreversibly
> corrupted, you will have to figure out how to restore the missing data
> incrementally from backups if they are available.

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Eric Stevens <mi...@gmail.com>.
It seems to me that as long as cleanup hasn't happened, if you
*decommission* the newly joined nodes, they'll stream whatever writes they
took back to the original replicas.  Presumably that should be pretty quick
as they won't have nearly as much data as the original nodes (as they only
hold data written while they were online).  Then as long as cleanup hasn't
happened, your cluster should have returned to a consistent view of the
data.  You can now bootstrap the new nodes again.

If you have done a cleanup, then the data is probably irreversibly
corrupted; you will have to figure out how to restore the missing data
incrementally from backups, if they are available.
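The decommission-then-rebootstrap sequence above could be sketched as a shell script (host names are hypothetical, and with DRY_RUN=1 it only prints the plan so nothing touches the cluster until you've reviewed it):

```shell
#!/bin/sh
# Sketch of the recovery sequence: decommission the mistakenly joined
# nodes one at a time, then re-add them with a real bootstrap.
DRY_RUN=1

# Hypothetical names of the nodes that joined with auto_bootstrap=false:
NEW_NODES="cass-31 cass-32 cass-33"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "WOULD RUN: $*"
  else
    "$@"
  fi
}

# 1. Decommission each wrongly joined node: it streams the writes it took
#    back to the original replicas and leaves the ring. Do NOT run cleanup.
for host in $NEW_NODES; do
  run ssh "$host" nodetool decommission
done

# 2. Re-add them one at a time with a real bootstrap (auto_bootstrap: true),
#    so each streams its token ranges from existing replicas before serving reads.
for host in $NEW_NODES; do
  run ssh "$host" "sudo sed -i 's/auto_bootstrap: false/auto_bootstrap: true/' /etc/cassandra/cassandra.yaml"
  run ssh "$host" sudo service cassandra start
done
```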

On Sun, Oct 18, 2015 at 10:37 PM Raj Chudasama <ra...@gmail.com>
wrote:

> In this case, does it make sense to remove the newly added nodes, correct
> the configuration, and have them rejoin one at a time?
>
> Thx

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Raj Chudasama <ra...@gmail.com>.
In this case, does it make sense to remove the newly added nodes, correct the configuration, and have them rejoin one at a time?

Thx

Sent from my iPhone

> On Oct 18, 2015, at 11:19 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
> 
> Take a snapshot now, before you get rid of any data (whatever you do, don’t run cleanup). 
> 
> If you identify missing data, you can go back to those snapshots, find the nodes that had the data previously (sstable2json, for example), and either re-stream that data into the cluster with sstableloader or copy it to a new host and `nodetool refresh` it into the new system.

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Jeff Jirsa <je...@crowdstrike.com>.
Take a snapshot now, before you get rid of any data (whatever you do, don’t run cleanup). 

If you identify missing data, you can go back to those snapshots, find the nodes that had the data previously (sstable2json, for example), and either re-stream that data into the cluster with sstableloader or copy it to a new host and `nodetool refresh` it into the new system.
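Written out as a runbook sketch (the keyspace, table, snapshot name, paths, and seed IP are hypothetical placeholders; the script only prints the commands so they can be adapted per cluster):

```shell
#!/bin/sh
# Snapshot-and-restore runbook sketch. Nothing here executes against a
# cluster; the commands are printed for review.
RUNBOOK=$(cat <<'EOF'
# 1. Preserve current state on every node BEFORE anything else
#    (and whatever you do, do NOT run cleanup):
nodetool snapshot -t pre-recovery my_keyspace

# 2. On the old nodes, inspect a snapshotted sstable to see which rows
#    they previously held (sstable2json ships with Cassandra 2.x):
sstable2json /var/lib/cassandra/data/my_keyspace/my_table/snapshots/pre-recovery/my_keyspace-my_table-ka-1-Data.db

# 3a. Re-stream a snapshot's data into the live cluster:
sstableloader -d 10.0.0.1 /path/to/snapshot/my_keyspace/my_table/

# 3b. ...or copy the sstables into a node's data directory and load them
#     in place:
nodetool refresh my_keyspace my_table
EOF
)
printf '%s\n' "$RUNBOOK"
```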



Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Jeff Jirsa <je...@crowdstrike.com>.
Worth noting that repair may not work: it’s possible that NONE of the nodes that actually have the data (for some given row) are still valid replicas according to the DHT/tokens, so repair will not find any replica holding that data.



Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Robert Coli <rc...@eventbrite.com>.
On Sun, Oct 18, 2015 at 8:10 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> ouch.. OK.. I think I really shot myself in the foot here then.  This
> might be bad.
>

Yep.

https://issues.apache.org/jira/browse/CASSANDRA-7069 - "Prevent operator
mistakes due to simultaneous bootstrap"

But this doesn't handle your case, where you force joined a bunch of nodes
with auto_bootstrap=false.

Probably if I were in your case (and realized it immediately) I would
decommission all nodes and then start again. I probably would not run
repair, though that would also work. I agree with jeffj down-thread that
you should not run cleanup.

=Rob




Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Kevin Burton <bu...@spinn3r.com>.
ouch.. OK.. I think I really shot myself in the foot here then.  This might
be bad.

I'm not sure if I would have missing data.  I mean basically the data is on
the other nodes.. but the cluster has been running with 10 nodes
accidentally bootstrapped with auto_bootstrap=false.

So they have new data and seem to be missing values.

This is somewhat misleading... initially, if you start a node up and run
`nodetool status`, it only returns one node.

So I assumed auto_bootstrap=false meant that it just doesn't join the
cluster.

I'm running a nodetool repair now to hopefully fix this.



On Sun, Oct 18, 2015 at 7:25 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> auto_bootstrap=false tells it to join the cluster without running
> bootstrap – the node assumes it has all of the necessary data, and won’t
> stream any missing data.
>
> This generally violates consistency guarantees, but if done on a single
> node, is typically correctable with `nodetool repair`.
>
> If you do it on many  nodes at once, it’s possible that the new nodes
> could represent all 3 replicas of the data, but don’t physically have any
> of that data, leading to missing records.

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Jeff Jirsa <je...@crowdstrike.com>.
auto_bootstrap=false tells it to join the cluster without running bootstrap – the node assumes it has all of the necessary data, and won’t stream any missing data.

This generally violates consistency guarantees, but if done on a single node, is typically correctable with `nodetool repair`.

If you do it on many nodes at once, it’s possible that the new nodes could represent all 3 replicas of the data, but don’t physically have any of that data, leading to missing records.
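That "all replicas empty" situation can be illustrated with a toy model of the token ring (not Cassandra's real placement code: made-up node names, a tiny ring, and RF=3 are all assumptions). A key lives on the owner of the next token plus the following two owners; if three newly joined empty nodes happen to own adjacent tokens, every replica of some keys is an empty node:

```shell
#!/bin/sh
# Toy token ring: "token owner" pairs. new-1..new-3 joined empty with
# auto_bootstrap=false but now own adjacent tokens.
RING='10 old-a
20 new-1
30 new-2
40 new-3
50 old-b
60 old-c'

# lookup KEYTOKEN -> prints the RF=3 replica owners for that key:
# the first node with token >= key, then the next two around the ring.
lookup() {
  printf '%s\n' "$RING" | sort -n | awk -v key="$1" '
    { tok[NR] = $1; who[NR] = $2 }
    END {
      n = NR; start = 1
      for (i = 1; i <= n; i++) if (tok[i] >= key) { start = i; break }
      for (k = 0; k < 3; k++) { i = ((start - 1 + k) % n) + 1; print who[i] }
    }'
}

# A key with token 15 maps to new-1, new-2, new-3: all three replicas are
# the empty nodes, so its data is unreachable until restored or repaired.
lookup 15
```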





Re: Would we have data corruption if we bootstrapped 10 nodes at once?

Posted by Kevin Burton <bu...@spinn3r.com>.
Ah shit.. I think we're seeing corruption.. missing records :-/
