Posted to user@cassandra.apache.org by Or Sher <or...@gmail.com> on 2015/04/20 11:02:44 UTC

Adding nodes to existing cluster

Hi all,
In the near future I'll need to add more than 10 nodes to a 2.0.9
cluster (using vnodes).
I read this documentation on the DataStax website:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html

At one point it says:
"If you are using racks, you can safely bootstrap two nodes at a time
when both nodes are on the same rack."

And at another it says:
"Start Cassandra on each new node. Allow two minutes between node
initializations. You can monitor the startup and data streaming
process using nodetool netstats."

We're not using a rack configuration, and from reading this
documentation I'm not really sure whether it is safe for us to bootstrap all
nodes together (with two minutes between each).
I really hate the thought of doing it one by one; I assume it will take
more than 6 hours per node.

What do you say?
-- 
Or Sher

RE: Adding nodes to existing cluster

Posted by Matthew Johnson <ma...@algomi.com>.
Hi Colin,

To remove your address from the list, send a message to:

   <us...@cassandra.apache.org>

Cheers,

Matt


Re: Adding nodes to existing cluster

Posted by Colin Clark <co...@clark.ws>.
unsubscribe




Re: Adding nodes to existing cluster

Posted by Sebastian Estevez <se...@datastax.com>.
The documentation is referring to Consistent Range Movements.

There is a change in 2.1 that won't allow you to bootstrap multiple nodes
at the same time unless you explicitly turn off consistent range movements.
Check out the JIRA ticket:

https://issues.apache.org/jira/browse/CASSANDRA-2434

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com



Re: Adding nodes to existing cluster

Posted by Or Sher <or...@gmail.com>.
OK.
Thanks.
I'll monitor resource usage (network, memory, CPU, I/O) as I go
and try to bootstrap them in chunks that don't seem to have a bad
impact.
Will do regarding the cleanup.

Thanks!



-- 
Or Sher

Re: Adding nodes to existing cluster

Posted by Carlos Rolo <ro...@pythian.com>.
Independent of the snitch, data needs to travel to the new nodes (plus all
the keyspace information that goes via gossip). So I wouldn't bootstrap them
all at once, if only because of the network traffic generated.

Don't forget to run cleanup on the old nodes once all nodes are in place to
reclaim disk space.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com
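
The cleanup step mentioned above could be scripted roughly as follows. This is only a sketch: the node addresses are hypothetical, and using `-h` assumes nodetool can reach each node's JMX port from wherever the script runs (otherwise run `nodetool cleanup` locally on each old node). The commands are printed rather than executed.

```python
from typing import List


def cleanup_commands(old_nodes: List[str]) -> List[List[str]]:
    """Build one `nodetool cleanup` invocation per pre-existing node."""
    # `-h` points nodetool at a remote host; this assumes JMX is reachable.
    return [["nodetool", "-h", host, "cleanup"] for host in old_nodes]


# Hypothetical addresses of the nodes that were in the cluster before the expansion.
OLD_NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

for cmd in cleanup_commands(OLD_NODES):
    # Printed, not executed. To run for real, replace the print with
    # subprocess.run(cmd, check=True) so each cleanup finishes before the next starts.
    print(" ".join(cmd))
```

Running the cleanups one node at a time is a choice made here to limit the extra disk I/O on the cluster at any moment; it is not something the thread prescribes.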

Re: Adding nodes to existing cluster

Posted by Or Sher <or...@gmail.com>.
Thanks for the response.
Sure, we'll monitor as we add nodes.
We currently have 6 nodes in each DC (we have 2 DCs).
Each node holds ~800 GB.

Do you know how rack configurations are relevant here?
Do you see any reason to bootstrap them one by one if we're not using
rack awareness?





-- 
Or Sher
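
For a sense of scale, the figures in this thread can be turned into a back-of-the-envelope estimate. The sketch below assumes data is evenly distributed across nodes and that the 10 new nodes are split evenly, 5 per DC; neither assumption is stated in the thread.

```python
def load_after_expansion_gb(existing_nodes: int, load_per_node_gb: float, new_nodes: int) -> float:
    """Average per-node load once the same data is spread over more nodes."""
    total_gb = existing_nodes * load_per_node_gb
    return total_gb / (existing_nodes + new_nodes)


# Figures from the thread: 6 nodes per DC, ~800 GB each.
# Assumption: 10 new nodes in total, split evenly as 5 per DC.
per_node = load_after_expansion_gb(existing_nodes=6, load_per_node_gb=800, new_nodes=5)
print(f"~{per_node:.0f} GB per node after expansion")

# Sequential bootstrap time, using the thread's guess of 6 hours per node.
print(f"{10 * 6} hours if all 10 nodes are added one at a time")
```

This is the steady state after cleanup. When nodes are added one at a time, the first new node in a DC would initially take about one seventh of that DC's data, which is more than the final per-node figure.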

Re: Adding nodes to existing cluster

Posted by Carlos Rolo <ro...@pythian.com>.
Start one node at a time. Wait 2 minutes before starting each node.


How much data and how many nodes do you already have? Depending on that,
streaming the data can put stress on the resources you have.
I would recommend starting one node and monitoring it; if things are OK, add
another one, and so on.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com
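
The "start one, monitor, add another" approach above could be gated on a check like the one sketched here. It reads the two-letter codes at the start of each node row in `nodetool status` output (UN for Up/Normal, UJ for Up/Joining) and reports whether any node is still joining. The sample listing is made up for illustration, and treating "no UJ rows" as the signal to start the next node is an assumption of this sketch, not a rule from the thread.

```python
from typing import List


def joining_nodes(status_output: str) -> List[str]:
    """Return the addresses of nodes reported as UJ (Up/Joining)."""
    joining = []
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows begin with a two-letter code: status (U/D) then state (N/L/J/M).
        if len(fields) >= 2 and fields[0] == "UJ":
            joining.append(fields[1])
    return joining


def safe_to_start_next(status_output: str) -> bool:
    """True when no node in the listing is still bootstrapping."""
    return not joining_nodes(status_output)


# Illustrative listing only; addresses, loads and host IDs are made up.
SAMPLE = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns   Host ID  Rack
UN  10.0.0.1  800.1 GB  256     16.7%  aaaa     rack1
UJ  10.0.0.7  120.4 GB  256     ?      bbbb     rack1
"""

print(joining_nodes(SAMPLE))
print(safe_to_start_next(SAMPLE))
```

In practice the text would come from running `nodetool status` on a live node, and resource checks (network, CPU, I/O) would still be needed alongside it.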
