Posted to user@cassandra.apache.org by Cogumelos Maravilha <co...@sapo.pt> on 2017/08/17 16:47:46 UTC

Adding a new node with the double of disk space

Hi all,

I need to add a new node to my cluster, but this time the new node will
have double the disk space of the other nodes.

I'm using the default vnodes (num_tokens: 256). To fully use the disk
space on the new node, do I just have to configure num_tokens: 512?

Thanks in advance.



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Re: Adding a new node with the double of disk space

Posted by Jeff Jirsa <jj...@gmail.com>.
You'd use a different num_tokens only if you wanted an imbalance (e.g. new hardware specs where you wanted to use fewer, larger machines).

-- 
Jeff Jirsa
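
To put rough numbers on the kind of imbalance described above: if you assume each node's share of the ring is simply proportional to its num_tokens (a simplification that ignores replication factor, rack placement, and the randomness of token selection), the expected split can be worked out directly. The node names and the three-node starting cluster below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def expected_ownership(num_tokens_by_node):
    """Expected ring share per node, ASSUMING share is proportional to num_tokens."""
    total = sum(num_tokens_by_node.values())
    return {node: Fraction(tokens, total) for node, tokens in num_tokens_by_node.items()}


# Hypothetical cluster: three existing nodes at the default 256 tokens,
# plus one new node configured with num_tokens: 512.
cluster = {"node1": 256, "node2": 256, "node3": 256, "new_node": 512}
shares = expected_ownership(cluster)

for node, share in shares.items():
    print(f"{node}: {share} of the ring ({float(share):.0%})")
```

Under that assumption the new node would be expected to hold twice the share of each older node, which is the deliberate imbalance being discussed.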


> On Aug 19, 2017, at 6:04 PM, Subroto Barua <sb...@yahoo.com.INVALID> wrote:
> 
> Jeff,
> 
> Is it OK to have different values of num_tokens per node in a cluster? Won't it create cluster imbalance? Or is it better to initiate it in a separate DC?
> 
> Subroto
> 
> On Friday, August 18, 2017, 5:34:11 AM PDT, Durity, Sean R <SE...@homedepot.com> wrote:
> 
> I am doing some on-the-job learning on this newer feature of the 3.x line, where the token generation algorithm will compensate for differently sized nodes in a cluster. In fact, it is one of the main reasons I upgraded to 3.0.13, because I have a number of original nodes in a cluster that are about half the size of the newer nodes. With the same number of vnodes, they can get overwhelmed with too much data and have to be rebuilt, etc.
> 
> So, I am cutting vnodes in half on those original nodes and rebuilding them. So far, it is working as designed. The data size is about half on the smaller nodes.
> 
> With the more current advice being to use fewer vnodes, for the original question below, I might consider adding the new node in at 256 vnodes and then rebuilding all the other nodes at 128. Of course, the cluster size and amount of data would be important factors, as well as the future growth of the cluster and the expected size of any additional nodes.
> 
> Sean Durity
> 
> From: Jeff Jirsa [mailto:jjirsa@gmail.com]
> Sent: Thursday, August 17, 2017 4:20 PM
> To: cassandra <us...@cassandra.apache.org>
> Subject: Re: Adding a new node with the double of disk space
> 
> If you really double the hardware in every way, it's PROBABLY reasonable to double num_tokens. It won't be quite the same as doubling all-the-things, because you still have a single JVM, and you'll still have to deal with GC as you're now reading twice as much and generating twice as much garbage, but you can probably adjust the tuning of the heap to compensate.
> 
> On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor <ke...@reddit.com.invalid> wrote:
> 
> Are you saying if a node had double the hardware capacity in every way it would be a bad idea to up num_tokens? I thought that was the whole idea of that setting though?
> 
> On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo <ro...@pythian.com> wrote:
> 
> No.
> 
> Even if you doubled all the hardware on that node compared to the others, it would still be a bad idea.
> Keep the cluster uniform vnode-wise.
> 
> Regards,
> 
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
> 
> Pythian - Love your data
> 
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: linkedin.com/in/carlosjuzarterolo
> Mobile: +351 918 918 100
> www.pythian.com
> 
> On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha <co...@sapo.pt> wrote:
> 
> Hi all,
> 
> I need to add a new node to my cluster, but this time the new node will
> have double the disk space of the other nodes.
> 
> I'm using the default vnodes (num_tokens: 256). To fully use the disk
> space on the new node, do I just have to configure num_tokens: 512?
> 
> Thanks in advance.
> 
> The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.

Re: RE: Adding a new node with the double of disk space

Posted by Subroto Barua <sb...@yahoo.com.INVALID>.
Jeff,

Is it OK to have different values of num_tokens per node in a cluster? Won't it create cluster imbalance? Or is it better to initiate it in a separate DC?

Subroto


RE: Adding a new node with the double of disk space

Posted by "Durity, Sean R" <SE...@homedepot.com>.
I am doing some on-the-job learning on this newer feature of the 3.x line, where the token generation algorithm will compensate for differently sized nodes in a cluster. In fact, it is one of the main reasons I upgraded to 3.0.13, because I have a number of original nodes in a cluster that are about half the size of the newer nodes. With the same number of vnodes, they can get overwhelmed with too much data and have to be rebuilt, etc.

So, I am cutting vnodes in half on those original nodes and rebuilding them. So far, it is working as designed. The data size is about half on the smaller nodes.

With the more current advice being to use fewer vnodes, for the original question below, I might consider adding the new node in at 256 vnodes and then rebuilding all the other nodes at 128. Of course, the cluster size and amount of data would be important factors, as well as the future growth of the cluster and the expected size of any additional nodes.


Sean Durity
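
The two options in this thread (new node at 512 with the others left at 256, versus new node at 256 with the others rebuilt at 128) can be compared with a small sketch. It only does arithmetic on the num_tokens values; the three older nodes and all node names are hypothetical, and it says nothing about the cost of rebuilding nodes.

```python
def summarize(plan):
    """Total tokens in the cluster and each node's weight relative to the smallest node."""
    total = sum(plan.values())
    smallest = min(plan.values())
    weights = {node: tokens / smallest for node, tokens in plan.items()}
    return {"total_tokens": total, "relative_weight": weights}


# Hypothetical cluster of three older nodes plus one node with double the capacity.
plan_double_new = {"old1": 256, "old2": 256, "old3": 256, "new": 512}
plan_halve_old = {"old1": 128, "old2": 128, "old3": 128, "new": 256}

double_summary = summarize(plan_double_new)
halve_summary = summarize(plan_halve_old)

print("512/256 plan:", double_summary)
print("256/128 plan:", halve_summary)
```

Both plans give the larger node the same 2:1 weighting; the difference is that the second plan ends up with half as many tokens in the cluster overall, which is in line with the advice to use fewer vnodes.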


Re: Adding a new node with the double of disk space

Posted by Carlos Rolo <ro...@pythian.com>.
I would rather run two JVMs on the same hardware (if you double
everything) than have to deal with what Jeff described.

Also, certain operations (e.g. repair) are not really fond of a large number
of vnodes. There were a lot of improvements in the 3.x release cycle, but
I still tend to reduce the number of vnodes rather than increase it.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com

On Thu, Aug 17, 2017 at 9:19 PM, Jeff Jirsa <jj...@gmail.com> wrote:

> If you really double the hardware in every way, it's PROBABLY reasonable
> to double num_tokens. It won't be quite the same as doubling
> all-the-things, because you still have a single JVM, and you'll still have
> to deal with GC as you're now reading twice as much and generating twice as
> much garbage, but you can probably adjust the tuning of the heap to
> compensate.
>
>
>
> On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor <ke...@reddit.com.invalid>
> wrote:
>
>> Are you saying if a node had double the hardware capacity in every way it
>> would be a bad idea to up num_tokens? I thought that was the whole idea of
>> that setting though?
>>
>> On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo <ro...@pythian.com> wrote:
>>
>>> No.
>>>
>>> Even if you doubled all the hardware on that node compared to the
>>> others, it would still be a bad idea.
>>> Keep the cluster uniform vnode-wise.
>>>
>>> Regards,
>>>
>>> Carlos Juzarte Rolo
>>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>>
>>> Pythian - Love your data
>>>
>>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
>>> linkedin.com/in/carlosjuzarterolo
>>> Mobile: +351 918 918 100
>>> www.pythian.com
>>>
>>> On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha <
>>> cogumelosmaravilha@sapo.pt> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I need to add a new node to my cluster, but this time the new node will
>>>> have double the disk space of the other nodes.
>>>>
>>>> I'm using the default vnodes (num_tokens: 256). To fully use the disk
>>>> space on the new node, do I just have to configure num_tokens: 512?
>>>>
>>>> Thanks in advance.

Re: Adding a new node with the double of disk space

Posted by Jeff Jirsa <jj...@gmail.com>.
If you really double the hardware in every way, it's PROBABLY reasonable to
double num_tokens. It won't be quite the same as doubling all-the-things,
because you still have a single JVM, and you'll still have to deal with GC
as you're now reading twice as much and generating twice as much garbage,
but you can probably adjust the tuning of the heap to compensate.
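
A rough way to see why doubling num_tokens should roughly double a node's share of the data is to simulate a ring. This is only a sketch under stated assumptions: tokens are picked uniformly at random (not the newer 3.x allocation algorithm), the ring is simplified to the range [0, 2**64), only primary ranges are counted (replication is ignored), and the node names and cluster size are hypothetical.

```python
import random

RING_SIZE = 2**64  # simplified ring: tokens fall in [0, 2**64)


def simulate_ownership(num_tokens_by_node, seed=0):
    """Fraction of the ring owned by each node when tokens are chosen at random."""
    rng = random.Random(seed)
    # One (token, node) entry per vnode, sorted by position on the ring.
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node, count in num_tokens_by_node.items()
        for _ in range(count)
    )
    owned = dict.fromkeys(num_tokens_by_node, 0)
    # Each token owns the range from the previous token up to itself;
    # the first token's range wraps around from the last token.
    previous = ring[-1][0] - RING_SIZE
    for token, node in ring:
        owned[node] += token - previous
        previous = token
    return {node: span / RING_SIZE for node, span in owned.items()}


cluster = {"node1": 256, "node2": 256, "node3": 256, "new_node": 512}
for node, share in simulate_ownership(cluster).items():
    print(f"{node}: {share:.1%} of the ring")
```

With random tokens the shares will not be exact, so the split only approximates the 2:1 weighting. The sketch covers data placement only; it does not model the single-JVM and GC concerns raised above.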



On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor <ke...@reddit.com.invalid>
wrote:

> Are you saying if a node had double the hardware capacity in every way it
> would be a bad idea to up num_tokens? I thought that was the whole idea of
> that setting though?
>
> On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo <ro...@pythian.com> wrote:
>
>> No.
>>
>> Even if you doubled all the hardware on that node compared to the
>> others, it would still be a bad idea.
>> Keep the cluster uniform vnode-wise.
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
>> linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 918 918 100
>> www.pythian.com
>>
>> On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha <
>> cogumelosmaravilha@sapo.pt> wrote:
>>
>>> Hi all,
>>>
>>> I need to add a new node to my cluster, but this time the new node will
>>> have double the disk space of the other nodes.
>>>
>>> I'm using the default vnodes (num_tokens: 256). To fully use the disk
>>> space on the new node, do I just have to configure num_tokens: 512?
>>>
>>> Thanks in advance.

Re: Adding a new node with the double of disk space

Posted by Kevin O'Connor <ke...@reddit.com.INVALID>.
Are you saying if a node had double the hardware capacity in every way it
would be a bad idea to up num_tokens? I thought that was the whole idea of
that setting though?

On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo <ro...@pythian.com> wrote:

> No.
>
> Even if you doubled all the hardware on that node compared to the
> others, it would still be a bad idea.
> Keep the cluster uniform vnode-wise.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> linkedin.com/in/carlosjuzarterolo
> Mobile: +351 918 918 100
> www.pythian.com
>
> On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha <
> cogumelosmaravilha@sapo.pt> wrote:
>
>> Hi all,
>>
>> I need to add a new node to my cluster, but this time the new node will
>> have double the disk space of the other nodes.
>>
>> I'm using the default vnodes (num_tokens: 256). To fully use the disk
>> space on the new node, do I just have to configure num_tokens: 512?
>>
>> Thanks in advance.

Re: Adding a new node with the double of disk space

Posted by Carlos Rolo <ro...@pythian.com>.
No.

Even if you doubled all the hardware on that node compared to the others,
it would still be a bad idea.
Keep the cluster uniform vnode-wise.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com

On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha <
cogumelosmaravilha@sapo.pt> wrote:

> Hi all,
>
> I need to add a new node to my cluster, but this time the new node will
> have double the disk space of the other nodes.
>
> I'm using the default vnodes (num_tokens: 256). To fully use the disk
> space on the new node, do I just have to configure num_tokens: 512?
>
> Thanks in advance.
