Posted to user@cassandra.apache.org by david lee <ie...@gmail.com> on 2011/07/07 11:10:15 UTC

minimum number of machines for RF=3

hi guys,

is there a minimum (recommended) number of machines for RF=3?

i ran a test with
# of nodes = 3
RF = 3
CL.READ = QUORUM
and when 1 node was taken out and brought back in, after which node repair
was run, the TPS dropped significantly.

is this behaviour expected because the RF was higher than the # of live
nodes while 1 node was taken out?

cheers,
david
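
For reference, the quorum arithmetic behind this question can be sketched as below. This is a minimal illustration, not Cassandra code; the function names are made up for the example. It assumes the usual definition QUORUM = floor(RF / 2) + 1, so with RF=3 a QUORUM read needs 2 replicas and should still succeed with 1 of the 3 nodes down.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM operation: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def quorum_available(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas are up to satisfy QUORUM (hypothetical helper)."""
    return live_replicas >= quorum(replication_factor)


rf = 3
for live in (3, 2, 1):
    print(f"RF={rf}, live replicas={live}: QUORUM needs {quorum(rf)}, "
          f"available={quorum_available(rf, live)}")
```

So losing 1 of 3 nodes does not by itself make QUORUM reads fail at RF=3; losing 2 would.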

Re: minimum number of machines for RF=3

Posted by david lee <ie...@gmail.com>.
cheers~

Re: minimum number of machines for RF=3

Posted by Watanabe Maki <wa...@gmail.com>.
This is expected behaviour and is not related to the number of nodes.
After the failed node is brought back, the ring will be busy with Hinted Handoff replay and Read Repair. If you run repair, all 3 of your nodes need to build Merkle trees, compare the hash values, and then transfer the latest data to each other.

You can tune Hinted Handoff and Read Repair to reduce the performance impact of these self-healing activities.

maki
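
The Merkle tree comparison mentioned above can be illustrated with a toy sketch. This is not Cassandra's implementation: the bucketing scheme, the function names, and the in-memory dicts standing in for replicas are all assumptions made for the example. It does show why repair is expensive, since every replica has to read and hash all of its data before anything can be compared.

```python
import hashlib


def leaf_hashes(data: dict, buckets: int) -> list:
    """Hash the rows in each bucket (a stand-in for a token range).

    `buckets` must be a power of two so the tree is complete.
    """
    digests = [hashlib.sha256() for _ in range(buckets)]
    for key in sorted(data):
        # Toy partitioner (assumption): a stable hash of the key picks the bucket.
        idx = int(hashlib.md5(key.encode()).hexdigest(), 16) % buckets
        digests[idx].update(f"{key}={data[key]};".encode())
    return [d.hexdigest() for d in digests]


def build_tree(leaves: list) -> list:
    """Return the tree as a list of levels: leaves first, root last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([
            hashlib.sha256((prev[i] + prev[i + 1]).encode()).hexdigest()
            for i in range(0, len(prev), 2)
        ])
    return levels


def differing_buckets(tree_a: list, tree_b: list) -> list:
    """Walk down from the root, following only subtrees whose hashes differ."""
    suspects = [0]  # node indexes at the current level, starting at the root
    for level in range(len(tree_a) - 1, 0, -1):
        next_suspects = []
        for i in suspects:
            if tree_a[level][i] != tree_b[level][i]:
                next_suspects.extend([2 * i, 2 * i + 1])
        suspects = next_suspects
    return [i for i in suspects if tree_a[0][i] != tree_b[0][i]]


replica_a = {f"row{i}": "v1" for i in range(100)}
replica_b = dict(replica_a)
replica_b["row42"] = "v2"  # a write that replica_a missed while it was down

tree_a = build_tree(leaf_hashes(replica_a, 8))
tree_b = build_tree(leaf_hashes(replica_b, 8))
print("buckets to stream:", differing_buckets(tree_a, tree_b))
```

Only the buckets whose hashes differ need to be transferred, but building the trees still touches every row on every replica.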

Re: minimum number of machines for RF=3

Posted by Peter Schuller <pe...@infidyne.com>.
> is this behaviour expected because the RF was higher than the # of live
> nodes while 1 node was taken out?

Note that repair is not needed just because a node was off for a while.

I am guessing the performance difference was due to the repair. Try
running repair at any time and see if you get the same effect.

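If you are testing with a large data set, repair is probably
responsible for evicting data from the OS page cache: it has to read
through the data on disk, so reads that were being served from memory
start going to disk instead.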

-- 
/ Peter Schuller