Posted to user@cassandra.apache.org by Eric Parusel <er...@gmail.com> on 2012/12/05 17:45:27 UTC

Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Hi all,

I've been wondering about virtual nodes and how cluster uptime might change
as cluster size increases.

I understand clusters will benefit from increased reliability due to faster
rebuild time, but does that hold true for large clusters?

It seems that, since (and correct me if I'm wrong here) every physical node
will likely share some small amount of data with every other node, as the
count of physical nodes in a Cassandra cluster increases (let's say into
the triple digits), the probability of at least one failed Quorum
read/write occurring in a given time period would *increase*.

Would this hold true, at least until the number of physical nodes becomes
greater than num_tokens per node?

I understand that the window of failure for affected ranges would probably
be small but we do Quorum reads of many keys, so we'd likely hit every
virtual range with our queries, even if num_tokens was 256.
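
To make the worry concrete, here's the back-of-the-envelope I have in mind
(just a sketch: it assumes independent failures at some per-node down
probability p, and that with vnodes any 2 simultaneously-down nodes are
co-replicas of at least one range at RF=3):

    def p_at_least_two_down(n, p):
        # P(>= 2 of n nodes down at once), assuming independent failures
        p_none = (1 - p) ** n
        p_one = n * p * (1 - p) ** (n - 1)
        return 1 - p_none - p_one

    # With p = 1%: roughly 0.006 at 12 nodes, 0.08 at 48, 0.42 at 144
    for n in (12, 48, 144, 500):
        print(n, p_at_least_two_down(n, 0.01))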

Thanks,
Eric

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Richard Low <rl...@acunu.com>.
Eric,

You're right that the probability of 2 node failures increases to 1 as the
node count increases to infinity.  But that's true with or without virtual
nodes.  If you want constant uptime while scaling your cluster, you have to
increase the replication factor.
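
To put rough numbers on that (same independence caveat as before, and
treating "k nodes down somewhere" as a proxy for "k co-replicas of some
shared range down", which is approximately right with vnodes): a QUORUM
operation on a range only fails once floor(RF/2) + 1 of its replicas are
down, so each bump in RF demands one more simultaneous failure.

    from math import comb

    def p_at_least_k_down(n, p, k):
        # binomial tail: P(>= k of n nodes down), independent failures
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(k, n + 1))

    # 144 nodes at p = 1%: roughly 0.42 for RF=3, 0.17 for RF=5, 0.06 for RF=7
    for rf in (3, 5, 7):
        k = rf // 2 + 1          # simultaneous failures that break QUORUM
        print(rf, p_at_least_k_down(144, 0.01, k))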

Richard.


-- 
Richard Low
Acunu | http://www.acunu.com | @acunu

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by aaron morton <aa...@thelastpickle.com>.
> Is it possible to configure or write a snitch that would create separate distribution zones within the cluster?  (e.g. 144 nodes in cluster, split into 12 zones.  Data stored to node 1 could only be replicated to one of 11 other nodes in the same distribution zone).
This is kind of what NTS does if you have nodes in different racks. 

A replica is placed in each rack, and the process wraps around and continues until RF replicas are located. If the number of racks is not equal to the RF, you then get some unevenness (hey, what do you know, that's a real word :) )
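
In pseudo-Python, the placement loop looks something like this (a
simplified sketch of the behaviour described above, not the actual NTS
code):

    def place_replicas(ring, rf):
        # ring: (node, rack) pairs in token order, starting at the key's token
        replicas, seen_racks = [], set()
        for node, rack in ring:           # one replica per distinct rack first
            if rack not in seen_racks:
                replicas.append(node)
                seen_racks.add(rack)
                if len(replicas) == rf:
                    return replicas
        for node, rack in ring:           # wrap around to fill the rest
            if node not in replicas:
                replicas.append(node)
                if len(replicas) == rf:
                    break
        return replicas

    # 4 nodes in 2 racks with RF=3: the unevenness shows up as two
    # replicas landing in rack r1
    ring = [("n1", "r1"), ("n2", "r2"), ("n3", "r1"), ("n4", "r2")]
    print(place_replicas(ring, 3))        # ['n1', 'n2', 'n3']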

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com



Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Eric Parusel <er...@gmail.com>.
Ok, thanks Richard.  That's good to hear.

However, I still contend that as node count increases to infinity, the
probability of there being at least two node failures in the cluster at any
time would increase to 100%.

I think of this as somewhat analogous to RAID -- I would not be comfortable
with a 144+ disk RAID 6 array, no matter the rebuild speed :)

Is it possible to configure or write a snitch that would create separate
distribution zones within the cluster?  (e.g. 144 nodes in cluster, split
into 12 zones.  Data stored to node 1 could only be replicated to one of 11
other nodes in the same distribution zone).
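
Rough numbers behind the question (union bounds with independent failures
at p = 1% per node, RF=3, so any co-replica pair being down breaks quorum
somewhere; the figures are illustrative only):

    from math import comb

    p = 0.01
    any_pair = comb(144, 2) * p**2        # vnodes: any pair may share a range
    zoned = 12 * comb(12, 2) * p**2       # only pairs within a 12-node zone
    print(any_pair, zoned)                # ~1.03 (bound saturates) vs ~0.08

So zoning would cut the number of dangerous pairs by a factor of 13.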



Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Richard Low <rl...@acunu.com>.
Hi Eric,

The time to rebuild one node is limited by that node, but the recovery
time that matters most is just the time to replicate the data that is
missing from that node.  This is the removetoken operation (called
removenode in 1.2), and it gets faster the more nodes you have.

Richard.




-- 
Richard Low
Acunu | http://www.acunu.com | @acunu

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Eric Parusel <er...@gmail.com>.
Thanks for your thoughts guys.

I agree that with vnodes total downtime is lessened.  It also seems,
though, that the total number of outages (however brief each one is) would
be greater.

But I think downtime is only lessened up to a certain cluster size.

I'm thinking that as the cluster continues to grow:
  - node rebuild time will max out (a single node only has so much write
bandwidth)
  - the probability of 2 nodes being down at any given time will continue
to increase -- even if you consider only non-correlated failures.

Therefore, when adding nodes beyond the point where node rebuild time maxes
out, both the total number of outages *and* overall downtime would increase?
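
By "max out" I mean something like the following (invented bandwidth
figures, purely to show the shape):

    DATA_PER_NODE_GB = 1000.0
    NODE_INGEST_GBPS = 0.5     # replacement node's max write rate, GB/s
    PEER_STREAM_GBPS = 0.05    # throttled stream rate per source peer, GB/s

    def rebuild_hours(n):
        # streaming from n-1 peers helps only until the replacement
        # node's own write bandwidth becomes the bottleneck
        rate_gbps = min(NODE_INGEST_GBPS, PEER_STREAM_GBPS * (n - 1))
        return DATA_PER_NODE_GB / rate_gbps / 3600

    for n in (4, 8, 11, 50, 144):
        print(n, round(rebuild_hours(n), 2))   # flat once n >= 11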

Thanks,
Eric





Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Edward Capriolo <ed...@gmail.com>.
Assume you need to work with quorum in a non-vnode scenario. If 2 nodes in
a row in the ring are down, some number of quorum operations will fail with
UnavailableException (TimeoutException right after the failures). This is
because for a given range of tokens quorum will be impossible, while quorum
will still be possible for others.

In a vnode world, if any two nodes are down, then the intersection of the
vnode token ranges they share is unavailable.

I think it is two sides of the same coin.
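
Putting crude numbers on the two sides of that coin (RF=3; a rough
approximation assuming random token placement, not exact Cassandra
behaviour):

    def p_pair_shares_a_range(n, rf=3, num_tokens=1):
        # chance that two specific down nodes are co-replicas of at
        # least one token range
        per_range = min(1.0, 2 * (rf - 1) / (n - 1))   # near-neighbour odds
        # treat each of the num_tokens ranges as an independent placement
        return 1 - (1 - per_range) ** num_tokens

    n = 144
    print(p_pair_shares_a_range(n, num_tokens=1))    # ~0.03: rare, big slice
    print(p_pair_shares_a_range(n, num_tokens=256))  # ~0.999: near-sure, tiny slices

So without vnodes a random down-pair almost never breaks quorum, but takes
out a large contiguous chunk when it does; with vnodes it almost always
breaks quorum for some ranges, each of them tiny.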



Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Richard Low <rl...@acunu.com>.
Hi Tyler,

You're right, the math does assume independence, which is unlikely to be
accurate.  But if you do have correlated failure modes, e.g. shared power,
racks, DCs, etc., then you can still use Cassandra's rack-aware or DC-aware
features to ensure replicas are spread around so your cluster can survive
the correlated failure mode.  So I would expect vnodes to improve uptime in
all scenarios, but I haven't done the math to prove it.

Richard.




-- 
Richard Low
Acunu | http://www.acunu.com | @acunu

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Tyler Hobbs <ty...@datastax.com>.
Nicolas,

Strictly speaking, your math makes the assumption that the failures of
different nodes are probabilistically independent events. This is, of
course, not an accurate assumption for real-world conditions.  Nodes share
racks, networking equipment, power, availability zones, data centers, etc.
So, I think the mathematical assertion is not quite as strong as one would
like, but it's certainly a good argument for handling certain types of node
failures.




-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Nicolas Favre-Felix <ni...@acunu.com>.
Hi Eric,

Your concerns are perfectly valid.

We (Acunu) led the design and implementation of this feature and spent a
long time looking at the impact of such a large change.
We summarized some of our notes and wrote about the impact of virtual nodes
on cluster uptime a few months back:
http://www.acunu.com/2/post/2012/10/improving-cassandras-uptime-with-virtual-nodes.html
The main argument in this blog post is that you only have a failure to
perform quorum reads/writes if a quorum of the RF replicas fail within the
time it takes to rebuild the first dead node. We show that virtual nodes
actually decrease the probability of failure, by streaming data from all
nodes and thereby improving the rebuild time.
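
As a toy version of that argument (my own simplification with invented
rates, not the exact math from the post): the first failure arrives at rate
n*lam, and an outage then needs enough further failures among co-replica
nodes within the rebuild window, which vnodes shrink by streaming from all
n-1 peers.

    from math import comb

    def outage_rate(n, lam, t0, rf=3, vnodes=True):
        # lam: per-node failure rate; t0: time for one node to stream
        # a full replacement on its own
        if vnodes:
            window = t0 / (n - 1)    # rebuild pulls from all peers
            peers = n - 1            # any peer shares some range
        else:
            window = t0              # a single replica streams everything
            peers = 2 * (rf - 1)     # only nearby ring neighbours share ranges
        k = rf // 2                  # further failures that break quorum
        return n * lam * comb(peers, k) * (lam * window) ** k

    # e.g. one failure per 1000 h per node, 10 h single-node rebuild:
    for n in (50, 144, 500):
        print(n, outage_rate(n, 1e-3, 10, vnodes=False),
              outage_rate(n, 1e-3, 10, vnodes=True))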

Regards,

Nicolas



Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

Posted by Edward Capriolo <ed...@gmail.com>.
Good point.  Hadoop sprays its blocks around randomly, so if "replication
factor" many nodes are down, some blocks cannot be found. The larger the
cluster, the higher the chance that some nodes are down.

To deal with this, increase RF once the cluster gets to be very large.

