Posted to user@cassandra.apache.org by Jeff Jirsa <jj...@apache.org> on 2017/07/27 06:19:58 UTC

Re: 回复: 回复: tolerate how many nodes down in the cluster


On 2017-07-26 19:38 (-0700), "Peng Xiao" <25...@qq.com> wrote: 
> Kurt/All,
> 
> 
> Why should the # of racks be equal to RF?
> 
> For example, we have 2 DCs, each with 6 machines and RF=3, and each machine is virtualized into 8 VMs.
> Can we set up 6 racks with RF=3, i.e. one machine per rack, to guard against hardware errors? Or should we set up only 3 racks, with 2 machines per rack? Which is better?
> 
> 

The guarantee you get from racks is that IF you have at least as many racks as replicas, you won't have 2 replicas on the same rack. There's no requirement that # of racks >= # of replicas; you just leave yourself exposed to losing quorum if a rack fails while # racks < # replicas.

Yes, with a rack == a hypervisor, the snitch would avoid placing 2 replicas on the same physical machine, and would protect you against hardware errors. There's nothing to gain from having 3 racks instead of 6 in that case (in fact 6 is probably better, as you're less likely to have to skip a duplicate rack in getNaturalEndpoints()).
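
To make the placement behaviour above concrete, here is a minimal sketch of rack-aware replica selection. It is not Cassandra's actual code: it assumes a single DC, one token per node (no vnodes), one node per hypervisor instead of 8 VMs, and made-up host and rack names. The helper names place_replicas and max_replicas_on_one_rack are hypothetical.

```python
from collections import Counter

def place_replicas(ring, rack_of, start, rf):
    """Pick up to `rf` replicas walking clockwise from ring[start].

    Simplified model: one token per node, a single DC. A node whose rack
    already holds a replica is deferred until every rack has been used once.
    """
    all_racks = {rack_of[n] for n in ring}
    replicas, seen, skipped = [], set(), []
    for i in range(len(ring)):
        if len(replicas) >= rf:
            break
        node = ring[(start + i) % len(ring)]
        rack = rack_of[node]
        if rack in seen and seen != all_racks:
            skipped.append(node)  # duplicate rack: defer this node
            continue
        replicas.append(node)
        seen.add(rack)
        if seen == all_racks:
            # Every rack is used; duplicates can no longer be avoided.
            while skipped and len(replicas) < rf:
                replicas.append(skipped.pop(0))
    return replicas

def max_replicas_on_one_rack(ring, rack_of, rf):
    """Worst case over all ranges: most replicas sharing a single rack."""
    worst = 0
    for start in range(len(ring)):
        chosen = place_replicas(ring, rack_of, start, rf)
        counts = Counter(rack_of[n] for n in chosen)
        worst = max(worst, max(counts.values()))
    return worst

ring = [f"vmhost{i}" for i in range(6)]  # hypothetical host names
six_racks = {n: f"rack{i}" for i, n in enumerate(ring)}         # 1 host per rack
three_racks = {n: f"rack{i // 2}" for i, n in enumerate(ring)}  # 2 hosts per rack
two_racks = {n: f"rack{i % 2}" for i, n in enumerate(ring)}     # fewer racks than RF

for name, layout in [("6 racks", six_racks),
                     ("3 racks", three_racks),
                     ("2 racks", two_racks)]:
    print(name, max_replicas_on_one_rack(ring, layout, rf=3))
```

Under these assumptions, the 6-rack and 3-rack layouts never put two replicas of a range on the same rack, while the 2-rack layout has to double up, so losing that rack costs quorum for the affected ranges.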

All of this said:

BE REALLY CAREFUL WHEN USING RACKS. 

If you start with # of racks < RF and then try to add another rack, you will probably be very unhappy: the first node you add in the new rack will take 1/RF of the ring instantly, which usually crashes everything. For that reason, a lot of people advise against using racks unless you have >= RF racks or you REALLY know what you're doing.
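
A rough way to see why the first node in a new rack gets hammered is to simulate it. The sketch below is a toy model, not Cassandra's implementation: it assumes one token per node, equal-sized ranges, a single DC, and hypothetical node and rack names. It compares starting with 2 racks against starting with 3 racks, both at RF=3.

```python
def place_replicas(ring, rack_of, start, rf):
    """Simplified rack-aware placement: prefer nodes on racks not yet used."""
    all_racks = {rack_of[n] for n in ring}
    replicas, seen, skipped = [], set(), []
    for i in range(len(ring)):
        if len(replicas) >= rf:
            break
        node = ring[(start + i) % len(ring)]
        rack = rack_of[node]
        if rack in seen and seen != all_racks:
            skipped.append(node)  # duplicate rack: defer this node
            continue
        replicas.append(node)
        seen.add(rack)
        if seen == all_racks:
            # Every rack is used; fill remaining slots from deferred nodes.
            while skipped and len(replicas) < rf:
                replicas.append(skipped.pop(0))
    return replicas

def share_of_ranges(node, ring, rack_of, rf):
    """Fraction of ranges (one per ring position) that `node` replicates."""
    hits = sum(node in place_replicas(ring, rack_of, s, rf)
               for s in range(len(ring)))
    return hits / len(ring)

def add_first_node_of_new_rack(ring, rack_of, rf):
    """Add one node in a brand-new rack and report its share of ranges."""
    new_ring = ring + ["new"]
    new_racks = dict(rack_of, new="new_rack")
    return share_of_ranges("new", new_ring, new_racks, rf)

RF = 3
# Under-racked start: 2 racks for RF=3.
ring2 = ["a0", "b0", "a1", "b1", "a2", "b2"]
racks2 = {n: "rack_" + n[0] for n in ring2}
# Adequately racked start: 3 racks for RF=3.
ring3 = ["a0", "b0", "c0", "a1", "b1", "c1"]
racks3 = {n: "rack_" + n[0] for n in ring3}

under = add_first_node_of_new_rack(ring2, racks2, RF)
ok = add_first_node_of_new_rack(ring3, racks3, RF)
print(f"start with 2 racks: new node replicates {under:.0%} of ranges")
print(f"start with 3 racks: new node replicates {ok:.0%} of ranges")
```

In this toy model the new node becomes a replica for every range when the cluster started with fewer racks than RF, i.e. it has to hold one full copy of the data on its own. When the cluster started with 3 racks it only picks up the ranges near its own position.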



Re: 回复: 回复: tolerate how many nodes down in the cluster

Posted by kurt greaves <ku...@instaclustr.com>.
Note that if you use more racks than RF you lose some of the operational
benefit. For example, you'll still only be able to take out one rack at a
time (especially if using vnodes), despite having more racks than RF. As
Jeff said, this may be desirable, but really it comes down to what your
physical failure domains are and how/if you plan to scale.
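
The "one rack at a time" point can be checked with a small sketch. It assumes quorum = floor(RF/2) + 1, a single DC, one token per node, and hypothetical node names. The function worst_case_racks_down is a made-up helper that returns the largest k such that ANY choice of k racks can be down while every range keeps quorum.

```python
from itertools import combinations

def place_replicas(ring, rack_of, start, rf):
    """Simplified rack-aware placement: prefer nodes on racks not yet used."""
    all_racks = {rack_of[n] for n in ring}
    replicas, seen, skipped = [], set(), []
    for i in range(len(ring)):
        if len(replicas) >= rf:
            break
        node = ring[(start + i) % len(ring)]
        rack = rack_of[node]
        if rack in seen and seen != all_racks:
            skipped.append(node)  # duplicate rack: defer this node
            continue
        replicas.append(node)
        seen.add(rack)
        if seen == all_racks:
            while skipped and len(replicas) < rf:
                replicas.append(skipped.pop(0))
    return replicas

def worst_case_racks_down(ring, rack_of, rf):
    """Largest k such that losing any k racks still leaves quorum everywhere."""
    quorum = rf // 2 + 1
    racks = sorted(set(rack_of.values()))
    replica_sets = [place_replicas(ring, rack_of, s, rf)
                    for s in range(len(ring))]
    for k in range(1, len(racks) + 1):
        for down in combinations(racks, k):
            for replicas in replica_sets:
                alive = [n for n in replicas if rack_of[n] not in down]
                if len(alive) < quorum:
                    return k - 1  # some choice of k racks breaks quorum
    return len(racks)

ring = [f"node{i}" for i in range(6)]  # hypothetical node names
six_racks = {n: f"rack{i}" for i, n in enumerate(ring)}         # racks > RF
three_racks = {n: f"rack{i // 2}" for i, n in enumerate(ring)}  # racks == RF

for name, layout in [("6 racks", six_racks), ("3 racks", three_racks)]:
    k = worst_case_racks_down(ring, layout, rf=3)
    print(f"{name}: safe to lose any {k} rack(s) at RF=3")
```

Both layouts come out at one rack in this model, but with 3 racks that one rack is two nodes, whereas with 6 racks it is a single node. That is the operational benefit you give up by going above RF.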

As Jeff said, as long as you don't start with # racks < RF you should be
fine.