Posted to user@cassandra.apache.org by Jens Rantil <je...@tink.se> on 2015/06/07 23:19:54 UTC

Newly added node getting more data than expected

Hi,

I had a 3-node cluster (256 vnodes each) with RF=3. I mistakenly added a
fourth node with "num_tokens: 1" (that is, one vnode). I've always understood
the number of vnodes to be proportional to the amount of data a node
receives. Therefore, I was expecting the new node to receive something like
1/(1+3*256) of the cluster's data. However, this is not the case:

$ nodetool status mydatacenter
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
UN  X.X.X.2  200.42 GB  256     87.6%             871968c9-1d6b-4f06-ba90-8b3a8d92dcf0  RAC1
UN  X.X.X.3  198.03 GB  256     53.7%             d7cacd89-8613-4de5-8a5e-a2c53c41ea45  RAC1
UN  X.X.X.4  110.57 GB  1       58.7%             55daa807-af49-44c5-9742-fe456df621a1  RAC1
UN  X.X.X.5  199.81 GB  256     100.0%            48cb0782-6c9a-4805-9330-38e192b6b680  RAC1

The new node added is "X.X.X.4". Note that I haven't executed `nodetool
cleanup` on the old nodes yet.
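For reference, the arithmetic behind the expectation can be checked with a short sketch. It uses only the token counts and "Owns (effective)" values from the output above; the variable names are illustrative. It computes the share of tokens held by X.X.X.4 and sums the effective ownership column, which should come to roughly RF x 100% when every node appears in the listing.

```python
from fractions import Fraction

# Token counts per node, copied from the `nodetool status` output above.
tokens = {"X.X.X.2": 256, "X.X.X.3": 256, "X.X.X.4": 1, "X.X.X.5": 256}

# "Owns (effective)" percentages from the same output.
owns_effective = {"X.X.X.2": 87.6, "X.X.X.3": 53.7, "X.X.X.4": 58.7, "X.X.X.5": 100.0}

# Share of all tokens held by the new node: 1 / (1 + 3 * 256).
expected_share = Fraction(tokens["X.X.X.4"], sum(tokens.values()))

# With RF=3 every piece of data is stored three times, so the effective
# ownership column is expected to add up to about 300%.
total_effective = sum(owns_effective.values())

if __name__ == "__main__":
    print(f"token share of X.X.X.4: {expected_share} ({float(expected_share):.4%})")
    print(f"sum of effective ownership: {total_effective:.1f}%")
```

The token share works out to 1/769, while the reported effective ownership of the new node is 58.7%, so the gap is far larger than rounding or pending cleanup on the old nodes would account for.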

Additional information:
 * I am using GossipingPropertyFileSnitch. All nodes are in the same
datacenter and rack.
 * There are no pending compactions on the node.

Could anyone explain to me why my new node is receiving more data than
expected? Does this have to do with the way the GossipingPropertyFileSnitch
decides where to put secondary/tertiary replicas (i.e. always the "next
physical node" in the ring)? Do I also need to execute `nodetool cleanup` on
newly commissioned nodes?
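The "next physical node in ring" idea can be explored with a toy model. The sketch below is not Cassandra code: it assumes uniformly random tokens in a hypothetical 2**64 token space and assumes replicas are placed on the first RF distinct nodes found walking clockwise from each token. The names `simulate_ownership` and `RING_SIZE` are made up for illustration. It reports, per node, the fraction of the ring owned as primary and the fraction held once replicas are included.

```python
import random

RING_SIZE = 2**64  # hypothetical token space; not Cassandra's actual partitioner range


def simulate_ownership(tokens_per_node, rf, seed=0):
    """Return (primary, effective) ownership fractions per node in a toy ring."""
    rng = random.Random(seed)
    # One (token, node) entry per vnode, with tokens drawn uniformly at random.
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node, count in tokens_per_node.items()
        for _ in range(count)
    )
    rf = min(rf, len(tokens_per_node))  # cannot have more replicas than nodes
    primary = {node: 0 for node in tokens_per_node}
    effective = {node: 0 for node in tokens_per_node}
    for i, (token, owner) in enumerate(ring):
        # Width of the range (previous token, token], wrapping around the ring.
        width = (token - ring[i - 1][0]) % RING_SIZE
        primary[owner] += width
        # Walk clockwise and collect the first `rf` distinct nodes as replicas.
        replicas = []
        j = i
        while len(replicas) < rf:
            node = ring[j % len(ring)][1]
            if node not in replicas:
                replicas.append(node)
            j += 1
        for node in replicas:
            effective[node] += width

    def to_fraction(widths):
        return {node: w / RING_SIZE for node, w in widths.items()}

    return to_fraction(primary), to_fraction(effective)


if __name__ == "__main__":
    layout = {"X.X.X.2": 256, "X.X.X.3": 256, "X.X.X.4": 1, "X.X.X.5": 256}
    primary, effective = simulate_ownership(layout, rf=3)
    for node in layout:
        print(f"{node}: primary {primary[node]:.2%}, effective {effective[node]:.2%}")
```

Under these assumptions the single-token node always holds more than its primary range, because it also serves as a replica for the ranges immediately preceding its token. The model is only a rough aid for reasoning about placement and is not expected to reproduce the exact percentages reported by `nodetool status`.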

Thanks,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook <https://www.facebook.com/#!/tink.se> Linkedin
<http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
 Twitter <https://twitter.com/tink>

Re: Newly added node getting more data than expected

Posted by Jens Rantil <je...@tink.se>.
Hi again,

I should also point out that `nodetool ring ...` shows only one entry for
"X.X.X.4", and that its token range is about as large as each of the other
virtual nodes' token ranges.

Let me know if you need any more information from me.

Cheers,
Jens



