Posted to user@cassandra.apache.org by Jinhua Luo <lu...@gmail.com> on 2018/04/11 10:34:55 UTC

does c* 3.0 use one ring for all datacenters?

Hi All,

I know this may seem like a stupid question, but I am really confused by
the documents on the internet about this topic, especially since the
answer seems to differ depending on whether c* uses vnodes.

Let's assume the token range is 1-100 for the whole cluster. How is
it distributed across the datacenters? Given that the number of
datacenters in a cluster is dynamic, if there is only one ring, would
the token range on each node change when I add a new datacenter
to the cluster? Would that then involve data migration? That doesn't
make sense to me.

Looking forward to clarification for c* 3.0, thanks!

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

Yes, I understand what you said. What I still don't understand is how,
with one ring, token ranges get distributed across DCs, especially when
the list of DCs is dynamic.

2018-04-11 22:15 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
> I am 100% sure it's one ring :)
>
> When a partition key is hashed, it lands somewhere on the ring.  NTS walks
> the ring forward to find replicas in each DC and hopefully in different
> racks.
>
> There's no re-balancing of tokens when adding a new DC, it's not necessary.
> The ring is enormous.  Each DC has nodes which have tokens throughout the
> ring.
>
> When configuring a DC, you say you want X replicas per DC.  This involves
> copying a lot of data between DCs but does not change the replicas for
> existing DCs and doesn't involve data movement within them.
>
> On Wed, Apr 11, 2018 at 6:27 AM Jinhua Luo <lu...@gmail.com> wrote:
>>
>> Is it a different answer? One ring?
>>
>> Could you explain your answer according to my example?
>>
>> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
>> > There has always been a single ring.
>> >
>> > You can specify how many replicas you want in each DC and it’ll figure
>> > out how to do it as long as you have the right snitch and are using
>> > NetworkTopologyStrategy.
>> >
>> >
>> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com> wrote:
>> >>
>> >> Let me clarify my question:
>> >>
>> >> Given we have a cluster of two DCs, each DC has 2 nodes, and each node
>> >> sets num_tokens to 50.
>> >> Then how are token ranges distributed in the cluster?
>> >>
>> >> If there is one global ring, then it may be (to simplify the case, let's
>> >> assume vnodes=1):
>> >> {dc1, node1} 1-50
>> >> {dc2, node1} 51-100
>> >> {dc1, node1} 101-150
>> >> {dc1, node2} 151-200
>> >>
>> >> But this raises more questions:
>> >> a) What if I add a new datacenter? Do the token ranges then need to be
>> >> re-balanced?
>> >> If so, what about the data associated with the ranges being re-balanced?
>> >> Is it moved among DCs?
>> >> But that doesn't make sense, because each keyspace would specify its
>> >> snitch and fix the DCs that store its data.
>> >>
>> >> b) There seems to be no benefit from sharing one ring, because of the
>> >> snitch.
>> >>
>> >> If each DC has its own ring, then it may be:
>> >> {dc1, node1} 1-50
>> >> {dc1, node1} 51-100
>> >> {dc2, node1} 1-50
>> >> {dc2, node1} 51-100
>> >>
>> >> I think this is not a trivial question, because each key is
>> >> hashed to determine the token it belongs to, and
>> >> the token range distribution in turn determines which node the key
>> >> belongs to.
>> >>
>> >> Any official answer?
>> >>
>> >>
>> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>> >> <ja...@genesys.com>:
>> >> > Maybe I misunderstood something, but from what I understand, each DC
>> >> > has the same ring (0-100 in your example) but it's split differently
>> >> > between nodes in each DC. I think it's the same principle whether
>> >> > using vnodes or not.
>> >> >
>> >> > I think the confusion comes from the fact that the ring range is the
>> >> > same (0-100) but each DC manages it differently because the nodes are
>> >> > different.
>> >> >
>> >> > --
>> >> > Jacques-Henri Berthemet
>> >> >
>> >> > -----Original Message-----
>> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> > Sent: Wednesday, April 11, 2018 2:26 PM
>> >> > To: user@cassandra.apache.org
>> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
>> >> >
>> >> > Thanks for your reply. I also think separate rings are more
>> >> > reasonable.
>> >> >
>> >> > So one ring per DC is only for c* 1.x or 2.x without vnodes?
>> >> >
>> >> > Check these references:
>> >> >
>> >> >
>> >> >
>> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
>> >> >
>> >> >
>> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>> >> >
>> >> > Even the official Riak post says c* splits the ring across DCs:
>> >> >
>> >> >
>> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>> >> >
>> >> > Why do they say each DC has its own ring?
>> >> >
>> >> >
>> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>> >> > <ja...@genesys.com>:
>> >> >> Hi,
>> >> >>
>> >> >> Each DC has the whole ring, each DC contains a copy of the same
>> >> >> data.
>> >> >> When you add replication to a new DC, all data is copied to the new
>> >> >> DC.
>> >> >>
>> >> >> Within a DC, each range of tokens is 'owned' by a (primary) node (and
>> >> >> replicas if you have RF > 1). If you add/remove a node in a DC,
>> >> >> tokens will
>> >> >> be rearranged between all nodes within the DC only, the other DCs
>> >> >> won't be
>> >> >> affected.
>> >> >>
>> >> >> --
>> >> >> Jacques-Henri Berthemet
>

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jonathan Haddad <jo...@jonhaddad.com>.

I am 100% sure it's one ring :)

When a partition key is hashed, it lands somewhere on the ring.  NTS walks
the ring forward to find replicas in each DC and hopefully in different
racks.

There's no re-balancing of tokens when adding a new DC, it's not
necessary.  The ring is enormous.  Each DC has nodes which have tokens
throughout the ring.

When configuring a DC, you say you want X replicas per DC.  This involves
copying a lot of data between DCs but does not change the replicas for
existing DCs and doesn't involve data movement within them.
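
To make the ring walk described above concrete, here is a minimal sketch in
Python. It is a toy model, not Cassandra's code: the function names are made
up for illustration, every node is assumed to own exactly one token, racks
are ignored, and the token lists are the small example values that appear in
the quoted messages of this thread (the real ring is far larger).

```python
from bisect import bisect_left

# Token lists taken from the example in this thread; one token per node.
DC1 = [0, 10, 20, 30, 40, 50, 60, 70, 80]
DC2 = [1, 11, 21, 31, 41, 51, 61, 71, 81]
DC3 = [4, 17, 22, 36, 48, 53, 64, 73, 83]


def build_ring(dcs):
    """Merge every DC's tokens into one sorted list of (token, dc) pairs."""
    return sorted((token, dc) for dc, tokens in dcs.items() for token in tokens)


def walk(ring, key_token):
    """Yield ring entries clockwise, starting at the first token >= key_token."""
    start = bisect_left(ring, (key_token,))
    for i in range(len(ring)):
        yield ring[(start + i) % len(ring)]


def simple_strategy(ring, key_token, rf):
    """Pick the next rf tokens on the ring, ignoring datacenters."""
    return [token for token, _dc in walk(ring, key_token)][:rf]


def network_topology(ring, key_token, rf_per_dc):
    """Pick the next rf tokens in each DC separately (racks are ignored)."""
    replicas = {dc: [] for dc in rf_per_dc}
    for token, dc in walk(ring, key_token):
        if dc in replicas and len(replicas[dc]) < rf_per_dc[dc]:
            replicas[dc].append(token)
    return replicas


two_dcs = build_ring({"DC1": DC1, "DC2": DC2})
three_dcs = build_ring({"DC1": DC1, "DC2": DC2, "DC3": DC3})

before = network_topology(two_dcs, 5, {"DC1": 3, "DC2": 3})
after = network_topology(three_dcs, 5, {"DC1": 3, "DC2": 3, "DC3": 3})

print(simple_strategy(two_dcs, 5, 3))  # [10, 11, 20]
print(before)                          # {'DC1': [10, 20, 30], 'DC2': [11, 21, 31]}
print(after["DC3"])                    # [17, 22, 36]
# Adding DC3 leaves the replicas chosen in DC1 and DC2 untouched.
print(all(after[dc] == before[dc] for dc in before))  # True
```

In this model, adding DC3 only inserts new tokens into the same ring. The
replicas picked inside DC1 and DC2 for token 5 stay the same, which matches
the point that no tokens are re-balanced in the existing DCs.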

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jeff Jirsa <jj...@gmail.com>.

If you tried this, it’d probably fail in an unpleasant way.

Tokens never move automatically. We add new tokens. Administrators can move tokens. Cassandra doesn’t auto-move tokens.

-- 
Jeff Jirsa
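
As a rough illustration of why two DCs with identical token sets cannot simply
be joined, the sketch below models the "no duplicate tokens" rule in Python.
It is an assumption-laden toy: `join_rings` is a hypothetical helper, not a
Cassandra API, and it says nothing about how a real cluster actually fails.

```python
def join_rings(dcs):
    """Combine per-DC token sets into one ring, refusing any duplicate token.

    dcs maps a datacenter name to the tokens its nodes picked.
    Returns a sorted list of (token, dc) pairs.
    """
    owner = {}
    for dc, tokens in dcs.items():
        for token in tokens:
            if token in owner:
                raise ValueError(
                    f"token {token} is already owned by {owner[token]}; "
                    f"{dc} cannot claim it"
                )
            owner[token] = dc
    return sorted(owner.items())


# Both DCs picked the same tokens independently: the join is refused.
same = {"DC1": [0, 20, 40, 60, 80], "DC2": [0, 20, 40, 60, 80]}
try:
    join_rings(same)
    collided = False
except ValueError as err:
    collided = True
    print(err)

# Shifting DC2 by one (the "DC2 = DC1 + 1" idea from the thread) avoids it.
offset = {"DC1": [0, 20, 40, 60, 80], "DC2": [1, 21, 41, 61, 81]}
ring = join_rings(offset)
print(ring[:4])
```

Nothing in this model moves an existing token. A collision is an error to be
fixed by the administrator, which is consistent with the statement that
Cassandra does not auto-move tokens.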


> On Apr 28, 2018, at 3:05 AM, Jinhua Luo <lu...@gmail.com> wrote:
> 
> Suppose two DCs are set up separately before they are joined to each other.
> Given the total token range is 100 and each DC has the same number of
> tokens, let's say 5, then because they assign tokens independently, they
> will have the same tokens, right?
> For example,
> DC1 = {0, 20, 40, 60, 80}
> DC2 = {0, 20, 40, 60, 80}
> 
> Then, when the DCs meet each other, the two rings should be merged into one, right?
> Here are the questions:
> a) Who does the merge?
> b) Do the tokens change after the merge?
> 
> 
> 
> 2018-04-27 1:51 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> 
>> 
>>> On Thu, Apr 26, 2018 at 1:34 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>> 
>>> How do you guarantee that tokens are independent between DCs?
>>
>>
>> Cassandra won't let you have duplicate tokens - it won't start if you do it by
>> mistake, and it won't do it automatically.
>>
>>>
>>> They form one
>>> ring, and they must be (re-)assigned when needed.
>>
>>
>> Tokens don't move automatically. There's no auto-reassignment. You can move a
>> token, but nothing does it automatically.
>>
>>>
>>> Use an offset per DC? But then it seems the DC list must be fixed in
>>> advance?
>>> To make sure the tokens are evenly distributed around the ring among the
>>> DC(s), is there any chance the tokens owned by each DC will change?
>>> Could you please give a detailed token re-balancing procedure for the
>>> case of adding or removing a node?
>> 
>> 
>> Calculate final state. Run repair and cleanup. Move tokens as needed. If
>> you're not able to reason through this, you may want to consider using
>> vnodes so it becomes less of an issue.
>> 
>>> 
>>> 
>>> 2018-04-26 16:23 GMT+08:00 Xiaolong Jiang <xi...@gmail.com>:
>>>> DCs are independent of each other. Adding nodes to DC1 won't affect the
>>>> tokens owned by other DCs.
>>>>
>>>>> On Thu, Apr 26, 2018 at 1:04 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>>>>
>>>>> You're assuming each DC has the same total num_tokens, right?
>>>>> If I add a new node to DC1, will it change the tokens owned by DC2
>>>>> and DC3?
>>>>> 
>>>>> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>>>>>> When you add DC3, its nodes will get tokens (that aren't currently in
>>>>>> use in any existing DC). You can either assign tokens yourself (let's
>>>>>> pretend we manually assigned the other ones, since DC2 = DC1 + 1), or
>>>>>> Cassandra can auto-calculate them, the exact behavior of which varies
>>>>>> by version.
>>>>>>
>>>>>>
>>>>>> Let's pretend it's old-style random assignment, and we end up with DC3
>>>>>> having 4, 17, 22, 36, 48, 53, 64, 73, 83
>>>>>>
>>>>>> In this case:
>>>>>>
>>>>>> If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>>>>>> on the hosts with tokens 10, 11, 17
>>>>>> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>>>>>> would be placed on the hosts with tokens 10, 20, 30; 11, 21, 31; 17, 22, 36
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> What if I add a new DC3?
>>>>>>> Would the token ranges be reshuffled among DC1, DC2, and DC3?
>>>>>>> 
>>>>>>> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>>>>>>>> Confirming again that it's definitely one ring.
>>>>>>>> 
>>>>>>>> DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>>>>>>>> DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>>>>>>>> 
>>>>>>>> If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>>>>>>>> on the hosts with tokens 10, 11, 20
>>>>>>>> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>>>>>>>> would be placed on the hosts with tokens 10, 20, 30 and 11, 21, 31
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> Xiaolong Jiang
>>>> 
>>>> Software Engineer at Apple
>>>> Columbia University

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

Suppose two DCs are set up separately before they are joined to each other.
Given the total token range is 100 and each DC has the same number of
tokens, let's say 5, then because they assign tokens independently, they
will have the same tokens, right?
For example,
DC1 = {0, 20, 40, 60, 80}
DC2 = {0, 20, 40, 60, 80}

Then, when the DCs meet each other, the two rings should be merged into one, right?
Here are the questions:
a) Who does the merge?
b) Do the tokens change after the merge?



2018-04-27 1:51 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>
>
> On Thu, Apr 26, 2018 at 1:34 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>
>> How do you guarantee that tokens are independent between DCs?
>
>
> Cassandra won't let you have duplicate tokens - it won't start if you do it by
> mistake, and it won't do it automatically.
>
>>
>> They form one
>> ring, and they must be (re-)assigned when needed.
>
>
> Tokens don't move automatically. There's no auto-reassignment. You can move a
> token, but nothing does it automatically.
>
>>
>> Use an offset per DC? But then it seems the DC list must be fixed in
>> advance?
>> To make sure the tokens are evenly distributed around the ring among the
>> DC(s), is there any chance the tokens owned by each DC will change?
>> Could you please give a detailed token re-balancing procedure for the
>> case of adding or removing a node?
>
>
> Calculate final state. Run repair and cleanup. Move tokens as needed. If
> you're not able to reason through this, you may want to consider using
> vnodes so it becomes less of an issue.
>
>>
>>
>> 2018-04-26 16:23 GMT+08:00 Xiaolong Jiang <xi...@gmail.com>:
>> > DCs are independent of each other. Adding nodes to DC1 won't affect the
>> > tokens owned by other DCs.
>> >
>> > On Thu, Apr 26, 2018 at 1:04 AM, Jinhua Luo <lu...@gmail.com> wrote:
>> >>
>> >> You're assuming each DC has the same total num_tokens, right?
>> >> If I add a new node to DC1, will it change the tokens owned by DC2
>> >> and DC3?
>> >>
>> >> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> >> > When you add DC3, its nodes will get tokens (that aren't currently in
>> >> > use in any existing DC). You can either assign tokens yourself (let's
>> >> > pretend we manually assigned the other ones, since DC2 = DC1 + 1), or
>> >> > Cassandra can auto-calculate them, the exact behavior of which varies
>> >> > by version.
>> >> >
>> >> >
>> >> > Let's pretend it's old-style random assignment, and we end up with DC3
>> >> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
>> >> >
>> >> > In this case:
>> >> >
>> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>> >> > on the hosts with tokens 10, 11, 17
>> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> >> > would be placed on the hosts with tokens 10, 20, 30; 11, 21, 31; 17, 22, 36
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> What if I add a new DC3?
>> >> >> Would the token ranges be reshuffled among DC1, DC2, and DC3?
>> >> >>
>> >> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> >> >> > Confirming again that it's definitely one ring.
>> >> >> >
>> >> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>> >> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>> >> >> >
>> >> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be
>> >> >> > placed
>> >> >> > on
>> >> >> > the hosts with token 10, 11, 20
>> >> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with
>> >> >> > token
>> >> >> > 5
>> >> >> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Is it a different answer? One ring?
>> >> >> >>
>> >> >> >> Could you explain your answer according to my example?
>> >> >> >>
>> >> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
>> >> >> >> > There has always been a single ring.
>> >> >> >> >
>> >> >> >> > You can specify how many nodes in each DC you want and it’ll
>> >> >> >> > figure
>> >> >> >> > out
>> >> >> >> > how
>> >> >> >> > to do it as long as you have the right snitch and are using
>> >> >> >> > NetworkToploogyStrategy.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo
>> >> >> >> > <lu...@gmail.com>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> Let me clarify my question:
>> >> >> >> >>
>> >> >> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, each
>> >> >> >> >> node
>> >> >> >> >> sets num_token as 50.
>> >> >> >> >> Then how are token ranges distributed in the cluster?
>> >> >> >> >>
>> >> >> >> >> If there is one global ring, then it may be (To simply the
>> >> >> >> >> case,
>> >> >> >> >> let's
>> >> >> >> >> assume vnodes=1):
>> >> >> >> >> {dc1, node1} 1-50
>> >> >> >> >> {dc2, node1} 51-100
>> >> >> >> >> {dc1, node1} 101-150
>> >> >> >> >> {dc1, node2} 151-200
>> >> >> >> >>
>> >> >> >> >> But here comes more questions:
>> >> >> >> >> a) what if I add a new datacenter? Then the token ranges need
>> >> >> >> >> to
>> >> >> >> >> be
>> >> >> >> >> re-balanced?
>> >> >> >> >> If so, what about the data associated with the ranges to be
>> >> >> >> >> balanced?
>> >> >> >> >> move them among DCs?
>> >> >> >> >> But that doesn't make sense, because each keyspace would
>> >> >> >> >> specify
>> >> >> >> >> its
>> >> >> >> >> snith and fix the DCs to store then.
>> >> >> >> >>
>> >> >> >> >> b) It seems no benefits from same ring, because of the snith.
>> >> >> >> >>
>> >> >> >> >> If each DC has own ring, then it may be:
>> >> >> >> >> {dc1, node1} 1-50
>> >> >> >> >> {dc1, node1} 51-100
>> >> >> >> >> {dc2, node1} 1-50
>> >> >> >> >> {dc2, node1} 51-100
>> >> >> >> >>
>> >> >> >> >> I think this is not a trivial question, because each key would
>> >> >> >> >> be
>> >> >> >> >> hashed to determine the token it belongs to, and
>> >> >> >> >> the token range distribution in turns determine which node the
>> >> >> >> >> key
>> >> >> >> >> belongs
>> >> >> >> >> to.
>> >> >> >> >>
>> >> >> >> >> Any official answer?
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> >> <ja...@genesys.com>:
>> >> >> >> >> > Maybe I misunderstood something but from what I understand,
>> >> >> >> >> > each
>> >> >> >> >> > DC
>> >> >> >> >> > have
>> >> >> >> >> > the same ring (0-100 in you example) but it's split
>> >> >> >> >> > differently
>> >> >> >> >> > between
>> >> >> >> >> > nodes in each DC. I think it's the same principle if using
>> >> >> >> >> > vnode
>> >> >> >> >> > or
>> >> >> >> >> > not.
>> >> >> >> >> >
>> >> >> >> >> > I think the confusion comes from the fact that the ring
>> >> >> >> >> > range
>> >> >> >> >> > is
>> >> >> >> >> > the
>> >> >> >> >> > same (0-100) but each DC manages it differently because
>> >> >> >> >> > nodes
>> >> >> >> >> > are
>> >> >> >> >> > different.
>> >> >> >> >> >
>> >> >> >> >> > --
>> >> >> >> >> > Jacques-Henri Berthemet
>> >> >> >> >> >
>> >> >> >> >> > -----Original Message-----
>> >> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
>> >> >> >> >> > To: user@cassandra.apache.org
>> >> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >> >
>> >> >> >> >> > Thanks for your reply. I also think separate rings are more
>> >> >> >> >> > reasonable.
>> >> >> >> >> >
>> >> >> >> >> > So one ring for one dc is only for c* 1.x or 2.x without
>> >> >> >> >> > vnode?
>> >> >> >> >> >
>> >> >> >> >> > Check these references:
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>> >> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>> >> >> >> >> >
>> >> >> >> >> > Even the riak official said c* splits the ring across dc:
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>> >> >> >> >> >
>> >> >> >> >> > Why they said each dc has its own ring?
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> >> > <ja...@genesys.com>:
>> >> >> >> >> >> Hi,
>> >> >> >> >> >>
>> >> >> >> >> >> Each DC has the whole ring, each DC contains a copy of the
>> >> >> >> >> >> same
>> >> >> >> >> >> data.
>> >> >> >> >> >> When you add replication to a new DC, all data is copied to
>> >> >> >> >> >> the
>> >> >> >> >> >> new
>> >> >> >> >> >> DC.
>> >> >> >> >> >>
>> >> >> >> >> >> Within a DC, each range of token is 'owned' by a (primary)
>> >> >> >> >> >> node
>> >> >> >> >> >> (and
>> >> >> >> >> >> replicas if you have RF > 1). If you add/remove a node in a
>> >> >> >> >> >> DC,
>> >> >> >> >> >> tokens will
>> >> >> >> >> >> be rearranged between all nodes within the DC only, the
>> >> >> >> >> >> other
>> >> >> >> >> >> DCs
>> >> >> >> >> >> won't be
>> >> >> >> >> >> affected.
>> >> >> >> >> >>
>> >> >> >> >> >> --
>> >> >> >> >> >> Jacques-Henri Berthemet
>> >> >> >> >> >>
>> >> >> >> >> >> -----Original Message-----
>> >> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
>> >> >> >> >> >> To: user@cassandra.apache.org
>> >> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >> >>
>> >> >> >> >> >> Hi All,
>> >> >> >> >> >>
>> >> >> >> >> >> I know it seems a stupid question, but I am really confused
>> >> >> >> >> >> about
>> >> >> >> >> >> the
>> >> >> >> >> >> documents on the internet related to this topic, especially
>> >> >> >> >> >> it
>> >> >> >> >> >> seems
>> >> >> >> >> >> that it
>> >> >> >> >> >> has different answers for c* with vnodes or not.
>> >> >> >> >> >>
>> >> >> >> >> >> Let's assume the token range is 1-100 for the whole
>> >> >> >> >> >> cluster,
>> >> >> >> >> >> how
>> >> >> >> >> >> does
>> >> >> >> >> >> it distributed into the datacenters? Think that the number
>> >> >> >> >> >> of
>> >> >> >> >> >> datacenters is
>> >> >> >> >> >> dynamic in a cluster, if there is only one ring, then the
>> >> >> >> >> >> token
>> >> >> >> >> >> range would
>> >> >> >> >> >> change on each node when I add a new datacenter into the
>> >> >> >> >> >> cluster?
>> >> >> >> >> >> Then it
>> >> >> >> >> >> would involve data migration? It doesn't make sense.
>> >> >> >> >> >>
>> >> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> ---------------------------------------------------------------------
>> >> >> >> >> >> To unsubscribe, e-mail:
>> >> >> >> >> >> user-unsubscribe@cassandra.apache.org
>> >> >> >> >> >> For additional commands, e-mail:
>> >> >> >> >> >> user-help@cassandra.apache.org
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> ---------------------------------------------------------------------
>> >> >> >> >> >> To unsubscribe, e-mail:
>> >> >> >> >> >> user-unsubscribe@cassandra.apache.org
>> >> >> >> >> >> For additional commands, e-mail:
>> >> >> >> >> >> user-help@cassandra.apache.org
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > ---------------------------------------------------------------------
>> >> >> >> >> > To unsubscribe, e-mail:
>> >> >> >> >> > user-unsubscribe@cassandra.apache.org
>> >> >> >> >> > For additional commands, e-mail:
>> >> >> >> >> > user-help@cassandra.apache.org
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> ---------------------------------------------------------------------
>> >> >> >> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> >> >> >> >> For additional commands, e-mail:
>> >> >> >> >> user-help@cassandra.apache.org
>> >> >> >> >>
>> >> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> ---------------------------------------------------------------------
>> >> >> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> >> >> >> For additional commands, e-mail: user-help@cassandra.apache.org
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >>
>> >> >> ---------------------------------------------------------------------
>> >> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> >> >> For additional commands, e-mail: user-help@cassandra.apache.org
>> >> >>
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> >> For additional commands, e-mail: user-help@cassandra.apache.org
>> >>
>> >
>> >
>> >
>> > --
>> > Best regards,
>> > Xiaolong Jiang
>> >
>> > Software Engineer at Apple
>> > Columbia University
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> For additional commands, e-mail: user-help@cassandra.apache.org
>>
>


Re: does c* 3.0 use one ring for all datacenters?

Posted by Jeff Jirsa <jj...@gmail.com>.

On Thu, Apr 26, 2018 at 1:34 AM, Jinhua Luo <lu...@gmail.com> wrote:

> How do you guarantee that tokens are independent between DCs?


Cassandra won't let you have duplicate tokens: a node won't start if you
assign one by mistake, and Cassandra won't assign one automatically.


> They form one
> ring, so they must be (re-)assigned when needed.
>

Tokens don't move automatically. There's no auto-reassignment. You can move
a token, but nothing does it automatically.
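
To make that concrete, here is a minimal sketch using the example tokens
from earlier in this thread (DC1 = 0, 10, ..., 80; DC2 = DC1 + 1; DC3
assigned randomly). The function names are hypothetical and the model is
simplified: one token per host, racks ignored, and a key is owned by the
first token at or after the key's token. It is an illustration of the
ring walk, not Cassandra's actual implementation.

```python
from bisect import bisect_left

# Example tokens from this thread; each token stands for one host.
DC1 = [0, 10, 20, 30, 40, 50, 60, 70, 80]
DC2 = [1, 11, 21, 31, 41, 51, 61, 71, 81]
DC3 = [4, 17, 22, 36, 48, 53, 64, 73, 83]


def build_ring(dcs):
    """Merge per-DC token lists into one sorted ring of (token, dc) pairs."""
    ring = sorted((t, dc) for dc, tokens in dcs.items() for t in tokens)
    tokens = [t for t, _ in ring]
    if len(set(tokens)) != len(tokens):
        # Mirrors the rule that duplicate tokens are not allowed.
        raise ValueError("duplicate token on the ring")
    return ring


def walk(ring, key_token):
    """Yield ring entries clockwise from the first token >= key_token."""
    start = bisect_left([t for t, _ in ring], key_token)
    for i in range(len(ring)):
        yield ring[(start + i) % len(ring)]


def simple_replicas(ring, key_token, rf):
    """SimpleStrategy-like: the next rf tokens on the ring, any DC."""
    return [t for t, _ in walk(ring, key_token)][:rf]


def nts_replicas(ring, key_token, rf_per_dc):
    """NetworkTopologyStrategy-like: the next rf tokens within each DC."""
    picked = {dc: [] for dc in rf_per_dc}
    for token, dc in walk(ring, key_token):
        if dc in picked and len(picked[dc]) < rf_per_dc[dc]:
            picked[dc].append(token)
    return picked


two_dc = build_ring({"DC1": DC1, "DC2": DC2})
three_dc = build_ring({"DC1": DC1, "DC2": DC2, "DC3": DC3})
rf = {"DC1": 3, "DC2": 3}
# DC1 and DC2 replicas for token 5 are the same before and after DC3 joins.
print(nts_replicas(two_dc, 5, rf))    # {'DC1': [10, 20, 30], 'DC2': [11, 21, 31]}
print(nts_replicas(three_dc, 5, rf))  # {'DC1': [10, 20, 30], 'DC2': [11, 21, 31]}
```

In this model, adding DC3 only inserts new tokens between the existing
ones. With per-DC replication the walk skips tokens from other DCs, so the
replicas chosen inside DC1 and DC2 stay the same.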


> Use an offset per DC? But then it seems the DC list must be fixed in advance.
> To keep the tokens evenly distributed around the ring across the DCs,
> might the tokens owned by each DC need to change?
> Could you please give a detailed token re-balancing procedure for adding
> or removing a node?
>
>

Calculate final state. Run repair and cleanup. Move tokens as needed. If
you're not able to reason through this, you may want to consider using
vnodes so it becomes less of an issue.
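
For single-token nodes, "calculate final state" commonly means choosing
evenly spaced tokens within each DC and shifting each DC by a small offset
so that no two nodes share a token. A minimal sketch, assuming
Murmur3Partitioner (token range -2^63 to 2^63 - 1); the function name and
the offset of 100 are illustrative, not something Cassandra provides:

```python
RING_MIN = -2**63   # smallest Murmur3Partitioner token
RING_SIZE = 2**64   # number of distinct tokens on the ring


def initial_tokens(node_count, dc_offset=0):
    """Evenly spaced tokens for one DC, shifted by dc_offset to avoid collisions."""
    step = RING_SIZE // node_count
    return [RING_MIN + i * step + dc_offset for i in range(node_count)]


dc1 = initial_tokens(4)                  # candidate initial_token values for DC1
dc2 = initial_tokens(4, dc_offset=100)   # same spacing, shifted by 100
assert set(dc1).isdisjoint(dc2)          # no duplicate tokens across DCs
```

A new DC can pick any unused offset later, so the list of DCs does not have
to be fixed up front. Changing the node count inside one DC changes that
DC's ideal spacing only; those are the tokens you would then move by hand.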



Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

How do you guarantee that tokens are independent between DCs? They form one
ring, so they must be (re-)assigned when needed.
Use an offset per DC? But then it seems the DC list must be fixed in advance.
To keep the tokens evenly distributed around the ring across the DCs,
might the tokens owned by each DC need to change?
Could you please give a detailed token re-balancing procedure for adding
or removing a node?

2018-04-26 16:23 GMT+08:00 Xiaolong Jiang <xi...@gmail.com>:
> DCs are independent of each other. Adding nodes to DC1 won't have any
> effect on the tokens owned by other DCs.
>
> On Thu, Apr 26, 2018 at 1:04 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>
>> You're assuming each DC has the same total num_tokens, right?
>> If I add a new node to DC1, will it change the tokens owned by DC2 and
>> DC3?
>>
>> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> > When you add DC3, its nodes get tokens that aren't currently in use in
>> > any existing DC. You can assign tokens manually (let's pretend we
>> > manually assigned the other ones, since DC2 = DC1 + 1), or Cassandra can
>> > auto-calculate them; the exact behavior varies by version.
>> >
>> >
>> > Let's pretend it's old-style random assignment, and we end up with DC3
>> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
>> >
>> > In this case:
>> >
>> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>> > on the hosts with tokens 10, 11, 17
>> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> > would be placed on the hosts with tokens 10, 20, 30; 11, 21, 31; 17, 22, 36
>> >
>> >
>> >
>> >
>> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
>> >>
>> >> What if I add a new DC3?
>> >> Would the token ranges be reshuffled among DC1, DC2, and DC3?
>> >>
>> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> >> > Confirming again that it's definitely one ring.
>> >> >
>> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>> >> >
>> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>> >> > on the hosts with tokens 10, 11, 20
>> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> >> > would be placed on the hosts with tokens 10, 20, 30 and 11, 21, 31
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Is it a different answer? One ring?
>> >> >>
>> >> >> Could you explain your answer according to my example?
>> >> >>
>> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
>> >> >> > There has always been a single ring.
>> >> >> >
>> >> >> > You can specify how many nodes in each DC you want and it’ll figure
>> >> >> > out how to do it, as long as you have the right snitch and are using
>> >> >> > NetworkTopologyStrategy.
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Let me clarify my question:
>> >> >> >>
>> >> >> >> Given a cluster of two DCs, where each DC has 2 nodes and each node
>> >> >> >> sets num_tokens to 50,
>> >> >> >> how are token ranges distributed in the cluster?
>> >> >> >>
>> >> >> >> If there is one global ring, then it may be (to simplify the case,
>> >> >> >> let's assume vnodes=1):
>> >> >> >> {dc1, node1} 1-50
>> >> >> >> {dc2, node1} 51-100
>> >> >> >> {dc1, node1} 101-150
>> >> >> >> {dc1, node2} 151-200
>> >> >> >>
>> >> >> >> But this raises more questions:
>> >> >> >> a) What if I add a new datacenter? Do the token ranges then need to be
>> >> >> >> re-balanced?
>> >> >> >> If so, what happens to the data associated with the re-balanced ranges?
>> >> >> >> Is it moved among DCs?
>> >> >> >> But that doesn't make sense, because each keyspace specifies its
>> >> >> >> snitch and fixes the DCs that store its data.
>> >> >> >>
>> >> >> >> b) There seems to be no benefit from a shared ring, because of the snitch.
>> >> >> >>
>> >> >> >> If each DC has its own ring, then it may be:
>> >> >> >> {dc1, node1} 1-50
>> >> >> >> {dc1, node1} 51-100
>> >> >> >> {dc2, node1} 1-50
>> >> >> >> {dc2, node1} 51-100
>> >> >> >>
>> >> >> >> I think this is not a trivial question, because each key is hashed to
>> >> >> >> determine the token it belongs to, and the token range distribution in
>> >> >> >> turn determines which node the key belongs to.
>> >> >> >>
>> >> >> >> Any official answer?
>> >> >> >>
>> >> >> >>
>> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> <ja...@genesys.com>:
>> >> >> >> > Maybe I misunderstood something, but from what I understand, each DC
>> >> >> >> > has the same ring (0-100 in your example) but it's split differently
>> >> >> >> > between the nodes in each DC. I think it's the same principle whether
>> >> >> >> > you use vnodes or not.
>> >> >> >> >
>> >> >> >> > I think the confusion comes from the fact that the ring range is the
>> >> >> >> > same (0-100) but each DC manages it differently because the nodes are
>> >> >> >> > different.
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Jacques-Henri Berthemet
>> >> >> >> >
>> >> >> >> > -----Original Message-----
>> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
>> >> >> >> > To: user@cassandra.apache.org
>> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >
>> >> >> >> > Thanks for your reply. I also think separate rings are more
>> >> >> >> > reasonable.
>> >> >> >> >
>> >> >> >> > So is one ring per DC only for c* 1.x or 2.x without vnodes?
>> >> >> >> >
>> >> >> >> > Check these references:
>> >> >> >> >
>> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
>> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>> >> >> >> >
>> >> >> >> > Even the official Riak site says c* splits the ring across DCs:
>> >> >> >> >
>> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>> >> >> >> >
>> >> >> >> > Why do they say each DC has its own ring?
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> > <ja...@genesys.com>:
>> >> >> >> >> Hi,
>> >> >> >> >>
>> >> >> >> >> Each DC has the whole ring, and each DC contains a copy of the same
>> >> >> >> >> data. When you add replication to a new DC, all data is copied to the
>> >> >> >> >> new DC.
>> >> >> >> >>
>> >> >> >> >> Within a DC, each token range is 'owned' by a (primary) node (and by
>> >> >> >> >> replicas if you have RF > 1). If you add/remove a node in a DC, tokens
>> >> >> >> >> will be rearranged between the nodes within that DC only; the other
>> >> >> >> >> DCs won't be affected.
>> >> >> >> >>
>> >> >> >> >> --
>> >> >> >> >> Jacques-Henri Berthemet
>> >> >> >> >>
>> >> >> >> >> -----Original Message-----
>> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
>> >> >> >> >> To: user@cassandra.apache.org
>> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >>
>> >> >> >> >> Hi All,
>> >> >> >> >>
>> >> >> >> >> I know it seems a stupid question, but I am really confused by the
>> >> >> >> >> documents on the internet related to this topic, especially since
>> >> >> >> >> the answer seems to differ for c* with or without vnodes.
>> >> >> >> >>
>> >> >> >> >> Let's assume the token range is 1-100 for the whole cluster. How is
>> >> >> >> >> it distributed across the datacenters? Consider that the number of
>> >> >> >> >> datacenters in a cluster is dynamic: if there is only one ring, would
>> >> >> >> >> the token range on each node change when I add a new datacenter to
>> >> >> >> >> the cluster? Would that then involve data migration? It doesn't make
>> >> >> >> >> sense.
>> >> >> >> >>
>> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!
>
> --
> Best regards,
> Xiaolong Jiang
>
> Software Engineer at Apple
> Columbia University


Re: does c* 3.0 use one ring for all datacenters?

Posted by Xiaolong Jiang <xi...@gmail.com>.

DCs are independent of each other. Adding nodes to DC1 won't have any effect
on the tokens owned by other DCs.
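
To make that concrete, here is a minimal Python sketch (not Cassandra's actual code). It assumes the small integer tokens from Jeff's example quoted in this thread, treats every token as a separate host, and ignores racks. The helper name dc_replicas and the new token 7 are made up for illustration.

```python
from bisect import bisect_left

def dc_replicas(ring, key_token, dc, rf):
    """Walk the single ring clockwise from key_token and collect the
    first rf tokens that belong to datacenter dc (racks are ignored)."""
    tokens = sorted(ring)
    start = bisect_left(tokens, key_token)  # first token >= key_token
    picked = []
    for i in range(len(tokens)):
        token = tokens[(start + i) % len(tokens)]  # wrap around the ring
        if ring[token] == dc:
            picked.append(token)
            if len(picked) == rf:
                break
    return picked

# token -> datacenter, one shared ring (values from the quoted example)
ring = {t: "DC1" for t in range(0, 90, 10)}       # 0, 10, ..., 80
ring.update({t: "DC2" for t in range(1, 91, 10)})  # 1, 11, ..., 81

before = dc_replicas(ring, 5, "DC2", 3)

# Add one node to DC1 with a hypothetical token 7.
ring[7] = "DC1"
after = dc_replicas(ring, 5, "DC2", 3)

print(before, after)                     # DC2 replicas for the key, before and after
print(dc_replicas(ring, 5, "DC1", 3))    # DC1 replicas after the new node joined
```

In this sketch the DC2 replicas for the key are identical before and after the change; only the DC1 replica list picks up the new token.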

On Thu, Apr 26, 2018 at 1:04 AM, Jinhua Luo <lu...@gmail.com> wrote:

> You're assuming each DC has the same total num_tokens, right?
> If I add a new node to DC1, will it change the tokens owned by DC2 and
> DC3?
>
> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> > When you add DC3, its nodes will get tokens (that aren't currently in use
> > in any existing DC). Either you assign the tokens yourself (let's pretend
> > we manually assigned the other ones, since DC2 = DC1 + 1), or Cassandra
> > can auto-calculate them, the exact behavior of which varies by version.
> >
> >
> > Let's pretend it's old style random assignment, and we end up with DC3
> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
> >
> > In this case:
> >
> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> > the hosts with token 10, 11, 17
> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> > would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22, 36
> >
> >
> >
> >
> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
> >>
> >> What if I add a new DC3?
> >> Would the token ranges be reshuffled among DC1, DC2, and DC3?
> >>
> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> >> > Confirming again that it's definitely one ring.
> >> >
> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
> >> >
> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed
> >> > on
> >> > the hosts with token 10, 11, 20
> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> >> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com> wrote:
> >> >>
> >> >> Is it a different answer? One ring?
> >> >>
> >> >> Could you explain your answer according to my example?
> >> >>
> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
> >> >> > There has always been a single ring.
> >> >> >
> >> >> > You can specify how many nodes in each DC you want and it’ll figure
> >> >> > out how to do it as long as you have the right snitch and are using
> >> >> > NetworkTopologyStrategy.
> >> >> >
> >> >> >
> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Let me clarify my question:
> >> >> >>
> >> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, and each
> >> >> >> node sets num_tokens to 50.
> >> >> >> Then how are token ranges distributed in the cluster?
> >> >> >>
> >> >> >> If there is one global ring, then it may be (to simplify the case,
> >> >> >> let's assume vnodes=1):
> >> >> >> {dc1, node1} 1-50
> >> >> >> {dc2, node1} 51-100
> >> >> >> {dc1, node1} 101-150
> >> >> >> {dc1, node2} 151-200
> >> >> >>
> >> >> >> But here come more questions:
> >> >> >> a) What if I add a new datacenter? Do the token ranges then need to
> >> >> >> be re-balanced?
> >> >> >> If so, what about the data associated with the ranges being
> >> >> >> re-balanced? Is it moved among DCs?
> >> >> >> But that doesn't make sense, because each keyspace would specify its
> >> >> >> snitch and fix the DCs to store the data in.
> >> >> >>
> >> >> >> b) There seems to be no benefit from sharing one ring, because of
> >> >> >> the snitch.
> >> >> >>
> >> >> >> If each DC has own ring, then it may be:
> >> >> >> {dc1, node1} 1-50
> >> >> >> {dc1, node1} 51-100
> >> >> >> {dc2, node1} 1-50
> >> >> >> {dc2, node1} 51-100
> >> >> >>
> >> >> >> I think this is not a trivial question, because each key would be
> >> >> >> hashed to determine the token it belongs to, and
> >> >> >> the token range distribution in turns determine which node the key
> >> >> >> belongs
> >> >> >> to.
> >> >> >>
> >> >> >> Any official answer?
> >> >> >>
> >> >> >>
> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
> >> >> >> <ja...@genesys.com>:
> >> >> >> > Maybe I misunderstood something but from what I understand, each
> >> >> >> > DC has the same ring (0-100 in your example) but it's split
> >> >> >> > differently between nodes in each DC. I think it's the same
> >> >> >> > principle whether using vnodes or not.
> >> >> >> >
> >> >> >> > I think the confusion comes from the fact that the ring range is
> >> >> >> > the same (0-100) but each DC manages it differently because nodes
> >> >> >> > are different.
> >> >> >> >
> >> >> >> > --
> >> >> >> > Jacques-Henri Berthemet
> >> >> >> >
> >> >> >> > -----Original Message-----
> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
> >> >> >> > To: user@cassandra.apache.org
> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
> >> >> >> >
> >> >> >> > Thanks for your reply. I also think separate rings are more
> >> >> >> > reasonable.
> >> >> >> >
> >> >> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
> >> >> >> >
> >> >> >> > Check these references:
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
> >> >> >> >
> >> >> >> > Even the official Riak site says c* splits the ring across DCs:
> >> >> >> >
> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
> >> >> >> >
> >> >> >> > Why do they say each DC has its own ring?
> >> >> >> >
> >> >> >> >
> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
> >> >> >> > <ja...@genesys.com>:
> >> >> >> >> Hi,
> >> >> >> >>
> >> >> >> >> Each DC has the whole ring, each DC contains a copy of the same
> >> >> >> >> data.
> >> >> >> >> When you add replication to a new DC, all data is copied to the
> >> >> >> >> new
> >> >> >> >> DC.
> >> >> >> >>
> >> >> >> >> Within a DC, each range of token is 'owned' by a (primary) node
> >> >> >> >> (and replicas if you have RF > 1). If you add/remove a node in a
> >> >> >> >> DC, tokens will be rearranged between all nodes within the DC
> >> >> >> >> only, the other DCs won't be affected.
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> Jacques-Henri Berthemet
> >> >> >> >>
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
> >> >> >> >> To: user@cassandra.apache.org
> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
> >> >> >> >>
> >> >> >> >> Hi All,
> >> >> >> >>
> >> >> >> >> I know it seems a stupid question, but I am really confused about
> >> >> >> >> the documents on the internet related to this topic, especially it
> >> >> >> >> seems that it has different answers for c* with vnodes or not.
> >> >> >> >>
> >> >> >> >> Let's assume the token range is 1-100 for the whole cluster, how
> >> >> >> >> does it distributed into the datacenters? Think that the number of
> >> >> >> >> datacenters is dynamic in a cluster, if there is only one ring,
> >> >> >> >> then the token range would change on each node when I add a new
> >> >> >> >> datacenter into the cluster? Then it would involve data migration?
> >> >> >> >> It doesn't make sense.
> >> >> >> >>
> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!


-- 
Best regards,
Xiaolong Jiang

Software Engineer at Apple
Columbia University

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

You're assuming each DC has the same total num_tokens, right?
If I add a new node to DC1, will it change the tokens owned by DC2 and DC3?

2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> When you add DC3, its nodes will get tokens (that aren't currently in use in
> any existing DC). Either you assign the tokens yourself (let's pretend we
> manually assigned the other ones, since DC2 = DC1 + 1), or Cassandra can
> auto-calculate them, the exact behavior of which varies by version.
>
>
> Let's pretend it's old style random assignment, and we end up with DC3
> having 4, 17, 22, 36, 48, 53, 64, 73, 83
>
> In this case:
>
> If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> the hosts with token 10, 11, 17
> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22, 36
>
>
>
>
> On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>
>> What if I add a new DC3?
>> Would the token ranges be reshuffled among DC1, DC2, and DC3?
>>
>> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> > Confirming again that it's definitely one ring.
>> >
>> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>> >
>> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>> > on
>> > the hosts with token 10, 11, 20
>> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com> wrote:
>> >>
>> >> Is it a different answer? One ring?
>> >>
>> >> Could you explain your answer according to my example?
>> >>
>> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
>> >> > There has always been a single ring.
>> >> >
>> >> > You can specify how many nodes in each DC you want and it’ll figure
>> >> > out how to do it as long as you have the right snitch and are using
>> >> > NetworkTopologyStrategy.
>> >> >
>> >> >
>> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Let me clarify my question:
>> >> >>
>> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, and each
>> >> >> node sets num_tokens to 50.
>> >> >> Then how are token ranges distributed in the cluster?
>> >> >>
>> >> >> If there is one global ring, then it may be (to simplify the case,
>> >> >> let's assume vnodes=1):
>> >> >> {dc1, node1} 1-50
>> >> >> {dc2, node1} 51-100
>> >> >> {dc1, node1} 101-150
>> >> >> {dc1, node2} 151-200
>> >> >>
>> >> >> But here come more questions:
>> >> >> a) What if I add a new datacenter? Do the token ranges then need to
>> >> >> be re-balanced?
>> >> >> If so, what about the data associated with the ranges being
>> >> >> re-balanced? Is it moved among DCs?
>> >> >> But that doesn't make sense, because each keyspace would specify its
>> >> >> snitch and fix the DCs to store the data in.
>> >> >>
>> >> >> b) There seems to be no benefit from sharing one ring, because of
>> >> >> the snitch.
>> >> >>
>> >> >> If each DC has own ring, then it may be:
>> >> >> {dc1, node1} 1-50
>> >> >> {dc1, node1} 51-100
>> >> >> {dc2, node1} 1-50
>> >> >> {dc2, node1} 51-100
>> >> >>
>> >> >> I think this is not a trivial question, because each key would be
>> >> >> hashed to determine the token it belongs to, and
>> >> >> the token range distribution in turns determine which node the key
>> >> >> belongs
>> >> >> to.
>> >> >>
>> >> >> Any official answer?
>> >> >>
>> >> >>
>> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>> >> >> <ja...@genesys.com>:
>> >> >> > Maybe I misunderstood something but from what I understand, each
>> >> >> > DC has the same ring (0-100 in your example) but it's split
>> >> >> > differently between nodes in each DC. I think it's the same
>> >> >> > principle whether using vnodes or not.
>> >> >> >
>> >> >> > I think the confusion comes from the fact that the ring range is
>> >> >> > the same (0-100) but each DC manages it differently because nodes
>> >> >> > are different.
>> >> >> >
>> >> >> > --
>> >> >> > Jacques-Henri Berthemet
>> >> >> >
>> >> >> > -----Original Message-----
>> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
>> >> >> > To: user@cassandra.apache.org
>> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
>> >> >> >
>> >> >> > Thanks for your reply. I also think separate rings are more
>> >> >> > reasonable.
>> >> >> >
>> >> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
>> >> >> >
>> >> >> > Check these references:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>> >> >> >
>> >> >> > Even the official Riak site says c* splits the ring across DCs:
>> >> >> >
>> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>> >> >> >
>> >> >> > Why do they say each DC has its own ring?
>> >> >> >
>> >> >> >
>> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>> >> >> > <ja...@genesys.com>:
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> Each DC has the whole ring, each DC contains a copy of the same
>> >> >> >> data.
>> >> >> >> When you add replication to a new DC, all data is copied to the
>> >> >> >> new
>> >> >> >> DC.
>> >> >> >>
>> >> >> >> Within a DC, each range of token is 'owned' by a (primary) node
>> >> >> >> (and
>> >> >> >> replicas if you have RF > 1). If you add/remove a node in a DC,
>> >> >> >> tokens will
>> >> >> >> be rearranged between all nodes within the DC only, the other DCs
>> >> >> >> won't be
>> >> >> >> affected.
>> >> >> >>
>> >> >> >> --
>> >> >> >> Jacques-Henri Berthemet
>> >> >> >>
>> >> >> >> -----Original Message-----
>> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
>> >> >> >> To: user@cassandra.apache.org
>> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
>> >> >> >>
>> >> >> >> Hi All,
>> >> >> >>
>> >> >> >> I know it seems a stupid question, but I am really confused about
>> >> >> >> the
>> >> >> >> documents on the internet related to this topic, especially it
>> >> >> >> seems
>> >> >> >> that it
>> >> >> >> has different answers for c* with vnodes or not.
>> >> >> >>
>> >> >> >> Let's assume the token range is 1-100 for the whole cluster, how
>> >> >> >> does
>> >> >> >> it distributed into the datacenters? Think that the number of
>> >> >> >> datacenters is
>> >> >> >> dynamic in a cluster, if there is only one ring, then the token
>> >> >> >> range would
>> >> >> >> change on each node when I add a new datacenter into the cluster?
>> >> >> >> Then it
>> >> >> >> would involve data migration? It doesn't make sense.
>> >> >> >>
>> >> >> >> Looking forward to clarification for c* 3.0, thanks!


RE: [EXTERNAL] Re: does c* 3.0 use one ring for all datacenters?

Posted by "Durity, Sean R" <SE...@homedepot.com>.

Perhaps it would help to also note that every DC has a full copy (or more) of the data. Data for a given token exists in all DCs (assuming the keyspace is replicated to all DCs). A given node’s token ownership just determines where that data goes *for that DC*.
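
A rough Python sketch of that idea, using the token values from Jeff's example quoted in this thread. It is a simplification, not Cassandra's implementation: tokens are small integers instead of real partitioner output, every token is assumed to be a different host, and racks are ignored. The function names are invented for the illustration.

```python
from bisect import bisect_left
from itertools import islice

def walk(tokens, key_token):
    """Yield ring tokens clockwise, starting at the first token >= key_token."""
    start = bisect_left(tokens, key_token)
    for i in range(len(tokens)):
        yield tokens[(start + i) % len(tokens)]  # wrap around the ring

def simple_strategy(ring, key_token, rf):
    """First rf tokens on the ring, regardless of datacenter."""
    return list(islice(walk(sorted(ring), key_token), rf))

def network_topology_strategy(ring, key_token, rf_per_dc):
    """First rf tokens per datacenter, found on the same single ring."""
    placed = {dc: [] for dc in rf_per_dc}
    for token in walk(sorted(ring), key_token):
        dc = ring[token]
        if dc in placed and len(placed[dc]) < rf_per_dc[dc]:
            placed[dc].append(token)
    return placed

# One ring; each token is tagged with the DC of the node that owns it.
ring = {t: "DC1" for t in range(0, 90, 10)}        # 0, 10, ..., 80
ring.update({t: "DC2" for t in range(1, 91, 10)})   # 1, 11, ..., 81
ring.update({t: "DC3" for t in (4, 17, 22, 36, 48, 53, 64, 73, 83)})

print(simple_strategy(ring, 5, 3))
print(network_topology_strategy(ring, 5, {"DC1": 3, "DC2": 3, "DC3": 3}))
```

For a key with token 5 this gives 10, 11, 17 under the SimpleStrategy-like walk, and 10, 20, 30 / 11, 21, 31 / 17, 22, 36 for the per-DC walk, so every DC ends up with its own three copies of the same key.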


Sean Durity
From: Jeff Jirsa [mailto:jjirsa@gmail.com]
Sent: Wednesday, April 11, 2018 1:55 PM
To: cassandra <us...@cassandra.apache.org>
Subject: [EXTERNAL] Re: does c* 3.0 use one ring for all datacenters?

The concept of DCs is just a feature offered by one replication strategy on top of the single ring.

What would multiple rings look like for a non-NTS replication strategy? It would become meaningless and broken.

On Wed, Apr 11, 2018 at 10:33 AM, Jinhua Luo <lu...@gmail.com> wrote:
Even under one ring, when you write a key, you need to iterate over the DC
list declared in the snitch.
And for each iteration, you need to skip token ranges not owned by that DC.
So I could not figure out the rationale for mixing DCs into one ring.

However, if each DC has its own ring, I could repeat SimpleStrategy
procedure on each DC, isn't that simpler?

2018-04-12 1:13 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> It's probably mostly a carry-over from dynamo paper.
>
> I'm skeptical of your claim that N token rings is somehow easier than 1,
> especially given the lack of familiarity with the codebase to know how many
> rings are involved.
>
>
>
> On Wed, Apr 11, 2018 at 10:07 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>
>> What are the benefits of one ring?
>> It seems that separate rings could achieve the same goal, and make the
>> architecture simpler.
>>
>> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> > When you add DC3, its nodes will get tokens (that aren't currently in
>> > use in any existing DC). Either you assign the tokens yourself (let's
>> > pretend we manually assigned the other ones, since DC2 = DC1 + 1), or
>> > Cassandra can auto-calculate them, the exact behavior of which varies
>> > by version.
>> >
>> >
>> > Let's pretend it's old style random assignment, and we end up with DC3
>> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
>> >
>> > In this case:
>> >
>> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed
>> > on
>> > the hosts with token 10, 11, 17
>> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> > would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22,
>> > 36
>> >
>> >
>> >
>> >
>> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
>> >>
>> >> What if I add a new DC3?
>> >> Would the token ranges be reshuffled among DC1, DC2, and DC3?
>> >>
>> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> >> > Confirming again that it's definitely one ring.
>> >> >
>> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>> >> >
>> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be
>> >> > placed
>> >> > on
>> >> > the hosts with token 10, 11, 20
>> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token
>> >> > 5
>> >> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Is it a different answer? One ring?
>> >> >>
>> >> >> Could you explain your answer according to my example?
>> >> >>
>> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
>> >> >> > There has always been a single ring.
>> >> >> >
>> >> >> > You can specify how many nodes in each DC you want and it’ll
>> >> >> > figure out how to do it as long as you have the right snitch and
>> >> >> > are using NetworkTopologyStrategy.
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Let me clarify my question:
>> >> >> >>
>> >> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, and each
>> >> >> >> node sets num_tokens to 50.
>> >> >> >> Then how are token ranges distributed in the cluster?
>> >> >> >>
>> >> >> >> If there is one global ring, then it may be (to simplify the case,
>> >> >> >> let's assume vnodes=1):
>> >> >> >> {dc1, node1} 1-50
>> >> >> >> {dc2, node1} 51-100
>> >> >> >> {dc1, node1} 101-150
>> >> >> >> {dc1, node2} 151-200
>> >> >> >>
>> >> >> >> But here come more questions:
>> >> >> >> a) What if I add a new datacenter? Do the token ranges then need
>> >> >> >> to be re-balanced?
>> >> >> >> If so, what about the data associated with the ranges being
>> >> >> >> re-balanced? Is it moved among DCs?
>> >> >> >> But that doesn't make sense, because each keyspace would specify
>> >> >> >> its snitch and fix the DCs to store the data in.
>> >> >> >>
>> >> >> >> b) There seems to be no benefit from sharing one ring, because of
>> >> >> >> the snitch.
>> >> >> >>
>> >> >> >> If each DC has own ring, then it may be:
>> >> >> >> {dc1, node1} 1-50
>> >> >> >> {dc1, node1} 51-100
>> >> >> >> {dc2, node1} 1-50
>> >> >> >> {dc2, node1} 51-100
>> >> >> >>
>> >> >> >> I think this is not a trivial question, because each key would be
>> >> >> >> hashed to determine the token it belongs to, and
>> >> >> >> the token range distribution in turns determine which node the
>> >> >> >> key
>> >> >> >> belongs
>> >> >> >> to.
>> >> >> >>
>> >> >> >> Any official answer?
>> >> >> >>
>> >> >> >>
>> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> <ja...@genesys.com>:
>> >> >> >> > Maybe I misunderstood something but from what I understand,
>> >> >> >> > each DC has the same ring (0-100 in your example) but it's
>> >> >> >> > split differently between nodes in each DC. I think it's the
>> >> >> >> > same principle whether using vnodes or not.
>> >> >> >> >
>> >> >> >> > I think the confusion comes from the fact that the ring range
>> >> >> >> > is the same (0-100) but each DC manages it differently because
>> >> >> >> > nodes are different.
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Jacques-Henri Berthemet
>> >> >> >> >
>> >> >> >> > -----Original Message-----
>> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
>> >> >> >> > To: user@cassandra.apache.org
>> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >
>> >> >> >> > Thanks for your reply. I also think separate rings are more
>> >> >> >> > reasonable.
>> >> >> >> >
>> >> >> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
>> >> >> >> >
>> >> >> >> > Check these references:
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>> >> >> >> >
>> >> >> >> > Even the official Riak site says c* splits the ring across DCs:
>> >> >> >> >
>> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>> >> >> >> >
>> >> >> >> > Why do they say each DC has its own ring?
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> > <ja...@genesys.com>:
>> >> >> >> >> Hi,
>> >> >> >> >>
>> >> >> >> >> Each DC has the whole ring, each DC contains a copy of the
>> >> >> >> >> same
>> >> >> >> >> data.
>> >> >> >> >> When you add replication to a new DC, all data is copied to
>> >> >> >> >> the
>> >> >> >> >> new
>> >> >> >> >> DC.
>> >> >> >> >>
>> >> >> >> >> Within a DC, each range of token is 'owned' by a (primary)
>> >> >> >> >> node
>> >> >> >> >> (and
>> >> >> >> >> replicas if you have RF > 1). If you add/remove a node in a
>> >> >> >> >> DC,
>> >> >> >> >> tokens will
>> >> >> >> >> be rearranged between all nodes within the DC only, the other
>> >> >> >> >> DCs
>> >> >> >> >> won't be
>> >> >> >> >> affected.
>> >> >> >> >>
>> >> >> >> >> --
>> >> >> >> >> Jacques-Henri Berthemet
>> >> >> >> >>
>> >> >> >> >> -----Original Message-----
>> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
>> >> >> >> >> To: user@cassandra.apache.org
>> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >>
>> >> >> >> >> Hi All,
>> >> >> >> >>
>> >> >> >> >> I know it seems a stupid question, but I am really confused
>> >> >> >> >> about
>> >> >> >> >> the
>> >> >> >> >> documents on the internet related to this topic, especially it
>> >> >> >> >> seems
>> >> >> >> >> that it
>> >> >> >> >> has different answers for c* with vnodes or not.
>> >> >> >> >>
>> >> >> >> >> Let's assume the token range is 1-100 for the whole cluster,
>> >> >> >> >> how
>> >> >> >> >> does
>> >> >> >> >> it distributed into the datacenters? Think that the number of
>> >> >> >> >> datacenters is
>> >> >> >> >> dynamic in a cluster, if there is only one ring, then the
>> >> >> >> >> token
>> >> >> >> >> range would
>> >> >> >> >> change on each node when I add a new datacenter into the
>> >> >> >> >> cluster?
>> >> >> >> >> Then it
>> >> >> >> >> would involve data migration? It doesn't make sense.
>> >> >> >> >>
>> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!



Re: does c* 3.0 use one ring for all datacenters?

Posted by Jeff Jirsa <jj...@gmail.com>.

The concept of DCs is *just* a feature offered by one replication strategy
on top of the single ring.

What would multiple rings look like for a non-NTS replication strategy? It
would become meaningless and broken.
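
To make the single-ring idea concrete, here is a minimal Python sketch (not
Cassandra's actual code) that runs both replication strategies over the same
ring, using the DC1/DC2 example tokens quoted further down in this thread. It
assumes one token per host and ignores racks; the function names are made up
for illustration.

```python
from bisect import bisect_left

# Toy ring from the example in this thread: one token per host, racks ignored.
DC_TOKENS = {
    "DC1": [0, 10, 20, 30, 40, 50, 60, 70, 80],
    "DC2": [1, 11, 21, 31, 41, 51, 61, 71, 81],
}

def build_ring(dc_tokens):
    """Merge every DC's tokens into one sorted ring of (token, dc) pairs."""
    return sorted((t, dc) for dc, tokens in dc_tokens.items() for t in tokens)

def walk(ring, key_token):
    """Yield ring entries clockwise, starting at the first token >= key_token."""
    start = bisect_left(ring, (key_token,))
    for i in range(len(ring)):
        yield ring[(start + i) % len(ring)]

def simple_strategy(ring, key_token, rf):
    """Take the next rf hosts on the ring, ignoring which DC they are in."""
    return [token for token, _ in walk(ring, key_token)][:rf]

def network_topology_strategy(ring, key_token, rf_per_dc):
    """Walk the same ring once, keeping the first rf hosts seen in each DC."""
    replicas = {dc: [] for dc in rf_per_dc}
    for token, dc in walk(ring, key_token):
        if dc in replicas and len(replicas[dc]) < rf_per_dc[dc]:
            replicas[dc].append(token)
    return replicas

ring = build_ring(DC_TOKENS)
print(simple_strategy(ring, 5, 3))
print(network_topology_strategy(ring, 5, {"DC1": 3, "DC2": 3}))
```

For a key with token 5 this picks hosts 10, 11, 20 under the SimpleStrategy
walk and 10,20,30 plus 11,21,31 under the per-DC walk, the same hosts as in
the quoted example. The only difference between the two functions is whether
the DC label is looked at while walking; the ring itself is shared.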

On Wed, Apr 11, 2018 at 10:33 AM, Jinhua Luo <lu...@gmail.com> wrote:

> Even under one ring, when you write a key, you need to iterate the DC
> list declared in the snitch.
> And for each iteration, you need to skip token ranges not owned by that DC.
> So I could not figure out the rationale for mixing DCs into one ring.
>
> However, if each DC has its own ring, I could repeat SimpleStrategy
> procedure on each DC, isn't that simpler?
>
> 2018-04-12 1:13 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> > It's probably mostly a carry-over from dynamo paper.
> >
> > I'm skeptical of your claim that N token rings is somehow easier than 1,
> > especially given the lack of familiarity with the codebase to know how many
> > rings are involved.
> >
> >
> >
> > On Wed, Apr 11, 2018 at 10:07 AM, Jinhua Luo <lu...@gmail.com> wrote:
> >>
> >> What are the benefits of one ring?
> >> It seems that separate rings could achieve the same goal, and make the
> >> architecture simpler.
> >>
> >> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> >> > When you add DC3, they'll get tokens (that aren't currently in use in any
> >> > existing DC). Either you assign tokens (let's pretend we manually assigned
> >> > the other ones, since DC2 = DC1 + 1), but cassandra can also auto-calculate
> >> > them, the exact behavior of which varies by version.
> >> >
> >> >
> >> > Let's pretend it's old style random assignment, and we end up with DC3
> >> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
> >> >
> >> > In this case:
> >> >
> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> >> > the hosts with token 10, 11, 17
> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> >> > would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22, 36
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
> >> >>
> >> >> What if I add a new DC3?
> >> >> The token ranges would be reshuffled into DC1, DC2, DC3?
> >> >>
> >> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> >> >> > Confirming again that it's definitely one ring.
> >> >> >
> >> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
> >> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
> >> >> >
> >> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> >> >> > the hosts with token 10, 11, 20
> >> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> >> >> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Is it a different answer? One ring?
> >> >> >>
> >> >> >> Could you explain your answer according to my example?
> >> >> >>
> >> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
> >> >> >> > There has always been a single ring.
> >> >> >> >
> >> >> >> > You can specify how many nodes in each DC you want and it’ll figure out how
> >> >> >> > to do it as long as you have the right snitch and are using
> >> >> >> > NetworkTopologyStrategy.
> >> >> >> >
> >> >> >> >
> >> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <luajit.io@gmail.com> wrote:
> >> >> >> >>
> >> >> >> >> Let me clarify my question:
> >> >> >> >>
> >> >> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, each node
> >> >> >> >> sets num_tokens as 50.
> >> >> >> >> Then how are token ranges distributed in the cluster?
> >> >> >> >>
> >> >> >> >> If there is one global ring, then it may be (to simplify the case, let's
> >> >> >> >> assume vnodes=1):
> >> >> >> >> {dc1, node1} 1-50
> >> >> >> >> {dc2, node1} 51-100
> >> >> >> >> {dc1, node1} 101-150
> >> >> >> >> {dc1, node2} 151-200
> >> >> >> >>
> >> >> >> >> But here come more questions:
> >> >> >> >> a) what if I add a new datacenter? Then the token ranges need to be
> >> >> >> >> re-balanced?
> >> >> >> >> If so, what about the data associated with the ranges to be balanced?
> >> >> >> >> move them among DCs?
> >> >> >> >> But that doesn't make sense, because each keyspace would specify its
> >> >> >> >> snitch and fix the DCs to store them.
> >> >> >> >>
> >> >> >> >> b) It seems there are no benefits from the same ring, because of the snitch.
> >> >> >> >>
> >> >> >> >> If each DC has its own ring, then it may be:
> >> >> >> >> {dc1, node1} 1-50
> >> >> >> >> {dc1, node1} 51-100
> >> >> >> >> {dc2, node1} 1-50
> >> >> >> >> {dc2, node1} 51-100
> >> >> >> >>
> >> >> >> >> I think this is not a trivial question, because each key would be
> >> >> >> >> hashed to determine the token it belongs to, and
> >> >> >> >> the token range distribution in turn determines which node the key belongs
> >> >> >> >> to.
> >> >> >> >>
> >> >> >> >> Any official answer?
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
> >> >> >> >> <ja...@genesys.com>:
> >> >> >> >> > Maybe I misunderstood something but from what I understand, each DC has
> >> >> >> >> > the same ring (0-100 in your example) but it's split differently between
> >> >> >> >> > nodes in each DC. I think it's the same principle if using vnode or not.
> >> >> >> >> >
> >> >> >> >> > I think the confusion comes from the fact that the ring range is the
> >> >> >> >> > same (0-100) but each DC manages it differently because nodes are
> >> >> >> >> > different.
> >> >> >> >> >
> >> >> >> >> > --
> >> >> >> >> > Jacques-Henri Berthemet
> >> >> >> >> >
> >> >> >> >> > -----Original Message-----
> >> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
> >> >> >> >> > To: user@cassandra.apache.org
> >> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
> >> >> >> >> >
> >> >> >> >> > Thanks for your reply. I also think separate rings are more reasonable.
> >> >> >> >> >
> >> >> >> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
> >> >> >> >> >
> >> >> >> >> > Check these references:
> >> >> >> >> >
> >> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
> >> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
> >> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
> >> >> >> >> >
> >> >> >> >> > Even the official Riak post said c* splits the ring across DCs:
> >> >> >> >> >
> >> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
> >> >> >> >> >
> >> >> >> >> > Why did they say each DC has its own ring?
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
> >> >> >> >> > <ja...@genesys.com>:
> >> >> >> >> >> Hi,
> >> >> >> >> >>
> >> >> >> >> >> Each DC has the whole ring, each DC contains a copy of the same data.
> >> >> >> >> >> When you add replication to a new DC, all data is copied to the new DC.
> >> >> >> >> >>
> >> >> >> >> >> Within a DC, each range of token is 'owned' by a (primary) node (and
> >> >> >> >> >> replicas if you have RF > 1). If you add/remove a node in a DC, tokens will
> >> >> >> >> >> be rearranged between all nodes within the DC only, the other DCs won't be
> >> >> >> >> >> affected.
> >> >> >> >> >>
> >> >> >> >> >> --
> >> >> >> >> >> Jacques-Henri Berthemet
> >> >> >> >> >>
> >> >> >> >> >> -----Original Message-----
> >> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
> >> >> >> >> >> To: user@cassandra.apache.org
> >> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
> >> >> >> >> >>
> >> >> >> >> >> Hi All,
> >> >> >> >> >>
> >> >> >> >> >> I know it seems a stupid question, but I am really confused about the
> >> >> >> >> >> documents on the internet related to this topic, especially it seems that it
> >> >> >> >> >> has different answers for c* with vnodes or not.
> >> >> >> >> >>
> >> >> >> >> >> Let's assume the token range is 1-100 for the whole cluster, how does
> >> >> >> >> >> it distributed into the datacenters? Think that the number of datacenters is
> >> >> >> >> >> dynamic in a cluster, if there is only one ring, then the token range would
> >> >> >> >> >> change on each node when I add a new datacenter into the cluster? Then it
> >> >> >> >> >> would involve data migration? It doesn't make sense.
> >> >> >> >> >>
> >> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

Even under one ring, when you write a key, you need to iterate the DC
list declared in the snitch.
And for each iteration, you need to skip token ranges not owned by that DC.
So I could not figure out the rationale for mixing DCs into one ring.

However, if each DC has its own ring, I could repeat SimpleStrategy
procedure on each DC, isn't that simpler?
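
The two procedures compared above can be sketched side by side in Python. This
is only a toy model, not Cassandra's implementation: it assumes one token per
host, ignores racks, and uses the DC1/DC2/DC3 example tokens quoted below in
this thread. The function names are hypothetical. One function walks the
single shared ring and skips hosts outside the target DC; the other treats
each DC as its own ring and does a SimpleStrategy-style walk.

```python
from bisect import bisect_left

# Tokens quoted in this thread (one token per host, racks ignored).
DC_TOKENS = {
    "DC1": [0, 10, 20, 30, 40, 50, 60, 70, 80],
    "DC2": [1, 11, 21, 31, 41, 51, 61, 71, 81],
    "DC3": [4, 17, 22, 36, 48, 53, 64, 73, 83],
}

def clockwise(tokens, key_token):
    """Return sorted tokens rotated to start at the first one >= key_token."""
    ordered = sorted(tokens)
    start = bisect_left(ordered, key_token) % len(ordered)
    return ordered[start:] + ordered[:start]

def one_ring_replicas(dc_tokens, key_token, dc, rf):
    """Walk the single shared ring, skipping hosts that are not in `dc`."""
    owner = {t: name for name, tokens in dc_tokens.items() for t in tokens}
    picked = []
    for token in clockwise(owner, key_token):
        if owner[token] == dc and len(picked) < rf:
            picked.append(token)
    return picked

def per_dc_ring_replicas(dc_tokens, key_token, dc, rf):
    """Treat `dc` as its own ring and do a SimpleStrategy-style walk on it."""
    return clockwise(dc_tokens[dc], key_token)[:rf]

for dc in DC_TOKENS:
    print(dc,
          one_ring_replicas(DC_TOKENS, 5, dc, 3),
          per_dc_ring_replicas(DC_TOKENS, 5, dc, 3))
```

In this simplified model both walks choose the same hosts for every DC, so the
"skip" step on one ring behaves like a per-DC walk over that DC's tokens. It
also shows that the DC1 and DC2 replicas for a key do not depend on whether
DC3's tokens are present in the ring. Rack awareness, which the real
NetworkTopologyStrategy takes into account, is left out here.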

2018-04-12 1:13 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> It's probably mostly a carry-over from dynamo paper.
>
> I'm skeptical of your claim that N token rings is somehow easier than 1,
> especially given the lack of familiarity with the codebase to know how many
> rings are involved.
>
>
>
> On Wed, Apr 11, 2018 at 10:07 AM, Jinhua Luo <lu...@gmail.com> wrote:
>>
>> What are the benefits of one ring?
>> It seems that separate rings could achieve the same goal, and make the
>> architecture simpler.
>>
>> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> > When you add DC3, they'll get tokens (that aren't currently in use in any
>> > existing DC). Either you assign tokens (let's pretend we manually assigned
>> > the other ones, since DC2 = DC1 + 1), but cassandra can also auto-calculate
>> > them, the exact behavior of which varies by version.
>> >
>> >
>> > Let's pretend it's old style random assignment, and we end up with DC3
>> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
>> >
>> > In this case:
>> >
>> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
>> > the hosts with token 10, 11, 17
>> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> > would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22, 36
>> >
>> >
>> >
>> >
>> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
>> >>
>> >> What if I add a new DC3?
>> >> The token ranges would be reshuffled into DC1, DC2, DC3?
>> >>
>> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
>> >> > Confirming again that it's definitely one ring.
>> >> >
>> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>> >> >
>> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
>> >> > the hosts with token 10, 11, 20
>> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>> >> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Is it a different answer? One ring?
>> >> >>
>> >> >> Could you explain your answer according to my example?
>> >> >>
>> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
>> >> >> > There has always been a single ring.
>> >> >> >
>> >> >> > You can specify how many nodes in each DC you want and it’ll figure out how
>> >> >> > to do it as long as you have the right snitch and are using
>> >> >> > NetworkTopologyStrategy.
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Let me clarify my question:
>> >> >> >>
>> >> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, each node
>> >> >> >> sets num_tokens as 50.
>> >> >> >> Then how are token ranges distributed in the cluster?
>> >> >> >>
>> >> >> >> If there is one global ring, then it may be (to simplify the case, let's
>> >> >> >> assume vnodes=1):
>> >> >> >> {dc1, node1} 1-50
>> >> >> >> {dc2, node1} 51-100
>> >> >> >> {dc1, node1} 101-150
>> >> >> >> {dc1, node2} 151-200
>> >> >> >>
>> >> >> >> But here come more questions:
>> >> >> >> a) what if I add a new datacenter? Then the token ranges need to be
>> >> >> >> re-balanced?
>> >> >> >> If so, what about the data associated with the ranges to be balanced?
>> >> >> >> move them among DCs?
>> >> >> >> But that doesn't make sense, because each keyspace would specify its
>> >> >> >> snitch and fix the DCs to store them.
>> >> >> >>
>> >> >> >> b) It seems there are no benefits from the same ring, because of the snitch.
>> >> >> >>
>> >> >> >> If each DC has its own ring, then it may be:
>> >> >> >> {dc1, node1} 1-50
>> >> >> >> {dc1, node1} 51-100
>> >> >> >> {dc2, node1} 1-50
>> >> >> >> {dc2, node1} 51-100
>> >> >> >>
>> >> >> >> I think this is not a trivial question, because each key would be
>> >> >> >> hashed to determine the token it belongs to, and
>> >> >> >> the token range distribution in turn determines which node the key belongs
>> >> >> >> to.
>> >> >> >>
>> >> >> >> Any official answer?
>> >> >> >>
>> >> >> >>
>> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> <ja...@genesys.com>:
>> >> >> >> > Maybe I misunderstood something but from what I understand, each DC has
>> >> >> >> > the same ring (0-100 in your example) but it's split differently between
>> >> >> >> > nodes in each DC. I think it's the same principle if using vnode or not.
>> >> >> >> >
>> >> >> >> > I think the confusion comes from the fact that the ring range is the
>> >> >> >> > same (0-100) but each DC manages it differently because nodes are
>> >> >> >> > different.
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Jacques-Henri Berthemet
>> >> >> >> >
>> >> >> >> > -----Original Message-----
>> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
>> >> >> >> > To: user@cassandra.apache.org
>> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >
>> >> >> >> > Thanks for your reply. I also think separate rings are more reasonable.
>> >> >> >> >
>> >> >> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
>> >> >> >> >
>> >> >> >> > Check these references:
>> >> >> >> >
>> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
>> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>> >> >> >> >
>> >> >> >> > Even the official Riak post said c* splits the ring across DCs:
>> >> >> >> >
>> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>> >> >> >> >
>> >> >> >> > Why did they say each DC has its own ring?
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>> >> >> >> > <ja...@genesys.com>:
>> >> >> >> >> Hi,
>> >> >> >> >>
>> >> >> >> >> Each DC has the whole ring, each DC contains a copy of the same data.
>> >> >> >> >> When you add replication to a new DC, all data is copied to the new DC.
>> >> >> >> >>
>> >> >> >> >> Within a DC, each range of token is 'owned' by a (primary) node (and
>> >> >> >> >> replicas if you have RF > 1). If you add/remove a node in a DC, tokens will
>> >> >> >> >> be rearranged between all nodes within the DC only, the other DCs won't be
>> >> >> >> >> affected.
>> >> >> >> >>
>> >> >> >> >> --
>> >> >> >> >> Jacques-Henri Berthemet
>> >> >> >> >>
>> >> >> >> >> -----Original Message-----
>> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
>> >> >> >> >> To: user@cassandra.apache.org
>> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
>> >> >> >> >>
>> >> >> >> >> Hi All,
>> >> >> >> >>
>> >> >> >> >> I know it seems a stupid question, but I am really confused about the
>> >> >> >> >> documents on the internet related to this topic, especially it seems that it
>> >> >> >> >> has different answers for c* with vnodes or not.
>> >> >> >> >>
>> >> >> >> >> Let's assume the token range is 1-100 for the whole cluster, how does
>> >> >> >> >> it distributed into the datacenters? Think that the number of datacenters is
>> >> >> >> >> dynamic in a cluster, if there is only one ring, then the token range would
>> >> >> >> >> change on each node when I add a new datacenter into the cluster? Then it
>> >> >> >> >> would involve data migration? It doesn't make sense.
>> >> >> >> >>
>> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jeff Jirsa <jj...@gmail.com>.

It's probably mostly a carry-over from the Dynamo paper.

I'm skeptical of your claim that N token rings is somehow easier than 1,
especially given the lack of familiarity with the codebase to know how many
rings are involved.



On Wed, Apr 11, 2018 at 10:07 AM, Jinhua Luo <lu...@gmail.com> wrote:

> What are the benefits of one ring?
> It seems that separate rings could achieve the same goal, and make the
> architecture simpler.
>
> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> > When you add DC3, they'll get tokens (that aren't currently in use in any
> > existing DC). Either you assign tokens (let's pretend we manually assigned
> > the other ones, since DC2 = DC1 + 1), but cassandra can also auto-calculate
> > them, the exact behavior of which varies by version.
> >
> >
> > Let's pretend it's old style random assignment, and we end up with DC3
> > having 4, 17, 22, 36, 48, 53, 64, 73, 83
> >
> > In this case:
> >
> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> > the hosts with token 10, 11, 17
> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> > would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22, 36
> >
> >
> >
> >
> > On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:
> >>
> >> What if I add a new DC3?
> >> The token ranges would be reshuffled into DC1, DC2, DC3?
> >>
> >> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> >> > Confirming again that it's definitely one ring.
> >> >
> >> > DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
> >> > DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
> >> >
> >> > If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> >> > the hosts with token 10, 11, 20
> >> > If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> >> > would be placed on the hosts with tokens 10,20,30 and 11, 21,31
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com> wrote:
> >> >>
> >> >> Is it a different answer? One ring?
> >> >>
> >> >> Could you explain your answer according to my example?
> >> >>
> >> >> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
> >> >> > There has always been a single ring.
> >> >> >
> >> >> > You can specify how many nodes in each DC you want and it’ll figure out how
> >> >> > to do it as long as you have the right snitch and are using
> >> >> > NetworkTopologyStrategy.
> >> >> >
> >> >> >
> >> >> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Let me clarify my question:
> >> >> >>
> >> >> >> Given we have a cluster of two DCs, each DC has 2 nodes, each node
> >> >> >> sets num_tokens as 50.
> >> >> >> Then how are token ranges distributed in the cluster?
> >> >> >>
> >> >> >> If there is one global ring, then it may be (to simplify the case, let's
> >> >> >> assume vnodes=1):
> >> >> >> {dc1, node1} 1-50
> >> >> >> {dc2, node1} 51-100
> >> >> >> {dc1, node1} 101-150
> >> >> >> {dc1, node2} 151-200
> >> >> >>
> >> >> >> But here come more questions:
> >> >> >> a) what if I add a new datacenter? Then the token ranges need to be
> >> >> >> re-balanced?
> >> >> >> If so, what about the data associated with the ranges to be balanced?
> >> >> >> move them among DCs?
> >> >> >> But that doesn't make sense, because each keyspace would specify its
> >> >> >> snitch and fix the DCs to store them.
> >> >> >>
> >> >> >> b) It seems there are no benefits from the same ring, because of the snitch.
> >> >> >>
> >> >> >> If each DC has its own ring, then it may be:
> >> >> >> {dc1, node1} 1-50
> >> >> >> {dc1, node1} 51-100
> >> >> >> {dc2, node1} 1-50
> >> >> >> {dc2, node1} 51-100
> >> >> >>
> >> >> >> I think this is not a trivial question, because each key would be
> >> >> >> hashed to determine the token it belongs to, and
> >> >> >> the token range distribution in turn determines which node the key belongs
> >> >> >> to.
> >> >> >>
> >> >> >> Any official answer?
> >> >> >>
> >> >> >>
> >> >> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
> >> >> >> <ja...@genesys.com>:
> >> >> >> > Maybe I misunderstood something but from what I understand, each DC has
> >> >> >> > the same ring (0-100 in your example) but it's split differently between
> >> >> >> > nodes in each DC. I think it's the same principle if using vnode or not.
> >> >> >> >
> >> >> >> > I think the confusion comes from the fact that the ring range is the
> >> >> >> > same (0-100) but each DC manages it differently because nodes are
> >> >> >> > different.
> >> >> >> >
> >> >> >> > --
> >> >> >> > Jacques-Henri Berthemet
> >> >> >> >
> >> >> >> > -----Original Message-----
> >> >> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> >> > Sent: Wednesday, April 11, 2018 2:26 PM
> >> >> >> > To: user@cassandra.apache.org
> >> >> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
> >> >> >> >
> >> >> >> > Thanks for your reply. I also think separate rings are more reasonable.
> >> >> >> >
> >> >> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
> >> >> >> >
> >> >> >> > Check these references:
> >> >> >> >
> >> >> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
> >> >> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
> >> >> >> > https://community.apigee.com/articles/13096/cassandra-token-distribution.html
> >> >> >> >
> >> >> >> > Even the official Riak post said c* splits the ring across DCs:
> >> >> >> >
> >> >> >> > http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
> >> >> >> >
> >> >> >> > Why did they say each DC has its own ring?
> >> >> >> >
> >> >> >> >
> >> >> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
> >> >> >> > <ja...@genesys.com>:
> >> >> >> >> Hi,
> >> >> >> >>
> >> >> >> >> Each DC has the whole ring; each DC contains a copy of the same
> >> >> >> >> data. When you add replication to a new DC, all data is copied
> >> >> >> >> to the new DC.
> >> >> >> >>
> >> >> >> >> Within a DC, each range of tokens is 'owned' by a (primary) node
> >> >> >> >> (and replicas if you have RF > 1). If you add/remove a node in a
> >> >> >> >> DC, tokens will be rearranged between all nodes within the DC
> >> >> >> >> only; the other DCs won't be affected.
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> Jacques-Henri Berthemet
> >> >> >> >>
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
> >> >> >> >> To: user@cassandra.apache.org
> >> >> >> >> Subject: does c* 3.0 use one ring for all datacenters?
> >> >> >> >>
> >> >> >> >> Hi All,
> >> >> >> >>
> >> >> >> >> I know it seems a stupid question, but I am really confused
> >> >> >> >> about the documents on the internet related to this topic,
> >> >> >> >> especially it seems that it has different answers for c* with
> >> >> >> >> vnodes or not.
> >> >> >> >>
> >> >> >> >> Let's assume the token range is 1-100 for the whole cluster, how
> >> >> >> >> does it distributed into the datacenters? Think that the number
> >> >> >> >> of datacenters is dynamic in a cluster, if there is only one
> >> >> >> >> ring, then the token range would change on each node when I add
> >> >> >> >> a new datacenter into the cluster? Then it would involve data
> >> >> >> >> migration? It doesn't make sense.
> >> >> >> >>
> >> >> >> >> Looking forward to clarification for c* 3.0, thanks!

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

What are the benefits of one ring?
It seems that separate rings could achieve the same goal and make the
architecture simpler.

2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> When you add DC3, its nodes get tokens that aren't currently in use in any
> existing DC. You can assign the tokens yourself (let's pretend we manually
> assigned the other ones, since DC2 = DC1 + 1), or Cassandra can
> auto-calculate them; the exact behavior varies by version.
>
>
> Let's pretend it's old style random assignment, and we end up with DC3
> having 4, 17, 22, 36, 48, 53, 64, 73, 83
>
> In this case:
>
> If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> the hosts with token 10, 11, 17
> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> would be placed on the hosts with tokens 10,20,30 ; 11, 21,31 ; 17, 22, 36

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jeff Jirsa <jj...@gmail.com>.

When you add DC3, its nodes get tokens that aren't currently in use in any
existing DC. You can assign the tokens yourself (let's pretend we manually
assigned the other ones, since DC2 = DC1 + 1), or Cassandra can
auto-calculate them; the exact behavior varies by version.


Let's pretend it's old style random assignment, and we end up with DC3
having 4, 17, 22, 36, 48, 53, 64, 73, 83

In this case:

If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
the hosts with tokens 10, 11, 17
If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
would be placed on the hosts with tokens 10, 20, 30; 11, 21, 31; 17, 22, 36
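
The effect of adding DC3 can be checked with a small Python sketch. This is only an illustration under simplifying assumptions, not Cassandra's actual code: racks are ignored, every token is treated as a separate host, the tokens are the toy values from this thread, and the helper names (build_ring, replicas) are made up for the example.

```python
from bisect import bisect_left

# Toy tokens from this thread; a real ring spans a far larger token range.
DC1 = list(range(0, 90, 10))                 # 0, 10, ..., 80
DC2 = [t + 1 for t in DC1]                   # 1, 11, ..., 81
DC3 = [4, 17, 22, 36, 48, 53, 64, 73, 83]    # "old style" random assignment


def build_ring(dc_tokens):
    """Merge every DC's tokens into one sorted ring of (token, dc) pairs."""
    return sorted((t, dc) for dc, tokens in dc_tokens.items() for t in tokens)


def replicas(ring, key_token, rf_per_dc):
    """Walk the single ring clockwise from the key's token and keep the
    first rf tokens seen for each DC (rack awareness is left out)."""
    start = bisect_left([t for t, _ in ring], key_token)
    placed = {dc: [] for dc in rf_per_dc}
    for i in range(len(ring)):
        token, dc = ring[(start + i) % len(ring)]
        if dc in placed and len(placed[dc]) < rf_per_dc[dc]:
            placed[dc].append(token)
    return placed


before = replicas(
    build_ring({"DC1": DC1, "DC2": DC2}),
    5,
    {"DC1": 3, "DC2": 3},
)
after = replicas(
    build_ring({"DC1": DC1, "DC2": DC2, "DC3": DC3}),
    5,
    {"DC1": 3, "DC2": 3, "DC3": 3},
)
print(before)
print(after)
```

In this sketch the DC1 and DC2 entries are the same before and after DC3 joins (10, 20, 30 and 11, 21, 31); DC3 only adds its own replicas (17, 22, 36), which matches the point that existing DCs are not re-balanced.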




On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <lu...@gmail.com> wrote:

> What if I add a new DC3?
> Would the token ranges be reshuffled across DC1, DC2, and DC3?

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

What if I add a new DC3?
Would the token ranges be reshuffled across DC1, DC2, and DC3?

2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jj...@gmail.com>:
> Confirming again that it's definitely one ring.
>
> DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
> DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>
> If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
> the hosts with token 10, 11, 20
> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
> would be placed on the hosts with tokens 10,20,30 and 11, 21,31

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jeff Jirsa <jj...@gmail.com>.

Confirming again that it's definitely one ring.

DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81

If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
the hosts with tokens 10, 11, 20
If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
would be placed on the hosts with tokens 10, 20, 30 and 11, 21, 31
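
A minimal Python sketch of the two placements described above, assuming each token belongs to a distinct host and ignoring racks. The function names are hypothetical and the logic is a simplification of what the real replication strategies do; it only mirrors the "walk the ring forward" idea on the toy tokens from this example.

```python
from bisect import bisect_left

# One ring holding both DCs' tokens, as (token, dc) pairs sorted by token.
RING = sorted(
    [(t, "DC1") for t in range(0, 90, 10)]     # 0, 10, ..., 80
    + [(t, "DC2") for t in range(1, 91, 10)]   # 1, 11, ..., 81
)


def walk(ring, key_token):
    """Yield (token, dc) pairs clockwise, starting at the first token
    greater than or equal to the key's token and wrapping around."""
    start = bisect_left([t for t, _ in ring], key_token)
    for i in range(len(ring)):
        yield ring[(start + i) % len(ring)]


def simple_strategy(ring, key_token, rf):
    """Take the next rf tokens on the ring, regardless of datacenter."""
    return [token for token, _ in walk(ring, key_token)][:rf]


def network_topology_strategy(ring, key_token, rf_per_dc):
    """Take the next rf tokens per datacenter while walking the same ring."""
    placed = {dc: [] for dc in rf_per_dc}
    for token, dc in walk(ring, key_token):
        if dc in placed and len(placed[dc]) < rf_per_dc[dc]:
            placed[dc].append(token)
    return placed


simple = simple_strategy(RING, 5, 3)
nts = network_topology_strategy(RING, 5, {"DC1": 3, "DC2": 3})
print(simple)
print(nts)
```

For a key with token 5 the sketch gives 10, 11, 20 for the SimpleStrategy-style walk and 10, 20, 30 plus 11, 21, 31 for the per-DC walk. Both walks run over the same single ring; only the rule for picking replicas differs.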





On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <lu...@gmail.com> wrote:

> Is it a different answer? One ring?
>
> Could you explain your answer according to my example?
>
> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jo...@jonhaddad.com>:
> > There has always been a single ring.
> >
> > You can specify how many nodes in each DC you want and it’ll figure out
> > how to do it as long as you have the right snitch and are using
> > NetworkTopologyStrategy.
> >
> >
> > On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <lu...@gmail.com> wrote:
> >>
> >> Let me clarify my question:
> >>
> >> Given a cluster of two DCs, where each DC has 2 nodes and each node
> >> sets num_tokens to 50, how are token ranges distributed in the cluster?
> >>
> >> If there is one global ring, then it may be (to simplify the case, let's
> >> assume vnodes=1):
> >> {dc1, node1} 1-50
> >> {dc2, node1} 51-100
> >> {dc1, node1} 101-150
> >> {dc1, node2} 151-200
> >>
> >> But this raises more questions:
> >> a) What if I add a new datacenter? Do the token ranges need to be
> >> re-balanced?
> >> If so, what about the data associated with the ranges being
> >> re-balanced? Is it moved among DCs?
> >> But that doesn't make sense, because each keyspace would specify its
> >> snitch and fix the DCs to store them in.
> >>
> >> b) There seem to be no benefits from a single ring, because of the snitch.
> >>
> >> If each DC has its own ring, then it may be:
> >> {dc1, node1} 1-50
> >> {dc1, node1} 51-100
> >> {dc2, node1} 1-50
> >> {dc2, node1} 51-100
> >>
> >> I think this is not a trivial question, because each key is hashed to
> >> determine the token it belongs to, and the token range distribution in
> >> turn determines which node the key belongs to.
> >>
> >> Any official answer?
> >>
> >>
> >> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
> >> <ja...@genesys.com>:
> >> > Maybe I misunderstood something, but from what I understand, each DC
> >> > has the same ring (0-100 in your example) but it's split differently
> >> > between nodes in each DC. I think it's the same principle whether
> >> > using vnodes or not.
> >> >
> >> > I think the confusion comes from the fact that the ring range is the
> >> > same (0-100) but each DC manages it differently because the nodes are
> >> > different.
> >> >
> >> > --
> >> > Jacques-Henri Berthemet
> >> >
> >> > -----Original Message-----
> >> > From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> > Sent: Wednesday, April 11, 2018 2:26 PM
> >> > To: user@cassandra.apache.org
> >> > Subject: Re: does c* 3.0 use one ring for all datacenters?
> >> >
> >> > Thanks for your reply. I also think separate rings are more
> reasonable.
> >> >
> >> > So one ring for one dc is only for c* 1.x or 2.x without vnode?
> >> >
> >> > Check these references:
> >> >
> >> >
> >> > https://docs.datastax.com/en/archived/cassandra/1.1/docs/
> initialize/token_generation.html
> >> > http://www.luketillman.com/one-token-ring-to-rule-them-all/
> >> >
> >> > https://community.apigee.com/articles/13096/cassandra-
> token-distribution.html
> >> >
> >> > Even the riak official said c* splits the ring across dc:
> >> >
> >> > http://basho.com/posts/business/riak-vs-cassandra-an-
> updated-brief-comparison/
> >> >
> >> > Why they said each dc has its own ring?
> >> >
> >> >
> >> > 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
> >> > <ja...@genesys.com>:
> >> >> Hi,
> >> >>
> >> >> Each DC has the whole ring, each DC contains a copy of the same data.
> >> >> When you add replication to a new DC, all data is copied to the new
> DC.
> >> >>
> >> >> Within a DC, each range of token is 'owned' by a (primary) node (and
> >> >> replicas if you have RF > 1). If you add/remove a node in a DC,
> tokens will
> >> >> be rearranged between all nodes within the DC only, the other DCs
> won't be
> >> >> affected.
> >> >>
> >> >> --
> >> >> Jacques-Henri Berthemet
> >> >>
> >> >> -----Original Message-----
> >> >> From: Jinhua Luo [mailto:luajit.io@gmail.com]
> >> >> Sent: Wednesday, April 11, 2018 12:35 PM
> >> >> To: user@cassandra.apache.org
> >> >> Subject: does c* 3.0 use one ring for all datacenters?
> >> >>
> >> >> Hi All,
> >> >>
> >> >> I know it seems a stupid question, but I am really confused about the
> >> >> documents on the internet related to this topic, especially it seems
> that it
> >> >> has different answers for c* with vnodes or not.
> >> >>
> >> >> Let's assume the token range is 1-100 for the whole cluster, how does
> >> >> it distributed into the datacenters? Think that the number of
> datacenters is
> >> >> dynamic in a cluster, if there is only one ring, then the token
> range would
> >> >> change on each node when I add a new datacenter into the cluster?
> Then it
> >> >> would involve data migration? It doesn't make sense.
> >> >>
> >> >> Looking forward to clarification for c* 3.0, thanks!
> >> >>
> >> >> ------------------------------------------------------------
> ---------
> >> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> >> >> For additional commands, e-mail: user-help@cassandra.apache.org
> >> >>
> >> >>
> >> >> ------------------------------------------------------------
> ---------
> >> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> >> >> For additional commands, e-mail: user-help@cassandra.apache.org
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> >> > For additional commands, e-mail: user-help@cassandra.apache.org
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> >> For additional commands, e-mail: user-help@cassandra.apache.org
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: user-help@cassandra.apache.org
>
>

Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

Is it a different answer? One ring?

Could you explain your answer according to my example?


Re: does c* 3.0 use one ring for all datacenters?

Posted by Jonathan Haddad <jo...@jonhaddad.com>.

There has always been a single ring.

You can specify how many replicas you want in each DC and it’ll figure out how
to place them, as long as you have the right snitch and are using
NetworkTopologyStrategy.
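
To make the single-ring idea concrete, here is a rough Python sketch of how per-DC replica selection on one shared ring can be pictured. It is only an illustration under stated assumptions: the tokens, node names and the `replicas` helper are made up, racks are ignored, and this is not Cassandra's actual code.

```python
import bisect

# Hypothetical single ring shared by two datacenters: (token, dc, node).
# Tokens from both DCs are interleaved on the same ring.
RING = sorted([
    (10, "dc1", "a"), (60, "dc1", "b"),
    (35, "dc2", "x"), (85, "dc2", "y"),
])


def replicas(token, rf, ring=RING):
    """Walk the ring clockwise from `token` and collect nodes until each
    DC listed in `rf` has its requested number of replicas.

    Simplified assumption: racks are ignored, and a DC with fewer nodes
    than requested simply ends up with fewer replicas.
    """
    tokens = [t for t, _, _ in ring]
    start = bisect.bisect_left(tokens, token)  # first token >= the key's token
    chosen = {dc: [] for dc in rf}
    for step in range(len(ring)):
        _, dc, node = ring[(start + step) % len(ring)]  # wrap around the ring
        if dc in chosen and len(chosen[dc]) < rf[dc] and node not in chosen[dc]:
            chosen[dc].append(node)
    return chosen


# One replica per DC for a key whose token is 20.
print(replicas(20, {"dc1": 1, "dc2": 1}))
```

In this sketch every key ends up with replicas in both DCs even though there is only one ring, because the walk skips nodes belonging to a DC that already has enough replicas.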



Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

Let me clarify my question:

Given we have a cluster of two DCs, each DC has 2 nodes, and each node
sets num_tokens to 50.
Then how are token ranges distributed in the cluster?

If there is one global ring, then it may be (to simplify the case, let's
assume vnodes=1):
{dc1, node1} 1-50
{dc2, node1} 51-100
{dc1, node1} 101-150
{dc1, node2} 151-200

But this raises more questions:
a) What if I add a new datacenter? Do the token ranges then need to be
re-balanced?
If so, what about the data associated with the re-balanced ranges?
Is it moved among DCs?
But that doesn't make sense, because each keyspace specifies its
snitch and fixes the DCs in which its data is stored.

b) There seems to be no benefit from a shared ring, because of the snitch.

If each DC has its own ring, then it may be:
{dc1, node1} 1-50
{dc1, node1} 51-100
{dc2, node1} 1-50
{dc2, node1} 51-100

I think this is not a trivial question, because each key is
hashed to determine the token it belongs to, and
the token range distribution in turn determines which node the key belongs to.
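
The lookup I have in mind can be sketched roughly as follows in Python. This is only a toy model under my own assumptions: the 1..200 token space, the ring layout, and the `toy_token` and `owner` helpers are hypothetical, and MD5 is just a stand-in for the real partitioner.

```python
import bisect
import hashlib

# Hypothetical toy ring: (token, datacenter, node), sorted by token.
# Each node owns the range that ends at its token (inclusive).
RING = [
    (50, "dc1", "node1"),
    (100, "dc2", "node1"),
    (150, "dc2", "node2"),
    (200, "dc1", "node2"),
]
TOKENS = [t for t, _, _ in RING]


def toy_token(key: str) -> int:
    """Hash a partition key onto the toy token space 1..200.

    MD5 is only a stand-in here; it is not the partitioner Cassandra uses.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 200 + 1


def owner(token: int):
    """Return the (dc, node) whose token is the first one >= `token`."""
    i = bisect.bisect_left(TOKENS, token) % len(RING)  # wrap past the last token
    _, dc, node = RING[i]
    return dc, node


# Which node does a given key land on in this toy ring?
print(owner(toy_token("user:42")))
```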

Any official answer?



RE: does c* 3.0 use one ring for all datacenters?

Posted by Jacques-Henri Berthemet <ja...@genesys.com>.

Maybe I misunderstood something, but from what I understand, each DC has the same ring (0-100 in your example), but it's split differently between the nodes in each DC. I think it's the same principle whether you use vnodes or not.

I think the confusion comes from the fact that the ring range is the same (0-100) but each DC manages it differently because nodes are different.

--
Jacques-Henri Berthemet


Re: does c* 3.0 use one ring for all datacenters?

Posted by Jinhua Luo <lu...@gmail.com>.

Thanks for your reply. I also think separate rings are more reasonable.

So one ring for one dc is only for c* 1.x or 2.x without vnode?

Check these references:

https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
http://www.luketillman.com/one-token-ring-to-rule-them-all/
https://community.apigee.com/articles/13096/cassandra-token-distribution.html

Even the official Riak site says c* splits the ring across DCs:
http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/

Why do they say each DC has its own ring?



RE: does c* 3.0 use one ring for all datacenters?

Posted by Jacques-Henri Berthemet <ja...@genesys.com>.

Hi,

Each DC has the whole ring; each DC contains a copy of the same data. When you add replication to a new DC, all data is copied to the new DC.

Within a DC, each token range is 'owned' by a (primary) node (and by replicas if you have RF > 1). If you add/remove a node in a DC, tokens will be rearranged among the nodes within that DC only; the other DCs won't be affected.
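
As a rough illustration of that last point, here is a small Python sketch. It assumes RF=1 per DC, ignores racks, and uses made-up tokens and node names on a toy 1..100 token space; `dc_replica` is a hypothetical helper, not a Cassandra API.

```python
def dc_replica(ring, dc, token):
    """First node of `dc` found walking clockwise from `token`.

    Simplified assumption: RF=1 per DC and no rack awareness.
    """
    nodes = sorted((t, n) for t, d, n in ring if d == dc)
    for t, n in nodes:
        if t >= token:
            return n
    return nodes[0][1]  # wrap around to the lowest token in this DC


# Hypothetical layouts: (token, dc, node).
before = [(10, "dc1", "a"), (60, "dc1", "b"), (35, "dc2", "x"), (85, "dc2", "y")]
after_new_dc = before + [(20, "dc3", "p"), (70, "dc3", "q")]  # add a whole DC
after_new_node = before + [(40, "dc1", "c")]                  # add one node to dc1

tokens = range(1, 101)

# dc2's replica for every token is the same in all three layouts.
dc2_unchanged = all(
    dc_replica(before, "dc2", t)
    == dc_replica(after_new_dc, "dc2", t)
    == dc_replica(after_new_node, "dc2", t)
    for t in tokens
)

# Only dc1 sees ownership move when a node joins dc1.
dc1_moved = [
    t for t in tokens
    if dc_replica(before, "dc1", t) != dc_replica(after_new_node, "dc1", t)
]

print(dc2_unchanged, dc1_moved[0], dc1_moved[-1])
```

In this toy model, adding a DC or adding a node to dc1 never changes which dc2 node holds a given token; only the tokens between the new dc1 node and its predecessor in dc1 change hands.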

--
Jacques-Henri Berthemet
