Posted to user@cassandra.apache.org by Sergio Bilello <la...@gmail.com> on 2019/10/23 08:06:09 UTC

Cassandra Rack - Datacenter Load Balancing relations

Hello guys!
I was reading https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
I would like to understand a concept related to node load balancing.
I know that Jon recommends vnodes = 4, but right now I found a cluster with vnodes = 256, replication factor = 3, and 2 racks. This is unbalanced because the number of racks is not a multiple of the replication factor.
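If I understand correctly, that recommendation is the num_tokens setting in cassandra.yaml, e.g.:

    # cassandra.yaml -- 256 was the long-time default; a low value like 4
    # should be paired with allocate_tokens_for_keyspace so token
    # allocation stays balanced
    num_tokens: 4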
However, my plan is to move all the nodes into a single rack so that I can eventually scale the cluster up and down one node at a time.
If I had 3 racks and wanted to keep things balanced, I would have to scale up 3 nodes at a time, one for each rack.
If I had 3 racks, should I also have 3 different datacenters, one datacenter for each rack?
Can I have 2 datacenters and 3 racks? If so, would one datacenter have more nodes than the others? Could that be a problem?
I am thinking of splitting my cluster into one datacenter for reads and one for writes, keeping all the nodes in the same rack so I can scale up one node at a time.
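For example (just a sketch, keyspace and DC names illustrative), a keyspace replicated into both DCs would be:

    CREATE KEYSPACE my_ks WITH replication =
        {'class': 'NetworkTopologyStrategy', 'read': 3, 'write': 3};

    -- writes are applied in both DCs regardless of which one the client
    -- connects to; only read traffic can be pinned to a single DC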

Please correct me if I am wrong.

Thanks,

Sergio



Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
Thanks, Jon!

I just added the AZ for each rack in the right-hand column.
Thanks anyway for your reply and clarification.
Maybe I should have named the racks RACK-READ and RACK-WRITE instead of
ONE and TWO to avoid confusion.

Which is more fault-tolerant with RF = 3:

A) spreading each DC across 3 AZs
B) assigning each DC its own separate AZ

I assume that I should adjust the consistency level accordingly in case of
failures: if I have 3 nodes and 1 goes down with RF = 3 and LOCAL_QUORUM
consistency, I should downgrade to LOCAL_ONE if I want to keep serving
read traffic.
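Working it out: a local quorum for RF = 3 is floor(3/2) + 1 = 2 replicas,
so LOCAL_QUORUM should actually survive a single replica being down; the
downgrade only becomes necessary once 2 of the 3 are gone. In cqlsh the
session-level switch would be:

    CONSISTENCY LOCAL_QUORUM;  -- needs 2 of the 3 replicas in the local DC
    CONSISTENCY LOCAL_ONE;     -- needs only 1, keeps reads flowing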

Best,

Sergio





On Wed, Oct 23, 2019 at 2:12 PM Jon Haddad <jo...@jonhaddad.com>
wrote:

> Oh, my bad.  There was a flood of information there, I didn't realize you
> had switched to two DCs.  It's been a long day.
>
> I'll be honest, it's really hard to read your various options as you've
> intermixed terminology from AWS and Cassandra in a weird way and there's
> several pages of information here to go through.  I don't have time to
> decipher it, sorry.
>
> Spread a DC across 3 AZs if you want to be fault tolerant and will use
> RF=3; use a single AZ if you don't care about losing a full DC in the
> case of an AZ failure, or you're not using RF=3.
>
>
> On Wed, Oct 23, 2019 at 4:56 PM Sergio <la...@gmail.com> wrote:
>
>> OPTION C or OPTION A?
>>
>> Which one are you referring to?
>>
>> Both have separate DCs to keep the workload separate.
>>
>>    OPTION A)
>>
>>    Node  DC     RACK  AZ
>>    1     read   ONE   us-east-1a
>>    2     read   ONE   us-east-1a
>>    3     read   ONE   us-east-1a
>>    4     write  TWO   us-east-1b
>>    5     write  TWO   us-east-1b
>>    6     write  TWO   us-east-1b
>>
>>
>> Here we have 2 DCs, read and write:
>> One Rack per DC
>> One Availability Zone per DC
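>>
>> For node 1 that would be, assuming GossipingPropertyFileSnitch (values
>> illustrative):
>>
>>     # conf/cassandra-rackdc.properties on node 1
>>     dc=read
>>     rack=ONE
>>
>> and dc=write / rack=TWO on nodes 4-6.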
>>
>> Thanks,
>>
>> Sergio
>>
>>
>> On Wed, Oct 23, 2019, 1:11 PM Jon Haddad <jo...@jonhaddad.com> wrote:
>>
>>> Personally, I wouldn't ever do this.  I recommend separate DCs if you
>>> want to keep workloads separate.
>>>
>>> On Wed, Oct 23, 2019 at 4:06 PM Sergio <la...@gmail.com>
>>> wrote:
>>>
>>>>           I forgot to comment for
>>>>
>>>>    OPTION C)
>>>>
>>>>    Node  DC     RACK  AZ
>>>>    1     read   ONE   us-east-1a
>>>>    2     read   ONE   us-east-1b
>>>>    3     read   ONE   us-east-1c
>>>>    4     write  TWO   us-east-1a
>>>>    5     write  TWO   us-east-1b
>>>>    6     write  TWO   us-east-1c
>>>>
>>>>    I would expect that I need to decrease the Consistency Level for
>>>>    reads if one of the AZs goes down. Please consider the one below as
>>>>    the real OPTION A; the previous one looks wrong because the same
>>>>    rack is assigned to 2 different DCs.
>>>>
>>>>    OPTION A)
>>>>
>>>>    Node  DC     RACK  AZ
>>>>    1     read   ONE   us-east-1a
>>>>    2     read   ONE   us-east-1a
>>>>    3     read   ONE   us-east-1a
>>>>    4     write  TWO   us-east-1b
>>>>    5     write  TWO   us-east-1b
>>>>    6     write  TWO   us-east-1b
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Sergio
>>>>
>>>> On Wed, Oct 23, 2019 at 12:33 PM Sergio <
>>>> lapostadisergio@gmail.com> wrote:
>>>>
>>>>> Hi Reid,
>>>>>
>>>>> Thank you very much for clearing up these concepts for me.
>>>>> https://community.datastax.com/comments/1133/view.html I posted this
>>>>> question on the DataStax forum about our unbalanced cluster, and the
>>>>> reply was that the *number of racks should be a multiple of the
>>>>> replication factor* (or 1) in order to be balanced. I thought then
>>>>> that if I have 3 availability zones I should have 3 racks for each
>>>>> datacenter, not the 2 (us-east-1a, us-east-1b) I have right now, or,
>>>>> in the simplest setup, one rack per datacenter.
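>>>>>
>>>>> A quick worked example of why: with RF = 3 and only 2 racks,
>>>>> NetworkTopologyStrategy still has to place 3 replicas, so for any
>>>>> given token range one rack ends up holding 2 of the 3 copies; with 3
>>>>> racks (or 1) every rack carries an equal share.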
>>>>>
>>>>>
>>>>>
>>>>>    Datacenter: live
>>>>>    ================
>>>>>    Status=Up/Down
>>>>>    |/ State=Normal/Leaving/Joining/Moving
>>>>>    --  Address      Load        Tokens  Owns  Host ID                               Rack
>>>>>    UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>>>>>    UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>>>>>    UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>>>>>    UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>>>>>    UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>>>>>    DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>>>>>    UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>>>>>    UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
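>>>>>
>>>>>    (Side note: "nodetool status <keyspace>" makes the Owns column
>>>>>    show effective ownership percentages instead of "?", which makes
>>>>>    the imbalance measurable per node.)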
>>>>>    So each rack label right now matches the availability zone, and
>>>>>    we have 3 datacenters and 2 availability zones with 2 racks per
>>>>>    DC, but the above is clearly unbalanced.
>>>>>    If I have a keyspace with replication factor = 3 and I want to
>>>>>    minimize the number of nodes needed to scale the cluster up and
>>>>>    down while keeping it balanced, should I consider an approach like
>>>>>    one of these?
>>>>>
>>>>>    OPTION A)
>>>>>
>>>>>    Node  DC     RACK  AZ
>>>>>    1     read   ONE   us-east-1a
>>>>>    2     read   ONE   us-east-1a
>>>>>    3     read   ONE   us-east-1a
>>>>>    4     write  ONE   us-east-1b
>>>>>    5     write  ONE   us-east-1b
>>>>>    6     write  ONE   us-east-1b
>>>>>
>>>>>    OPTION B)
>>>>>
>>>>>    Node  DC     RACK  AZ
>>>>>    1     read   ONE   us-east-1a
>>>>>    2     read   ONE   us-east-1a
>>>>>    3     read   ONE   us-east-1a
>>>>>    4     write  TWO   us-east-1b
>>>>>    5     write  TWO   us-east-1b
>>>>>    6     write  TWO   us-east-1b
>>>>>    7     read   ONE   us-east-1c
>>>>>    8     write  TWO   us-east-1c
>>>>>    9     read   ONE   us-east-1c
>>>>>
>>>>>    Option B looks unbalanced and I would exclude it.
>>>>>
>>>>>    OPTION C)
>>>>>
>>>>>    Node  DC     RACK  AZ
>>>>>    1     read   ONE   us-east-1a
>>>>>    2     read   ONE   us-east-1b
>>>>>    3     read   ONE   us-east-1c
>>>>>    4     write  TWO   us-east-1a
>>>>>    5     write  TWO   us-east-1b
>>>>>    6     write  TWO   us-east-1c
>>>>>
>>>>>
>>>>>    So I am thinking of A if I have the restriction of 2 AZs, but I
>>>>>    guess that option C would be the best. If I have to add another DC
>>>>>    for reads, because we want to assign a new DC to each new
>>>>>    microservice, it would look like:
>>>>>
>>>>>    OPTION EXTRA DC FOR READS)
>>>>>
>>>>>    Node  DC          RACK   AZ
>>>>>    1     read        ONE    us-east-1a
>>>>>    2     read        ONE    us-east-1b
>>>>>    3     read        ONE    us-east-1c
>>>>>    4     write       TWO    us-east-1a
>>>>>    5     write       TWO    us-east-1b
>>>>>    6     write       TWO    us-east-1c
>>>>>    7     extra-read  THREE  us-east-1a
>>>>>    8     extra-read  THREE  us-east-1b
>>>>>    9     extra-read  THREE  us-east-1c
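>>>>>
>>>>>    Adding that third DC would be roughly (keyspace name illustrative):
>>>>>
>>>>>        ALTER KEYSPACE my_ks WITH replication =
>>>>>            {'class': 'NetworkTopologyStrategy',
>>>>>             'read': 3, 'write': 3, 'extra-read': 3};
>>>>>
>>>>>    followed by "nodetool rebuild -- write" on each new node to stream
>>>>>    the existing data over from the write DC.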
>>>>>    The *write* DC will replicate the data to the other datacenters.
>>>>>    My goal is to keep the *read* machines dedicated to serving reads
>>>>>    and the *write* machines to serving writes; Cassandra will handle
>>>>>    the replication for me. Is there any other option I am missing, or
>>>>>    a wrong assumption? I am thinking of writing a blog post about all
>>>>>    my learnings so far. Thank you very much for the replies. Best,
>>>>>    Sergio
>>>>>
>>>>>
>>>>> On Wed, Oct 23, 2019 at 10:57 AM Reid Pinchback <
>>>>> rpinchback@tripadvisor.com> wrote:
>>>>>
>>>>>> No, that’s not correct.  The point of racks is to help you distribute
>>>>>> the replicas, not further-replicate the replicas.  Data centers are what do
>>>>>> the latter.  So for example, if you wanted to be able to ensure that you
>>>>>> always had quorum if an AZ went down, then you could have two DCs where one
>>>>>> was in each AZ, and use one rack in each DC.  In your situation I think I’d
>>>>>> be more tempted to consider that.  Then if an AZ went away, you could fail
>>>>>> over your traffic to the remaining DC and still be perfectly fine.
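>>>>>>
>>>>>> A sketch of why that works: DC-local consistency levels never wait
>>>>>> on the other AZ, e.g. in cqlsh:
>>>>>>
>>>>>>     CONSISTENCY LOCAL_QUORUM;  -- quorum computed within the client's
>>>>>>                                -- DC only, so losing the other AZ/DC
>>>>>>                                -- does not block reads or writes
>>>>>>
>>>>>> so failing over is just repointing clients at the surviving DC.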
>>>>>>
>>>>>>
>>>>>>
>>>>>> For background on replicas vs racks, I believe the information you
>>>>>> want is under the heading ‘NetworkTopologyStrategy’ at:
>>>>>>
>>>>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>>>>
>>>>>>
>>>>>>
>>>>>> That should help you better understand how replicas distribute.
>>>>>>
>>>>>>
>>>>>>
>>>>>> As mentioned before, while you can choose to do the reads in one DC,
>>>>>> except for concerns about contention related to network traffic and
>>>>>> connection handling, you can’t isolate reads from writes.  You can
>>>>>> *mostly* insulate the write DC from the activity within the read
>>>>>> DC, and even that isn’t an absolute because of repairs.  However, your
>>>>>> mileage may vary, so do what makes sense for your usage pattern.
>>>>>>
>>>>>>
>>>>>>
>>>>>> R
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From: *Sergio <la...@gmail.com>
>>>>>> *Reply-To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>>>> *Date: *Wednesday, October 23, 2019 at 12:50 PM
>>>>>> *To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>>>> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Reid,
>>>>>>
>>>>>> Thanks for your reply. I really appreciate your explanation.
>>>>>>
>>>>>> We are in AWS, and right now we are using 2 availability zones, not
>>>>>> 3. We found our cluster really unbalanced because the keyspace has
>>>>>> replication factor = 3 and the number of racks is 2, with 2
>>>>>> datacenters.
>>>>>> We want the writes spread across all the nodes, but we want the reads
>>>>>> isolated from the writes, to keep the load on those nodes low and to
>>>>>> be able to identify problems in the consumer (read) or producer
>>>>>> (write) applications.
>>>>>> It looks like each rack contains an entire copy of the data, so this
>>>>>> would replicate the information once per rack and then per node. If I
>>>>>> am correct, a keyspace with 100 GB, replication factor = 3, and
>>>>>> racks = 3 => 100 * 3 * 3 = 900 GB.
>>>>>> If I had only one rack across 2 or even 3 availability zones, I would
>>>>>> save space and have only 300 GB. Please correct me if I am wrong.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Sergio
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 23, 2019 at 9:21 AM Reid Pinchback <
>>>>>> rpinchback@tripadvisor.com> wrote:
>>>>>>
>>>>>> Datacenters and racks are different concepts.  While they don't have
>>>>>> to be associated with their historical meanings, the historical meanings
>>>>>> probably provide a helpful model for understanding what you want from them.
>>>>>>
>>>>>> When companies own their own physical servers and have them housed
>>>>>> somewhere, the questions arise on where you want to locate any particular
>>>>>> server.  It's a balancing act on things like network speed of related
>>>>>> servers being able to talk to each other, versus fault-tolerance of having
>>>>>> many servers not all exposed to the same risks.
>>>>>>
>>>>>> "Same rack" in that physical world tended to mean something like "all
>>>>>> behind the same network switch and all sharing the same power bus".  The
>>>>>> morning after an electrical glitch fries a power bus and thus everything in
>>>>>> that rack, you realize you wished you didn't have so many of the same type
>>>>>> of server together.  Well, they were servers.  Now they are door stops.
>>>>>> Badness and sadness.
>>>>>>
>>>>>> That's kind of the mindset to have in mind with racks in Cassandra.
>>>>>> It's an artifact for you to separate servers into pools so that the
>>>>>> disparate pools have hopefully somewhat independent infrastructure risks.
>>>>>> However, all those servers are still doing the same kind of work, are the
>>>>>> same version, etc.
>>>>>>
>>>>>> Datacenters are amalgams of those racks, and how similar or different
>>>>>> they are from each other depends on what you want to do with them.  What is
>>>>>> true is that if you have N datacenters, each one of them must have enough
>>>>>> disk storage to house all the data.  The actual physical footprint of that
>>>>>> data in each DC depends on the replication factors in play.
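>>>>>>
>>>>>> For example (keyspace and DC names illustrative), a keyspace defined
>>>>>> as
>>>>>>
>>>>>>     CREATE KEYSPACE app_ks WITH replication =
>>>>>>         {'class': 'NetworkTopologyStrategy', 'read': 3, 'write': 3};
>>>>>>
>>>>>> keeps 3 full copies in each DC, so 100 GB of raw data costs roughly
>>>>>> 300 GB per DC, however many racks each DC is split into.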
>>>>>>
>>>>>> Note that you sorta can't have "one datacenter for writes" because
>>>>>> the writes will replicate across the data centers.  You could definitely
>>>>>> choose to have only one that takes read queries, but best to think of
>>>>>> writing as being universal.  One scenario you can have is where the DC not
>>>>>> taking live traffic read queries is the one you use for maintenance or
>>>>>> performance testing or version upgrades.
>>>>>>
>>>>>> One rack makes your life easier if you don't have a reason for
>>>>>> multiple racks. It depends on the environment you deploy into and your
>>>>>> fault tolerance goals.  If you were in AWS and wanting to spread risk
>>>>>> across availability zones, then you would likely have as many racks as AZs
>>>>>> you choose to be in, because that's really the point of using multiple AZs.
>>>>>>
>>>>>> R
>>>>>>
>>>>>>
>>>>>> On 10/23/19, 4:06 AM, "Sergio Bilello" <la...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>     Hello guys!
>>>>>>
>>>>>>     I was reading
>>>>>> https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
>>>>>>
>>>>>>     I would like to understand a concept related to node load
>>>>>> balancing.
>>>>>>
>>>>>>     I know that Jon recommends vnodes = 4, but right now I found a
>>>>>> cluster with vnodes = 256, replication factor = 3, and 2 racks. This
>>>>>> is unbalanced because the number of racks is not a multiple of the
>>>>>> replication factor.
>>>>>>
>>>>>>     However, my plan is to move all the nodes into a single rack so
>>>>>> that I can eventually scale the cluster up and down one node at a
>>>>>> time.
>>>>>>
>>>>>>     If I had 3 racks and wanted to keep things balanced, I would have
>>>>>> to scale up 3 nodes at a time, one for each rack.
>>>>>>
>>>>>>     If I had 3 racks, should I also have 3 different datacenters, one
>>>>>> datacenter for each rack?
>>>>>>
>>>>>>     Can I have 2 datacenters and 3 racks? If so, would one datacenter
>>>>>> have more nodes than the others? Could that be a problem?
>>>>>>
>>>>>>     I am thinking of splitting my cluster into one datacenter for
>>>>>> reads and one for writes, keeping all the nodes in the same rack so I
>>>>>> can scale up one node at a time.
>>>>>>
>>>>>>     Please correct me if I am wrong.
>>>>>>
>>>>>>     Thanks,
>>>>>>
>>>>>>     Sergio

Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Jon Haddad <jo...@jonhaddad.com>.
Oh, my bad.  There was a flood of information there, I didn't realize you
had switched to two DCs.  It's been a long day.

I'll be honest, it's really hard to read your various options as you've
intermixed terminology from AWS and Cassandra in a weird way and there's
several pages of information here to go through.  I don't have time to
decipher it, sorry.

Spread a DC across 3 AZs if you want to be fault tolerant and will use
RF=3, use a single AZ if you don't care about full DC failure in the case
of an AZ failure or you're not using RF=3.


On Wed, Oct 23, 2019 at 4:56 PM Sergio <la...@gmail.com> wrote:

> OPTION C or OPTION A?
>
> Which one are you referring to?
>
> Both have separate DCs to keep the workload separate.
>
>    - OPTION A)
>    - Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>    - 3 read ONE us-east-1a
>    - 4 write TWO us-east-1b 5 write TWO us-east-1b
>    - 6 write TWO us-east-1b
>
>
> Here we have 2 DC read and write
> One Rack per DC
> One Availability Zone per DC
>
> Thanks,
>
> Sergio
>
>
> On Wed, Oct 23, 2019, 1:11 PM Jon Haddad <jo...@jonhaddad.com> wrote:
>
>> Personally, I wouldn't ever do this.  I recommend separate DCs if you
>> want to keep workloads separate.
>>
>> On Wed, Oct 23, 2019 at 4:06 PM Sergio <la...@gmail.com> wrote:
>>
>>>           I forgot to comment for
>>>
>>>    OPTION C)
>>>    1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>>    2. 3 read ONE us-east-1c
>>>    3. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>>    4. 6 write TWO us-east-1c I would expect that I need to decrease the
>>>    Consistency Level in the reads if one of the AZ goes down. Please consider
>>>    the below one as the real OPTION A. The previous one looks to be wrong
>>>    because the same rack is assigned to 2 different DC.
>>>    5. OPTION A)
>>>    6. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>>    7. 3 read ONE us-east-1a
>>>    8. 4 write TWO us-east-1b 5 write TWO us-east-1b
>>>    9. 6 write TWO us-east-1b
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Sergio
>>>
>>> Il giorno mer 23 ott 2019 alle ore 12:33 Sergio <
>>> lapostadisergio@gmail.com> ha scritto:
>>>
>>>> Hi Reid,
>>>>
>>>> Thank you very much for clearing these concepts for me.
>>>> https://community.datastax.com/comments/1133/view.html I posted this
>>>> question on the datastax forum regarding our cluster that it is unbalanced
>>>> and the reply was related that the *number of racks should be a
>>>> multiplier of the replication factor *in order to be balanced or 1. I
>>>> thought then if I have 3 availability zones I should have 3 racks for each
>>>> datacenter and not 2 (us-east-1b, us-east-1a) as I have right now or in the
>>>> easiest way, I should have a rack for each datacenter.
>>>>
>>>>
>>>>
>>>>    1. Datacenter: live
>>>>    ================
>>>>    Status=Up/Down
>>>>    |/ State=Normal/Leaving/Joining/Moving
>>>>    --  Address      Load       Tokens       Owns    Host ID
>>>>                        Rack
>>>>    UN  10.1.20.49   289.75 GiB  256          ?
>>>>    be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>>>>    UN  10.1.30.112  103.03 GiB  256          ?
>>>>    e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>>>>    UN  10.1.19.163  129.61 GiB  256          ?
>>>>    3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>>>>    UN  10.1.26.181  145.28 GiB  256          ?
>>>>    0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>>>>    UN  10.1.17.213  149.04 GiB  256          ?
>>>>    71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>>>>    DN  10.1.19.198  52.41 GiB  256          ?
>>>>    613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>>>>    UN  10.1.31.60   195.17 GiB  256          ?
>>>>    3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>>>>    UN  10.1.25.206  100.67 GiB  256          ?
>>>>    f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>>>>    So each rack label right now matches the availability zone and we
>>>>    have 3 Datacenters and 2 Availability Zone with 2 racks per DC but the
>>>>    above is clearly unbalanced
>>>>    If I have a keyspace with a replication factor = 3 and I want to
>>>>    minimize the number of nodes to scale up and down the cluster and keep it
>>>>    balanced should I consider an approach like OPTION A)
>>>>    2. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>>>    3. 3 read ONE us-east-1a
>>>>    4. 4 write ONE us-east-1b 5 write ONE us-east-1b
>>>>    5. 6 write ONE us-east-1b
>>>>    6. OPTION B)
>>>>    7. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>>>    8. 3 read ONE us-east-1a
>>>>    9. 4 write TWO us-east-1b 5 write TWO us-east-1b
>>>>    10. 6 write TWO us-east-1b
>>>>    11. *7 read ONE us-east-1c 8 write TWO us-east-1c*
>>>>    12. *9 read ONE us-east-1c* Option B looks to be unbalanced and I
>>>>    would exclude it OPTION C)
>>>>    13. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>>>    14. 3 read ONE us-east-1c
>>>>    15. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>>>    16. 6 write TWO us-east-1c
>>>>    17.
>>>>
>>>>
>>>>    so I am thinking of A if I have the restriction of 2 AZ but I guess
>>>>    that option C would be the best. If I have to add another DC for reads
>>>>    because we want to assign a new DC for each new microservice it would look
>>>>    like:
>>>>       OPTION EXTRA DC For Reads
>>>>       1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>>>       2. 3 read ONE us-east-1c
>>>>       3. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>>>       4. 6 write TWO us-east-1c 7 extra-read THREE us-east-1a
>>>>       5. 8 extra-read THREE us-east-1b
>>>>       6.
>>>>          7.
>>>>
>>>>
>>>>    1. 9 extra-read THREE us-east-1c
>>>>       2.
>>>>    The DC for *write* will replicate the data in the other
>>>>    datacenters. My scope is to keep the *read* machines dedicated to
>>>>    serve reads and *write* machines to serve writes. Cassandra will
>>>>    handle the replication for me. Is there any other option that is I missing
>>>>    or wrong assumption? I am thinking that I will write a blog post about all
>>>>    my learnings so far, thank you very much for the replies Best, Sergio
>>>>
>>>>
>>>> Il giorno mer 23 ott 2019 alle ore 10:57 Reid Pinchback <
>>>> rpinchback@tripadvisor.com> ha scritto:
>>>>
>>>>> No, that’s not correct.  The point of racks is to help you distribute
>>>>> the replicas, not further-replicate the replicas.  Data centers are what do
>>>>> the latter.  So for example, if you wanted to be able to ensure that you
>>>>> always had quorum if an AZ went down, then you could have two DCs where one
>>>>> was in each AZ, and use one rack in each DC.  In your situation I think I’d
>>>>> be more tempted to consider that.  Then if an AZ went away, you could fail
>>>>> over your traffic to the remaining DC and still be perfectly fine.
>>>>>
>>>>>
>>>>>
>>>>> For background on replicas vs racks, I believe the information you
>>>>> want is under the heading ‘NetworkTopologyStrategy’ at:
>>>>>
>>>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>>>
>>>>>
>>>>>
>>>>> That should help you better understand how replicas distribute.
>>>>>
>>>>>
>>>>>
>>>>> As mentioned before, while you can choose to do the reads in one DC,
>>>>> except for concerns about contention related to network traffic and
>>>>> connection handling, you can’t isolate reads from writes.  You can _
>>>>> *mostly*_ insulate the write DC from the activity within the read DC,
>>>>> and even that isn’t an absolute because of repairs.  However, your mileage
>>>>> may vary, so do what makes sense for your usage pattern.
>>>>>
>>>>>
>>>>>
>>>>> R
>>>>>
>>>>>
>>>>>
>>>>> *From: *Sergio <la...@gmail.com>
>>>>> *Reply-To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>>> *Date: *Wednesday, October 23, 2019 at 12:50 PM
>>>>> *To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>>> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>>>>>
>>>>>
>>>>>
>>>>> *Message from External Sender*
>>>>>
>>>>> Hi Reid,
>>>>>
>>>>> Thanks for your reply. I really appreciate your explanation.
>>>>>
>>>>> We are in AWS and we are using right now 2 Availability Zone and not
>>>>> 3. We found our cluster really unbalanced because the keyspace has a
>>>>> replication factor = 3 and the number of racks is 2 with 2 datacenters.
>>>>> We want the writes spread across all the nodes but we wanted the reads
>>>>> isolated from the writes to keep the load on that node low and to be able
>>>>> to identify problems in the consumers (reads) or producers (writes)
>>>>> applications.
>>>>> It looks like that each rack contains an entire copy of the data so
>>>>> this would lead to replicate for each rack and then for each node the
>>>>> information. If I am correct if we have  a keyspace with 100GB and
>>>>> Replication Factor = 3 and RACKS = 3 => 100 * 3 * 3 = 900GB
>>>>> If I had only one rack across 2 or even 3 availability zone I would
>>>>> save in space and I would have 300GB only. Please correct me if I am wrong.
>>>>>
>>>>> Best,
>>>>>
>>>>> Sergio
>>>>>
>>>>>
>>>>>
>>>>> Il giorno mer 23 ott 2019 alle ore 09:21 Reid Pinchback <
>>>>> rpinchback@tripadvisor.com> ha scritto:
>>>>>
>>>>> Datacenters and racks are different concepts.  While they don't have
>>>>> to be associated with their historical meanings, the historical meanings
>>>>> probably provide a helpful model for understanding what you want from them.
>>>>>
>>>>> When companies own their own physical servers and have them housed
>>>>> somewhere, the questions arise on where you want to locate any particular
>>>>> server.  It's a balancing act on things like network speed of related
>>>>> servers being able to talk to each other, versus fault-tolerance of having
>>>>> many servers not all exposed to the same risks.
>>>>>
>>>>> "Same rack" in that physical world tended to mean something like "all
>>>>> behind the same network switch and all sharing the same power bus".  The
>>>>> morning after an electrical glitch fries a power bus and thus everything in
>>>>> that rack, you realize you wished you didn't have so many of the same type
>>>>> of server together.  Well, they were servers.  Now they are door stops.
>>>>> Badness and sadness.
>>>>>
>>>>> That's kind of the mindset to have in mind with racks in Cassandra.
>>>>> It's an artifact for you to separate servers into pools so that the
>>>>> disparate pools have hopefully somewhat independent infrastructure risks.
>>>>> However, all those servers are still doing the same kind of work, are the
>>>>> same version, etc.
>>>>>
>>>>> Datacenters are amalgams of those racks, and how similar or different
>>>>> they are from each other depends on what you want to do with them.  What is
>>>>> true is that if you have N datacenters, each one of them must have enough
>>>>> disk storage to house all the data.  The actual physical footprint of that
>>>>> data in each DC depends on the replication factors in play.
>>>>>
>>>>> Note that you sorta can't have "one datacenter for writes" because the
>>>>> writes will replicate across the data centers.  You could definitely choose
>>>>> to have only one that takes read queries, but best to think of writing as
>>>>> being universal.  One scenario you can have is where the DC not taking live
>>>>> traffic read queries is the one you use for maintenance or performance
>>>>> testing or version upgrades.
>>>>>
>>>>> One rack makes your life easier if you don't have a reason for
>>>>> multiple racks. It depends on the environment you deploy into and your
>>>>> fault tolerance goals.  If you were in AWS and wanting to spread risk
>>>>> across availability zones, then you would likely have as many racks as AZs
>>>>> you choose to be in, because that's really the point of using multiple AZs.
>>>>>
>>>>> R
>>>>>
>>>>>
>>>>> On 10/23/19, 4:06 AM, "Sergio Bilello" <la...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>      Message from External Sender
>>>>>
>>>>>     Hello guys!
>>>>>
>>>>>     I was reading about
>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__cassandra.apache.org_doc_latest_architecture_dynamo.html-23networktopologystrategy&d=DwIBaQ&c=9Hv6XPedRSA-5PSECC38X80c1h60_XWA4z1k_R1pROA&r=OIgB3poYhzp3_A7WgD7iBCnsJaYmspOa2okNpf6uqWc&m=xmgs1uQTlmvCtIoGJKHbByZZ6aDFzS5hDQzChDPCfFA&s=9ZDWAK6pstkCQfdbwLNsB-ZGsK64RwXSXfAkOWtmkq4&e=
>>>>>
>>>>>     I would like to understand a concept related to the node load
>>>>> balancing.
>>>>>
>>>>>     I know that Jon recommends Vnodes = 4 but right now I found a
>>>>> cluster with vnodes = 256 replication factor = 3 and 2 racks. This is
>>>>> unbalanced because the racks are not a multiplier of the replication factor.
>>>>>
>>>>>     However, my plan is to move all the nodes in a single rack to
>>>>> eventually scale up and down the node in the cluster once at the time.
>>>>>
>>>>>     If I had 3 racks and I would like to keep the things balanced I
>>>>> should scale up 3 nodes at the time one for each rack.
>>>>>
>>>>>     If I would have 3 racks, should I have also 3 different
>>>>> datacenters so one datacenter for each rack?
>>>>>
>>>>>     Can I have 2 datacenters and 3 racks? If this is possible one
>>>>> datacenter would have more nodes than the others? Could it be a problem?
>>>>>
>>>>>     I am thinking to split my cluster in one datacenter for reads and
>>>>> one for writes and keep all the nodes in the same rack so I can scale up
>>>>> once node at the time.
>>>>>
>>>>>
>>>>>
>>>>>     Please correct me if I am wrong
>>>>>
>>>>>
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>
>>>>>
>>>>>     Sergio
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>>     To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>>>>
>>>>>     For additional commands, e-mail: user-help@cassandra.apache.org
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>

Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
OPTION C or OPTION A?

Which one are you referring to?

Both have separate DCs to keep the workload separate.

   - OPTION A)
   - Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
   - 3 read ONE us-east-1a
   - 4 write TWO us-east-1b 5 write TWO us-east-1b
   - 6 write TWO us-east-1b


Here we have 2 DC read and write
One Rack per DC
One Availability Zone per DC

Thanks,

Sergio


On Wed, Oct 23, 2019, 1:11 PM Jon Haddad <jo...@jonhaddad.com> wrote:

> Personally, I wouldn't ever do this.  I recommend separate DCs if you want
> to keep workloads separate.
>
> On Wed, Oct 23, 2019 at 4:06 PM Sergio <la...@gmail.com> wrote:
>
>>           I forgot to comment for
>>
>>    OPTION C)
>>    1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>    2. 3 read ONE us-east-1c
>>    3. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>    4. 6 write TWO us-east-1c I would expect that I need to decrease the
>>    Consistency Level in the reads if one of the AZ goes down. Please consider
>>    the below one as the real OPTION A. The previous one looks to be wrong
>>    because the same rack is assigned to 2 different DC.
>>    5. OPTION A)
>>    6. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>    7. 3 read ONE us-east-1a
>>    8. 4 write TWO us-east-1b 5 write TWO us-east-1b
>>    9. 6 write TWO us-east-1b
>>
>>
>>
>> Thanks,
>>
>> Sergio
>>
>> Il giorno mer 23 ott 2019 alle ore 12:33 Sergio <
>> lapostadisergio@gmail.com> ha scritto:
>>
>>> Hi Reid,
>>>
>>> Thank you very much for clearing these concepts for me.
>>> https://community.datastax.com/comments/1133/view.html I posted this
>>> question on the datastax forum regarding our cluster that it is unbalanced
>>> and the reply was related that the *number of racks should be a
>>> multiplier of the replication factor *in order to be balanced or 1. I
>>> thought then if I have 3 availability zones I should have 3 racks for each
>>> datacenter and not 2 (us-east-1b, us-east-1a) as I have right now or in the
>>> easiest way, I should have a rack for each datacenter.
>>>
>>>
>>>
>>>    1. Datacenter: live
>>>    ================
>>>    Status=Up/Down
>>>    |/ State=Normal/Leaving/Joining/Moving
>>>    --  Address      Load       Tokens       Owns    Host ID
>>>                      Rack
>>>    UN  10.1.20.49   289.75 GiB  256          ?
>>>    be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>>>    UN  10.1.30.112  103.03 GiB  256          ?
>>>    e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>>>    UN  10.1.19.163  129.61 GiB  256          ?
>>>    3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>>>    UN  10.1.26.181  145.28 GiB  256          ?
>>>    0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>>>    UN  10.1.17.213  149.04 GiB  256          ?
>>>    71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>>>    DN  10.1.19.198  52.41 GiB  256          ?
>>>    613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>>>    UN  10.1.31.60   195.17 GiB  256          ?
>>>    3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>>>    UN  10.1.25.206  100.67 GiB  256          ?
>>>    f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>>>    So each rack label right now matches the availability zone and we
>>>    have 3 Datacenters and 2 Availability Zone with 2 racks per DC but the
>>>    above is clearly unbalanced
>>>    If I have a keyspace with a replication factor = 3 and I want to
>>>    minimize the number of nodes to scale up and down the cluster and keep it
>>>    balanced should I consider an approach like OPTION A)
>>>    2. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>>    3. 3 read ONE us-east-1a
>>>    4. 4 write ONE us-east-1b 5 write ONE us-east-1b
>>>    5. 6 write ONE us-east-1b
>>>    6. OPTION B)
>>>    7. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>>    8. 3 read ONE us-east-1a
>>>    9. 4 write TWO us-east-1b 5 write TWO us-east-1b
>>>    10. 6 write TWO us-east-1b
>>>    11. *7 read ONE us-east-1c 8 write TWO us-east-1c*
>>>    12. *9 read ONE us-east-1c* Option B looks to be unbalanced and I
>>>    would exclude it OPTION C)
>>>    13. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>>    14. 3 read ONE us-east-1c
>>>    15. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>>    16. 6 write TWO us-east-1c
>>>    17.
>>>
>>>
>>>    so I am thinking of A if I have the restriction of 2 AZ but I guess
>>>    that option C would be the best. If I have to add another DC for reads
>>>    because we want to assign a new DC for each new microservice it would look
>>>    like:
>>>       OPTION EXTRA DC For Reads
>>>       1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>>       2. 3 read ONE us-east-1c
>>>       3. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>>       4. 6 write TWO us-east-1c 7 extra-read THREE us-east-1a
>>>       5. 8 extra-read THREE us-east-1b
>>>       6.
>>>          7.
>>>
>>>
>>>    1. 9 extra-read THREE us-east-1c
>>>       2.
>>>    The DC for *write* will replicate the data in the other datacenters.
>>>    My scope is to keep the *read* machines dedicated to serve reads and
>>>    *write* machines to serve writes. Cassandra will handle the
>>>    replication for me. Is there any other option that is I missing or wrong
>>>    assumption? I am thinking that I will write a blog post about all my
>>>    learnings so far, thank you very much for the replies Best, Sergio
>>>
>>>
>>> Il giorno mer 23 ott 2019 alle ore 10:57 Reid Pinchback <
>>> rpinchback@tripadvisor.com> ha scritto:
>>>
>>>> No, that’s not correct.  The point of racks is to help you distribute
>>>> the replicas, not further-replicate the replicas.  Data centers are what do
>>>> the latter.  So for example, if you wanted to be able to ensure that you
>>>> always had quorum if an AZ went down, then you could have two DCs where one
>>>> was in each AZ, and use one rack in each DC.  In your situation I think I’d
>>>> be more tempted to consider that.  Then if an AZ went away, you could fail
>>>> over your traffic to the remaining DC and still be perfectly fine.
>>>>
>>>>
>>>>
>>>> For background on replicas vs racks, I believe the information you want
>>>> is under the heading ‘NetworkTopologyStrategy’ at:
>>>>
>>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>>
>>>>
>>>>
>>>> That should help you better understand how replicas distribute.
>>>>
>>>>
>>>>
>>>> As mentioned before, while you can choose to do the reads in one DC,
>>>> except for concerns about contention related to network traffic and
>>>> connection handling, you can’t isolate reads from writes.  You can _
>>>> *mostly*_ insulate the write DC from the activity within the read DC,
>>>> and even that isn’t an absolute because of repairs.  However, your mileage
>>>> may vary, so do what makes sense for your usage pattern.
>>>>
>>>>
>>>>
>>>> R
>>>>
>>>>
>>>>
>>>> *From: *Sergio <la...@gmail.com>
>>>> *Reply-To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>> *Date: *Wednesday, October 23, 2019 at 12:50 PM
>>>> *To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>>>>
>>>>
>>>>
>>>> *Message from External Sender*
>>>>
>>>> Hi Reid,
>>>>
>>>> Thanks for your reply. I really appreciate your explanation.
>>>>
>>>> We are in AWS and we are using right now 2 Availability Zone and not 3.
>>>> We found our cluster really unbalanced because the keyspace has a
>>>> replication factor = 3 and the number of racks is 2 with 2 datacenters.
>>>> We want the writes spread across all the nodes but we wanted the reads
>>>> isolated from the writes to keep the load on that node low and to be able
>>>> to identify problems in the consumers (reads) or producers (writes)
>>>> applications.
>>>> It looks like that each rack contains an entire copy of the data so
>>>> this would lead to replicate for each rack and then for each node the
>>>> information. If I am correct if we have  a keyspace with 100GB and
>>>> Replication Factor = 3 and RACKS = 3 => 100 * 3 * 3 = 900GB
>>>> If I had only one rack across 2 or even 3 availability zone I would
>>>> save in space and I would have 300GB only. Please correct me if I am wrong.
>>>>
>>>> Best,
>>>>
>>>> Sergio
>>>>
>>>>
>>>>
>>>> Il giorno mer 23 ott 2019 alle ore 09:21 Reid Pinchback <
>>>> rpinchback@tripadvisor.com> ha scritto:
>>>>
>>>> Datacenters and racks are different concepts.  While they don't have to
>>>> be associated with their historical meanings, the historical meanings
>>>> probably provide a helpful model for understanding what you want from them.
>>>>
>>>> When companies own their own physical servers and have them housed
>>>> somewhere, the questions arise on where you want to locate any particular
>>>> server.  It's a balancing act on things like network speed of related
>>>> servers being able to talk to each other, versus fault-tolerance of having
>>>> many servers not all exposed to the same risks.
>>>>
>>>> "Same rack" in that physical world tended to mean something like "all
>>>> behind the same network switch and all sharing the same power bus".  The
>>>> morning after an electrical glitch fries a power bus and thus everything in
>>>> that rack, you realize you wished you didn't have so many of the same type
>>>> of server together.  Well, they were servers.  Now they are door stops.
>>>> Badness and sadness.
>>>>
>>>> That's kind of the mindset to have in mind with racks in Cassandra.
>>>> It's an artifact for you to separate servers into pools so that the
>>>> disparate pools have hopefully somewhat independent infrastructure risks.
>>>> However, all those servers are still doing the same kind of work, are the
>>>> same version, etc.
>>>>
>>>> Datacenters are amalgams of those racks, and how similar or different
>>>> they are from each other depends on what you want to do with them.  What is
>>>> true is that if you have N datacenters, each one of them must have enough
>>>> disk storage to house all the data.  The actual physical footprint of that
>>>> data in each DC depends on the replication factors in play.
>>>>
>>>> Note that you sorta can't have "one datacenter for writes" because the
>>>> writes will replicate across the data centers.  You could definitely choose
>>>> to have only one that takes read queries, but best to think of writing as
>>>> being universal.  One scenario you can have is where the DC not taking live
>>>> traffic read queries is the one you use for maintenance or performance
>>>> testing or version upgrades.
>>>>
>>>> One rack makes your life easier if you don't have a reason for multiple
>>>> racks. It depends on the environment you deploy into and your fault
>>>> tolerance goals.  If you were in AWS and wanting to spread risk across
>>>> availability zones, then you would likely have as many racks as AZs you
>>>> choose to be in, because that's really the point of using multiple AZs.
>>>>
>>>> R
>>>>
>>>>
>>>> On 10/23/19, 4:06 AM, "Sergio Bilello" <la...@gmail.com>
>>>> wrote:
>>>>
>>>>      Message from External Sender
>>>>
>>>>     Hello guys!
>>>>
>>>>     I was reading about
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__cassandra.apache.org_doc_latest_architecture_dynamo.html-23networktopologystrategy&d=DwIBaQ&c=9Hv6XPedRSA-5PSECC38X80c1h60_XWA4z1k_R1pROA&r=OIgB3poYhzp3_A7WgD7iBCnsJaYmspOa2okNpf6uqWc&m=xmgs1uQTlmvCtIoGJKHbByZZ6aDFzS5hDQzChDPCfFA&s=9ZDWAK6pstkCQfdbwLNsB-ZGsK64RwXSXfAkOWtmkq4&e=
>>>>
>>>>     I would like to understand a concept related to the node load
>>>> balancing.
>>>>
>>>>     I know that Jon recommends Vnodes = 4 but right now I found a
>>>> cluster with vnodes = 256 replication factor = 3 and 2 racks. This is
>>>> unbalanced because the racks are not a multiplier of the replication factor.
>>>>
>>>>     However, my plan is to move all the nodes in a single rack to
>>>> eventually scale up and down the node in the cluster once at the time.
>>>>
>>>>     If I had 3 racks and I would like to keep the things balanced I
>>>> should scale up 3 nodes at the time one for each rack.
>>>>
>>>>     If I would have 3 racks, should I have also 3 different datacenters
>>>> so one datacenter for each rack?
>>>>
>>>>     Can I have 2 datacenters and 3 racks? If this is possible one
>>>> datacenter would have more nodes than the others? Could it be a problem?
>>>>
>>>>     I am thinking to split my cluster in one datacenter for reads and
>>>> one for writes and keep all the nodes in the same rack so I can scale up
>>>> once node at the time.
>>>>
>>>>
>>>>
>>>>     Please correct me if I am wrong
>>>>
>>>>
>>>>
>>>>     Thanks,
>>>>
>>>>
>>>>
>>>>     Sergio
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>>     To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>>>
>>>>     For additional commands, e-mail: user-help@cassandra.apache.org
>>>>
>>>>
>>>>
>>>>
>>>>

Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Jon Haddad <jo...@jonhaddad.com>.
Personally, I wouldn't ever do this.  I recommend separate DCs if you want
to keep workloads separate.

On Wed, Oct 23, 2019 at 4:06 PM Sergio <la...@gmail.com> wrote:

>           I forgot to comment for
>
>    OPTION C)
>    1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>    2. 3 read ONE us-east-1c
>    3. 4 write TWO us-east-1a 5 write TWO us-east-1b
>    4. 6 write TWO us-east-1c I would expect that I need to decrease the
>    Consistency Level in the reads if one of the AZ goes down. Please consider
>    the below one as the real OPTION A. The previous one looks to be wrong
>    because the same rack is assigned to 2 different DC.
>    5. OPTION A)
>    6. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>    7. 3 read ONE us-east-1a
>    8. 4 write TWO us-east-1b 5 write TWO us-east-1b
>    9. 6 write TWO us-east-1b
>
>
>
> Thanks,
>
> Sergio
>
> Il giorno mer 23 ott 2019 alle ore 12:33 Sergio <la...@gmail.com>
> ha scritto:
>
>> Hi Reid,
>>
>> Thank you very much for clearing these concepts for me.
>> https://community.datastax.com/comments/1133/view.html I posted this
>> question on the datastax forum regarding our cluster that it is unbalanced
>> and the reply was related that the *number of racks should be a
>> multiplier of the replication factor *in order to be balanced or 1. I
>> thought then if I have 3 availability zones I should have 3 racks for each
>> datacenter and not 2 (us-east-1b, us-east-1a) as I have right now or in the
>> easiest way, I should have a rack for each datacenter.
>>
>>
>>
>>    1. Datacenter: live
>>    ================
>>    Status=Up/Down
>>    |/ State=Normal/Leaving/Joining/Moving
>>    --  Address      Load       Tokens       Owns    Host ID
>>                      Rack
>>    UN  10.1.20.49   289.75 GiB  256          ?
>>    be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>>    UN  10.1.30.112  103.03 GiB  256          ?
>>    e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>>    UN  10.1.19.163  129.61 GiB  256          ?
>>    3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>>    UN  10.1.26.181  145.28 GiB  256          ?
>>    0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>>    UN  10.1.17.213  149.04 GiB  256          ?
>>    71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>>    DN  10.1.19.198  52.41 GiB  256          ?
>>    613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>>    UN  10.1.31.60   195.17 GiB  256          ?
>>    3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>>    UN  10.1.25.206  100.67 GiB  256          ?
>>    f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>>    So each rack label right now matches the availability zone and we
>>    have 3 Datacenters and 2 Availability Zone with 2 racks per DC but the
>>    above is clearly unbalanced
>>    If I have a keyspace with a replication factor = 3 and I want to
>>    minimize the number of nodes to scale up and down the cluster and keep it
>>    balanced should I consider an approach like OPTION A)
>>    2. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>    3. 3 read ONE us-east-1a
>>    4. 4 write ONE us-east-1b 5 write ONE us-east-1b
>>    5. 6 write ONE us-east-1b
>>    6. OPTION B)
>>    7. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
>>    8. 3 read ONE us-east-1a
>>    9. 4 write TWO us-east-1b 5 write TWO us-east-1b
>>    10. 6 write TWO us-east-1b
>>    11. *7 read ONE us-east-1c 8 write TWO us-east-1c*
>>    12. *9 read ONE us-east-1c* Option B looks to be unbalanced and I
>>    would exclude it OPTION C)
>>    13. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>    14. 3 read ONE us-east-1c
>>    15. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>    16. 6 write TWO us-east-1c
>>    17.
>>
>>
>>    so I am thinking of A if I have the restriction of 2 AZ but I guess
>>    that option C would be the best. If I have to add another DC for reads
>>    because we want to assign a new DC for each new microservice it would look
>>    like:
>>       OPTION EXTRA DC For Reads
>>       1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
>>       2. 3 read ONE us-east-1c
>>       3. 4 write TWO us-east-1a 5 write TWO us-east-1b
>>       4. 6 write TWO us-east-1c 7 extra-read THREE us-east-1a
>>       5. 8 extra-read THREE us-east-1b
>>       6.
>>          7.
>>
>>
>>    1. 9 extra-read THREE us-east-1c
>>       2.
>>    The DC for *write* will replicate the data in the other datacenters.
>>    My scope is to keep the *read* machines dedicated to serve reads and
>>    *write* machines to serve writes. Cassandra will handle the
>>    replication for me. Is there any other option that is I missing or wrong
>>    assumption? I am thinking that I will write a blog post about all my
>>    learnings so far, thank you very much for the replies Best, Sergio
>>
>>
>> Il giorno mer 23 ott 2019 alle ore 10:57 Reid Pinchback <
>> rpinchback@tripadvisor.com> ha scritto:
>>
>>> No, that’s not correct.  The point of racks is to help you distribute
>>> the replicas, not further-replicate the replicas.  Data centers are what do
>>> the latter.  So for example, if you wanted to be able to ensure that you
>>> always had quorum if an AZ went down, then you could have two DCs where one
>>> was in each AZ, and use one rack in each DC.  In your situation I think I’d
>>> be more tempted to consider that.  Then if an AZ went away, you could fail
>>> over your traffic to the remaining DC and still be perfectly fine.
>>>
>>>
>>>
>>> For background on replicas vs racks, I believe the information you want
>>> is under the heading ‘NetworkTopologyStrategy’ at:
>>>
>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>
>>>
>>>
>>> That should help you better understand how replicas distribute.
>>>
>>>
>>>
>>> As mentioned before, while you can choose to do the reads in one DC,
>>> except for concerns about contention related to network traffic and
>>> connection handling, you can’t isolate reads from writes.  You can _
>>> *mostly*_ insulate the write DC from the activity within the read DC,
>>> and even that isn’t an absolute because of repairs.  However, your mileage
>>> may vary, so do what makes sense for your usage pattern.
>>>
>>>
>>>
>>> R
>>>
>>>
>>>
>>> *From: *Sergio <la...@gmail.com>
>>> *Reply-To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> *Date: *Wednesday, October 23, 2019 at 12:50 PM
>>> *To: *"user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>>>
>>>
>>>
>>> *Message from External Sender*
>>>
>>> Hi Reid,
>>>
>>> Thanks for your reply. I really appreciate your explanation.
>>>
>>> We are in AWS and we are using right now 2 Availability Zone and not 3.
>>> We found our cluster really unbalanced because the keyspace has a
>>> replication factor = 3 and the number of racks is 2 with 2 datacenters.
>>> We want the writes spread across all the nodes but we wanted the reads
>>> isolated from the writes to keep the load on that node low and to be able
>>> to identify problems in the consumers (reads) or producers (writes)
>>> applications.
>>> It looks like that each rack contains an entire copy of the data so this
>>> would lead to replicate for each rack and then for each node the
>>> information. If I am correct if we have  a keyspace with 100GB and
>>> Replication Factor = 3 and RACKS = 3 => 100 * 3 * 3 = 900GB
>>> If I had only one rack across 2 or even 3 availability zone I would save
>>> in space and I would have 300GB only. Please correct me if I am wrong.
>>>
>>> Best,
>>>
>>> Sergio
>>>
>>>
>>>
>>> Il giorno mer 23 ott 2019 alle ore 09:21 Reid Pinchback <
>>> rpinchback@tripadvisor.com> ha scritto:
>>>
>>> Datacenters and racks are different concepts.  While they don't have to
>>> be associated with their historical meanings, the historical meanings
>>> probably provide a helpful model for understanding what you want from them.
>>>
>>> When companies own their own physical servers and have them housed
>>> somewhere, the questions arise on where you want to locate any particular
>>> server.  It's a balancing act on things like network speed of related
>>> servers being able to talk to each other, versus fault-tolerance of having
>>> many servers not all exposed to the same risks.
>>>
>>> "Same rack" in that physical world tended to mean something like "all
>>> behind the same network switch and all sharing the same power bus".  The
>>> morning after an electrical glitch fries a power bus and thus everything in
>>> that rack, you realize you wished you didn't have so many of the same type
>>> of server together.  Well, they were servers.  Now they are door stops.
>>> Badness and sadness.
>>>
>>> That's kind of the mindset to have in mind with racks in Cassandra.
>>> It's an artifact for you to separate servers into pools so that the
>>> disparate pools have hopefully somewhat independent infrastructure risks.
>>> However, all those servers are still doing the same kind of work, are the
>>> same version, etc.
>>>
>>> Datacenters are amalgams of those racks, and how similar or different
>>> they are from each other depends on what you want to do with them.  What is
>>> true is that if you have N datacenters, each one of them must have enough
>>> disk storage to house all the data.  The actual physical footprint of that
>>> data in each DC depends on the replication factors in play.
>>>
>>> Note that you sorta can't have "one datacenter for writes" because the
>>> writes will replicate across the data centers.  You could definitely choose
>>> to have only one that takes read queries, but best to think of writing as
>>> being universal.  One scenario you can have is where the DC not taking live
>>> traffic read queries is the one you use for maintenance or performance
>>> testing or version upgrades.
>>>
>>> One rack makes your life easier if you don't have a reason for multiple
>>> racks. It depends on the environment you deploy into and your fault
>>> tolerance goals.  If you were in AWS and wanting to spread risk across
>>> availability zones, then you would likely have as many racks as AZs you
>>> choose to be in, because that's really the point of using multiple AZs.
>>>
>>> R
>>>
>>>
>>> On 10/23/19, 4:06 AM, "Sergio Bilello" <la...@gmail.com>
>>> wrote:
>>>
>>>      Message from External Sender
>>>
>>>     Hello guys!
>>>
>>>     I was reading about
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__cassandra.apache.org_doc_latest_architecture_dynamo.html-23networktopologystrategy&d=DwIBaQ&c=9Hv6XPedRSA-5PSECC38X80c1h60_XWA4z1k_R1pROA&r=OIgB3poYhzp3_A7WgD7iBCnsJaYmspOa2okNpf6uqWc&m=xmgs1uQTlmvCtIoGJKHbByZZ6aDFzS5hDQzChDPCfFA&s=9ZDWAK6pstkCQfdbwLNsB-ZGsK64RwXSXfAkOWtmkq4&e=
>>>
>>>     I would like to understand a concept related to the node load
>>> balancing.
>>>
>>>     I know that Jon recommends Vnodes = 4 but right now I found a
>>> cluster with vnodes = 256 replication factor = 3 and 2 racks. This is
>>> unbalanced because the racks are not a multiplier of the replication factor.
>>>
>>>     However, my plan is to move all the nodes in a single rack to
>>> eventually scale up and down the node in the cluster once at the time.
>>>
>>>     If I had 3 racks and I would like to keep the things balanced I
>>> should scale up 3 nodes at the time one for each rack.
>>>
>>>     If I would have 3 racks, should I have also 3 different datacenters
>>> so one datacenter for each rack?
>>>
>>>     Can I have 2 datacenters and 3 racks? If this is possible one
>>> datacenter would have more nodes than the others? Could it be a problem?
>>>
>>>     I am thinking to split my cluster in one datacenter for reads and
>>> one for writes and keep all the nodes in the same rack so I can scale up
>>> once node at the time.
>>>
>>>
>>>
>>>     Please correct me if I am wrong
>>>
>>>
>>>
>>>     Thanks,
>>>
>>>
>>>
>>>     Sergio
>>>
>>>
>>>
>>>     ---------------------------------------------------------------------
>>>
>>>     To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>>
>>>     For additional commands, e-mail: user-help@cassandra.apache.org
>>>
>>>
>>>
>>>
>>>

Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
          I forgot to comment for

   OPTION C)
   1. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1b
   2. 3 read ONE us-east-1c
   3. 4 write TWO us-east-1a 5 write TWO us-east-1b
   4. 6 write TWO us-east-1c I would expect that I need to decrease the
   Consistency Level in the reads if one of the AZ goes down. Please consider
   the below one as the real OPTION A. The previous one looks to be wrong
   because the same rack is assigned to 2 different DC.
   5. OPTION A)
   6. Node DC RACK AZ 1 read ONE us-east-1a 2 read ONE us-east-1a
   7. 3 read ONE us-east-1a
   8. 4 write TWO us-east-1b 5 write TWO us-east-1b
   9. 6 write TWO us-east-1b



Thanks,

Sergio

Il giorno mer 23 ott 2019 alle ore 12:33 Sergio <la...@gmail.com>
ha scritto:

> Hi Reid,
>
> Thank you very much for clearing these concepts for me.
> I posted this question about our unbalanced cluster on the DataStax forum
> (https://community.datastax.com/comments/1133/view.html), and the reply
> was that the number of racks should be a multiple of the replication
> factor (or 1) in order to be balanced. I thought then that if I have 3
> availability zones I should have 3 racks for each datacenter, not the 2
> (us-east-1a, us-east-1b) I have right now; or, in the easiest way, I
> should have one rack per datacenter.
>
>
>
>    Datacenter: live
>    ================
>    Status=Up/Down
>    |/ State=Normal/Leaving/Joining/Moving
>    --  Address      Load        Tokens  Owns  Host ID                               Rack
>    UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>    UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>    UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>    UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>    UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>    DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>    UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>    UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>
>    So each rack label right now matches the availability zone, and we have
>    3 datacenters and 2 availability zones with 2 racks per DC, but the
>    above is clearly unbalanced.
>    If I have a keyspace with a replication factor = 3 and I want to
>    minimize the number of nodes needed to scale the cluster up and down
>    while keeping it balanced, should I consider one of these approaches?
>
>    OPTION A)
>    Node  DC     RACK  AZ
>    1     read   ONE   us-east-1a
>    2     read   ONE   us-east-1a
>    3     read   ONE   us-east-1a
>    4     write  ONE   us-east-1b
>    5     write  ONE   us-east-1b
>    6     write  ONE   us-east-1b
>
>    OPTION B)
>    Node  DC     RACK  AZ
>    1     read   ONE   us-east-1a
>    2     read   ONE   us-east-1a
>    3     read   ONE   us-east-1a
>    4     write  TWO   us-east-1b
>    5     write  TWO   us-east-1b
>    6     write  TWO   us-east-1b
>    7     read   ONE   us-east-1c
>    8     write  TWO   us-east-1c
>    9     read   ONE   us-east-1c
>
>    Option B looks to be unbalanced and I would exclude it.
>
>    OPTION C)
>    Node  DC     RACK  AZ
>    1     read   ONE   us-east-1a
>    2     read   ONE   us-east-1b
>    3     read   ONE   us-east-1c
>    4     write  TWO   us-east-1a
>    5     write  TWO   us-east-1b
>    6     write  TWO   us-east-1c
>
>    So I am thinking of A if I have the restriction of 2 AZs, but I guess
>    that OPTION C would be the best. If I have to add another DC for reads,
>    because we want to assign a new DC to each new microservice, it would
>    look like:
>
>    OPTION EXTRA DC For Reads
>    Node  DC          RACK   AZ
>    1     read        ONE    us-east-1a
>    2     read        ONE    us-east-1b
>    3     read        ONE    us-east-1c
>    4     write       TWO    us-east-1a
>    5     write       TWO    us-east-1b
>    6     write       TWO    us-east-1c
>    7     extra-read  THREE  us-east-1a
>    8     extra-read  THREE  us-east-1b
>    9     extra-read  THREE  us-east-1c
>
>    Writes to the write DC will be replicated to the other datacenters. My
>    goal is to keep the read machines dedicated to serving reads and the
>    write machines to serving writes; Cassandra will handle the replication
>    for me. Is there any other option I am missing, or a wrong assumption?
>    I am thinking that I will write a blog post about all my learnings so
>    far. Thank you very much for the replies.
>
>    Best,
>
>    Sergio
>
>
> On Wed, Oct 23, 2019 at 10:57 AM Reid Pinchback <
> rpinchback@tripadvisor.com> wrote:
>
>> No, that’s not correct.  The point of racks is to help you distribute the
>> replicas, not further-replicate the replicas.  Data centers are what do the
>> latter.  So for example, if you wanted to be able to ensure that you always
>> had quorum if an AZ went down, then you could have two DCs where one was in
>> each AZ, and use one rack in each DC.  In your situation I think I’d be
>> more tempted to consider that.  Then if an AZ went away, you could fail
>> over your traffic to the remaining DC and still be perfectly fine.
>>
>>
>>
>> For background on replicas vs racks, I believe the information you want
>> is under the heading ‘NetworkTopologyStrategy’ at:
>>
>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>
>>
>>
>> That should help you better understand how replicas distribute.
>>
>>
>>
>> As mentioned before, while you can choose to do the reads in one DC,
>> except for concerns about contention related to network traffic and
>> connection handling, you can’t isolate reads from writes.  You can
>> _mostly_ insulate the write DC from the activity within the read DC,
>> and even that isn’t an absolute because of repairs.  However, your mileage
>> may vary, so do what makes sense for your usage pattern.
>>
>>
>>
>> R
>>
>>
>>

RE: Cassandra Rack - Datacenter Load Balancing relations

Posted by "Durity, Sean R" <SE...@homedepot.com>.
+1 for removing complexity to be able to create (and maintain!) “reasoned” systems!


Sean Durity – Staff Systems Engineer, Cassandra

From: Reid Pinchback <rp...@tripadvisor.com>
Sent: Thursday, October 24, 2019 10:28 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Rack - Datacenter Load Balancing relations

Hey Sergio,

Forgive but I’m at work and had to skim the info quickly.

When in doubt, simplify.  So 1 rack per DC.  Distributed systems get rapidly harder to reason about the more complicated you make them.  There’s more than enough to learn about C* without jumping into the complexity too soon.

To deal with the unbalancing issue, pay attention to Jon Haddad’s advice on vnode count and how to fairly distribute tokens with a small vnode count.  I’d rather point you to his information, as I haven’t dug into vnode counts and token distribution in detail; he’s got a lot more time in C* than I do.  I come at this more as a traditional RDBMS and Java guy who has slowly gotten up to speed on C* over the last few years, and dealt with DynamoDB a lot so have lived with a lot of similarity in data modelling concerns.  Detailed internals I only know in cases where I had reason to dig into C* source.
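
For concreteness, the knobs that advice maps to live in cassandra.yaml; a
sketch, assuming a 3.x-era cluster and an illustrative keyspace name:

    # cassandra.yaml (illustrative values)
    num_tokens: 4
    # steer token allocation so a low vnode count still balances for this keyspace's RF
    allocate_tokens_for_keyspace: my_keyspace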

There are so many knobs to turn in C* that it can be very easy to overthink things.  Simplify where you can.  Remove GC pressure wherever you can.  Negotiate with your consumers to have data models that make sense for C*.  If you have those three criteria foremost in mind, you’ll likely be fine for quite some time.  And in the times where something isn’t going well, simpler is easier to investigate.

R


Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
Thanks Reid!

I agree with all the things that you said!

Best,
Sergio

On Thu, Oct 24, 2019 at 9:25 AM Reid Pinchback <rpinchback@tripadvisor.com>
wrote:

> Two different AWS AZs are in two different physical locations.  Typically
> different cities.  Which means that you’re trying to manage the risk of an
> AZ going dark, so you use more than one AZ just in case.  The downside is
> that you will have some degree of network performance difference between
> AZs because of whatever WAN pipe AWS owns/leased to connect between them.
>
>
>
> Having a DC in one AZ is easy to reason about.  The AZ is there, or it is
> not.  If you have two DCs in your cluster, and you lose an AZ, it means you
> still have a functioning cluster with one DC and you still have quorum.
> Yay, even in an outage, you know you can still do business.  You would only
> have to route any traffic normally sent to the other DC to the remaining
> one, so as long as there is resource headroom planning in how you provision
> your hardware, you’re in a safe state.
>
>
>
> If you start splitting a DC across AZs without using racks to organize
> nodes on a per-AZ basis, off the top of my head I don’t know how you reason
> about your risks for losing quorum without pausing to really think through
> vnodes and token distribution and whatnot.  I’m not a fan of topologies I
> can’t reason about when paged at 3 in the morning and I’m half asleep.  I
> prefer simple until the workload motivates complex.
>
>
>
> R
>
>
>
>
>
>

Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Reid Pinchback <rp...@tripadvisor.com>.
Two different AWS AZs are in two different physical locations.  Typically different cities.  Which means that you’re trying to manage the risk of an AZ going dark, so you use more than one AZ just in case.  The downside is that you will have some degree of network performance difference between AZs because of whatever WAN pipe AWS owns/leased to connect between them.

Having a DC in one AZ is easy to reason about.  The AZ is there, or it is not.  If you have two DCs in your cluster, and you lose an AZ, it means you still have a functioning cluster with one DC and you still have quorum.  Yay, even in an outage, you know you can still do business.  You would only have to route any traffic normally sent to the other DC to the remaining one, so as long as there is resource headroom planning in how you provision your hardware, you’re in a safe state.
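
As a sketch of what that layout looks like at the keyspace level (the
keyspace and DC names here are hypothetical, one DC per AZ), each DC
carries a full set of replicas, so quorum reasoning stays local to
whichever DC survives:

    CREATE KEYSPACE app WITH replication = {
      'class': 'NetworkTopologyStrategy',
      'dc_az_a': 3,  -- DC pinned to the first AZ
      'dc_az_b': 3   -- DC pinned to the second AZ
    };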

If you start splitting a DC across AZs without using racks to organize nodes on a per-AZ basis, off the top of my head I don’t know how you reason about your risks for losing quorum without pausing to really think through vnodes and token distribution and whatnot.  I’m not a fan of topologies I can’t reason about when paged at 3 in the morning and I’m half asleep.  I prefer simple until the workload motivates complex.

R


From: Sergio <la...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Thursday, October 24, 2019 at 12:06 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Cassandra Rack - Datacenter Load Balancing relations

Thanks Reid and Jon!

Yes I will stick with one rack per DC for sure and I will look at the Vnodes problem later on.


What's the difference in terms of reliability between
A) spreading 2 datacenters across 3 AZs
B) having 2 datacenters in 2 separate AZs
?


Best,

Sergio




Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
Thanks Reid and Jon!

Yes I will stick with one rack per DC for sure and I will look at the
Vnodes problem later on.


What's the difference in terms of reliability between
A) spreading 2 datacenters across 3 AZs
B) having 2 datacenters in 2 separate AZs
?
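
For context, the per-node side of either layout is just snitch
configuration; a minimal sketch assuming GossipingPropertyFileSnitch, with
illustrative values:

    # conf/cassandra-rackdc.properties
    # each node gossips its datacenter and rack to the rest of the cluster
    dc=read
    rack=rack1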


Best,

Sergio


Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Reid Pinchback <rp...@tripadvisor.com>.
Hey Sergio,

Forgive me, but I’m at work and had to skim the info quickly.

When in doubt, simplify.  So 1 rack per DC.  Distributed systems get rapidly harder to reason about the more complicated you make them.  There’s more than enough to learn about C* without jumping into the complexity too soon.

To deal with the unbalancing issue, pay attention to Jon Haddad’s advice on vnode count and how to fairly distribute tokens with a small vnode count.  I’d rather point you to his information, as I haven’t dug into vnode counts and token distribution in detail; he’s got a lot more time in C* than I do.  I come at this more as a traditional RDBMS and Java guy who has slowly gotten up to speed on C* over the last few years, and dealt with DynamoDB a lot so have lived with a lot of similarity in data modelling concerns.  Detailed internals I only know in cases where I had reason to dig into C* source.
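
As a concrete sketch of what that advice maps to in cassandra.yaml (note that num_tokens only takes effect for nodes that have not yet bootstrapped, and the keyspace name below is just a placeholder):

# cassandra.yaml
num_tokens: 4
# With a small vnode count, have Cassandra choose tokens that even out
# ownership for a given keyspace's replication settings (Cassandra 3.x):
allocate_tokens_for_keyspace: my_keyspace
# Cassandra 4.0+ alternatively offers:
# allocate_tokens_for_local_replication_factor: 3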

There are so many knobs to turn in C* that it can be very easy to overthink things.  Simplify where you can.  Remove GC pressure wherever you can.  Negotiate with your consumers to have data models that make sense for C*.  If you have those three criteria foremost in mind, you’ll likely be fine for quite some time.  And in the times where something isn’t going well, simpler is easier to investigate.

R





Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
Hi Reid,

Thank you very much for clearing up these concepts for me.
https://community.datastax.com/comments/1133/view.html I posted this
question about our unbalanced cluster on the DataStax forum, and the reply
was that the *number of racks should be a multiple of the replication
factor* (or 1) in order for the cluster to be balanced. I thought then that
if I have 3 availability zones I should have 3 racks for each datacenter,
and not 2 (us-east-1b, us-east-1a) as I have right now; or, in the easiest
setup, one rack per datacenter.



Datacenter: live
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns  Host ID                               Rack
UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
So each rack label right now matches the availability zone, and we have 3
datacenters and 2 availability zones with 2 racks per DC, but the output
above is clearly unbalanced.
If I have a keyspace with replication factor = 3 and I want to minimize the
number of nodes needed to scale the cluster up and down while keeping it
balanced, should I consider an approach like the options below?
OPTION A)
Node  DC     RACK  AZ
1     read   ONE   us-east-1a
2     read   ONE   us-east-1a
3     read   ONE   us-east-1a
4     write  ONE   us-east-1b
5     write  ONE   us-east-1b
6     write  ONE   us-east-1b

OPTION B)
Node  DC     RACK  AZ
1     read   ONE   us-east-1a
2     read   ONE   us-east-1a
3     read   ONE   us-east-1a
4     write  TWO   us-east-1b
5     write  TWO   us-east-1b
6     write  TWO   us-east-1b
7     read   ONE   us-east-1c
8     write  TWO   us-east-1c
9     read   ONE   us-east-1c

Option B looks unbalanced, so I would exclude it.

OPTION C)
Node  DC     RACK  AZ
1     read   ONE   us-east-1a
2     read   ONE   us-east-1b
3     read   ONE   us-east-1c
4     write  TWO   us-east-1a
5     write  TWO   us-east-1b
6     write  TWO   us-east-1c


So I am thinking of OPTION A if I have the restriction of 2 AZs, but I
guess that OPTION C would be the best. If I have to add another DC for
reads, because we want to assign a new DC to each new microservice, it
would look like:

OPTION EXTRA DC FOR READS
Node  DC          RACK   AZ
1     read        ONE    us-east-1a
2     read        ONE    us-east-1b
3     read        ONE    us-east-1c
4     write       TWO    us-east-1a
5     write       TWO    us-east-1b
6     write       TWO    us-east-1c
7     extra-read  THREE  us-east-1a
8     extra-read  THREE  us-east-1b
9     extra-read  THREE  us-east-1c
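
For reference, this kind of read/write DC split is expressed as per-DC replication factors on the keyspace, along the lines of the sketch below (the keyspace name my_app is a placeholder; the DC names must match what the snitch reports):

CREATE KEYSPACE my_app WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'read':  3,
    'write': 3
};

Writes coordinated in either DC end up on replicas in both, which is what lets the read machines serve a full copy of the data.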
The *write* DC will replicate the data to the other datacenters. My goal is
to keep the *read* machines dedicated to serving reads and the *write*
machines to serving writes; Cassandra will handle the replication for me.
Is there any other option I am missing, or a wrong assumption? I am
thinking that I will write a blog post about all my learnings so far.
Thank you very much for the replies.

Best,

Sergio



Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Reid Pinchback <rp...@tripadvisor.com>.
No, that’s not correct.  The point of racks is to help you distribute the replicas, not further-replicate the replicas.  Data centers are what do the latter.  So for example, if you wanted to be able to ensure that you always had quorum if an AZ went down, then you could have two DCs where one was in each AZ, and use one rack in each DC.  In your situation I think I’d be more tempted to consider that.  Then if an AZ went away, you could fail over your traffic to the remaining DC and still be perfectly fine.

For background on replicas vs racks, I believe the information you want is under the heading ‘NetworkTopologyStrategy’ at:

http://cassandra.apache.org/doc/latest/architecture/dynamo.html

That should help you better understand how replicas distribute.
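
For instance, with one DC per AZ as described, the keyspace carries one replication entry per DC, something like this sketch (keyspace and DC names are illustrative):

CREATE KEYSPACE my_app WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc_in_az_a': 3,
    'dc_in_az_b': 3
};

Clients then read and write at LOCAL_QUORUM against their local DC, and fail over to the other DC if its AZ goes away.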

As mentioned before, while you can choose to do the reads in one DC, except for concerns about contention related to network traffic and connection handling, you can’t isolate reads from writes.  You can _mostly_ insulate the write DC from the activity within the read DC, and even that isn’t an absolute because of repairs.  However, your mileage may vary, so do what makes sense for your usage pattern.
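
(Concretely: a plain nodetool repair run on a node in the read DC will by default repair against replicas in the write DC as well; options such as -local / --in-local-dc confine a repair to one DC, at the cost of not reconciling across DCs.)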

R






Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Sergio <la...@gmail.com>.
Hi Reid,

Thanks for your reply. I really appreciate your explanation.

We are in AWS, and right now we are using 2 Availability Zones, not 3. We
found our cluster really unbalanced because the keyspace has a replication
factor of 3 while there are 2 racks across 2 datacenters.
We want the writes spread across all the nodes, but we wanted the reads
isolated from the writes, to keep the load on the read nodes low and to be
able to tell apart problems in the consumer (read) and producer (write)
applications.
It looks like each rack contains an entire copy of the data, so the data
would be replicated once per rack and then across the nodes within it. If I
am correct, a keyspace with 100 GB and replication factor = 3 and 3 racks
would give 100 * 3 * 3 = 900 GB. If I had only one rack across 2 or even 3
availability zones I would save space and have only 300 GB. Please correct
me if I am wrong.

Best,

Sergio




Re: Cassandra Rack - Datacenter Load Balancing relations

Posted by Reid Pinchback <rp...@tripadvisor.com>.
Datacenters and racks are different concepts.  While they don't have to be associated with their historical meanings, the historical meanings probably provide a helpful model for understanding what you want from them.

When companies own their own physical servers and have them housed somewhere, the questions arise on where you want to locate any particular server.  It's a balancing act on things like network speed of related servers being able to talk to each other, versus fault-tolerance of having many servers not all exposed to the same risks.  

"Same rack" in that physical world tended to mean something like "all behind the same network switch and all sharing the same power bus".  The morning after an electrical glitch fries a power bus and thus everything in that rack, you realize you wished you didn't have so many of the same type of server together.  Well, they were servers.  Now they are door stops.  Badness and sadness.  

That's the kind of mindset to bring to racks in Cassandra.  A rack is an artifact for you to separate servers into pools so that the disparate pools have hopefully somewhat independent infrastructure risks.  However, all those servers are still doing the same kind of work, running the same version, etc.

Datacenters are amalgams of those racks, and how similar or different they are from each other depends on what you want to do with them.  What is true is that if you have N datacenters, each one of them must have enough disk storage to house all the data.  The actual physical footprint of that data in each DC depends on the replication factors in play.
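
A worked example with the numbers from earlier in this thread: 100 GB of raw data in a keyspace with RF = 3 in each of two DCs means each DC holds roughly 100 GB * 3 = 300 GB, about 600 GB cluster-wide.  Splitting a DC into more racks changes where those three replicas are placed, not how many exist, so the rack count does not multiply the footprint.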

Note that you sorta can't have "one datacenter for writes" because the writes will replicate across the data centers.  You could definitely choose to have only one that takes read queries, but best to think of writing as being universal.  One scenario you can have is where the DC not taking live traffic read queries is the one you use for maintenance or performance testing or version upgrades.

One rack makes your life easier if you don't have a reason for multiple racks. It depends on the environment you deploy into and your fault tolerance goals.  If you were in AWS and wanting to spread risk across availability zones, then you would likely have as many racks as AZs you choose to be in, because that's really the point of using multiple AZs.
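
For instance, with GossipingPropertyFileSnitch each node declares its DC and rack in cassandra-rackdc.properties, typically mapping the rack to the node's AZ; a sketch for this cluster's "live" DC follows (on AWS, Ec2Snitch can instead derive the same mapping from instance metadata):

# cassandra-rackdc.properties on a node in us-east-1a
dc=live
rack=us-east-1a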

R

