Posted to user@cassandra.apache.org by Erick Ramirez <er...@datastax.com> on 2021/03/01 12:15:43 UTC

Re: Cassandra on arm aws instances

The instance types you refer to are contradictory so I'm not really sure if
this is really about Arm-based servers. The i3en-vs-r6 is not an
apples-for-apples comparison.

The R6g type is EBS-only so it will perform significantly worse than i3
instances. R6gd instances come with NVMe SSDs but they are
disproportionately small compared to the CPU+RAM they have. For example, an
r6gd.2xlarge, which has 8 cores + 64GB RAM, only has a 474GB NVMe SSD so
it's not good bang for the buck.

On the other hand, i3en instances are intended for dense storage. I'd
discourage you from choosing this type since it will be tempting to run
dense nodes, which are problematic when it comes to operations such as
bootstrapping, decommissions and running repairs. For example, an
i3en.2xlarge with 8 cores + 64GB RAM can potentially have 5TB of disks (2 x
2.5TB NVMe SSDs).

In my experience, i3 instances such as i3.2xlarge are the optimal choice. I
think 8 cores + 61GB RAM + 1.9TB NVMe SSD is the sweet spot for price and
performance. Cheers!
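
A minimal sketch of the storage-to-compute comparison made above, using only
the specs quoted in this message. The dictionary layout and the helper name
`nvme_gb_per_core` are illustrative choices, not anything from AWS tooling.

```python
# Specs exactly as quoted in the message above.
INSTANCES = {
    "r6gd.2xlarge": {"cores": 8, "ram_gb": 64, "nvme_gb": 474},
    "i3.2xlarge": {"cores": 8, "ram_gb": 61, "nvme_gb": 1900},
    "i3en.2xlarge": {"cores": 8, "ram_gb": 64, "nvme_gb": 5000},  # 2 x 2.5TB
}


def nvme_gb_per_core(spec: dict) -> float:
    """Local NVMe capacity available per core, in GB."""
    return spec["nvme_gb"] / spec["cores"]


for name, spec in INSTANCES.items():
    print(f"{name:14s} {nvme_gb_per_core(spec):7.2f} GB NVMe per core")
```

With the same core count, the r6gd.2xlarge has roughly a quarter of the local
disk per core of the i3.2xlarge, while the i3en.2xlarge has well over twice as
much, which is the density concern described above.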

Re: Cassandra on arm aws instances

Posted by Gil Ganz <gi...@gmail.com>.
I think the value of the r6gd (assuming cpu is good compared to intel) is
more cpu, not disk.
I'm not running Spark on the Cassandra servers, but having more CPU cores
in my cluster will surely help. It all depends on the workloads: some
workloads need more IO, some need more CPU.

i3 servers are great servers. They are currently our servers in production,
be it i3.4xlarge in one env or i3.2xlarge in another, but the thing is we
want to use fewer servers, and sadly there isn't an i3en.4xlarge server, so
I started to look for other options.

Take your example: 10.7k for 32 cores and 256GB memory vs 3.9k for 8 cores
and 60GB. You can clearly see you pay less for each core and each GB of
memory. Why do you say I will only use part of the power? My bottleneck
today is CPU; the more writes I have, the more CPU my servers use. All of
this assumes Arm will give me good performance, but that's another
discussion.
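
The per-unit arithmetic above can be sketched as follows. The yearly prices
are the ones quoted earlier in this thread ($10,714 and $3,839); the i3.2xlarge
memory is taken as 61GB, the figure given in the first message, and
`unit_costs` is a hypothetical helper name.

```python
# (yearly price in USD, cores, RAM in GB), as quoted in this thread.
QUOTED = {
    "r6gd.8xlarge": (10714, 32, 256),
    "i3.2xlarge": (3839, 8, 61),
}


def unit_costs(yearly_usd: float, cores: int, ram_gb: int) -> tuple:
    """Return (USD per core per year, USD per GB of RAM per year)."""
    return yearly_usd / cores, yearly_usd / ram_gb


for name, (price, cores, ram_gb) in QUOTED.items():
    per_core, per_gb = unit_costs(price, cores, ram_gb)
    print(f"{name:14s} ${per_core:8.2f}/core/yr  ${per_gb:6.2f}/GB RAM/yr")
```

This only compares list price per unit; it says nothing about how an Arm core
performs relative to an Intel core for a given workload.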

Regarding moving some of the data to EBS, the plan is to use symbolic links
for a specific set of tables. I've done it in the past; it's not ideal, but
if it allows me to save a lot of money by using a different server class, I
would do it.
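
A rough sketch of what the symlink approach could look like, under several
assumptions: the node is stopped while the directory is moved, table data
lives in a `<data dir>/<keyspace>/<table>-<id>` directory, and the EBS volume
is already mounted. The helper `relocate_table_dir` and every path in the demo
are hypothetical; this is not a Cassandra feature or tool.

```python
import os
import shutil
import tempfile
from pathlib import Path


def relocate_table_dir(table_dir: Path, ebs_root: Path) -> Path:
    """Move one table directory under ebs_root and leave a symlink behind."""
    table_dir = Path(table_dir)
    if table_dir.is_symlink() or not table_dir.is_dir():
        raise ValueError(f"{table_dir} is not a regular directory")
    # Mirror the <keyspace>/<table> layout on the EBS volume.
    target = Path(ebs_root).resolve() / table_dir.parent.name / table_dir.name
    if target.exists():
        raise FileExistsError(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(table_dir), str(target))
    # The original path now points at the relocated directory.
    os.symlink(target, table_dir, target_is_directory=True)
    return target


if __name__ == "__main__":
    # Demo in a throwaway directory; real data and mount paths would differ.
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp) / "data" / "my_keyspace" / "cold_table-1234"
        table.mkdir(parents=True)
        (table / "placeholder.db").write_text("sstable stand-in")
        relocate_table_dir(table, Path(tmp) / "ebs")
        print(table, "->", os.readlink(table))
```

Ownership, permissions and mount ordering at boot would also need attention
on a real node; those are outside the scope of this sketch.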


On Mon, Mar 1, 2021 at 11:20 PM Erick Ramirez <er...@datastax.com>
wrote:

>> it's not the same, notice I wrote r6gd, these are the ones with nvme, i'm
>> looking just at those.
>>
>
> I'm aware. I did use r6gd.2xlarge in my example. :)
>
>
>> I do not need all the space that i3en gives me (and probably won't be
>> able to use it all due to memory usage, or have other issues just like you
>> mention), so the plan is to use big enough r6gd nodes, such as
>> r6gd.8xlarge, it has 1.9tb nvme, it should be good enough for my needs
>>
>
> I feel like we have a disconnect here. :) You won't get value from the
> r6gd.8xlarge. You're paying for 32 cores + 256GB RAM which are mostly
> unusable to you unless you have a configuration where Spark is co-located
> with C* on the servers. It's the equivalent of using a truck to transport 2
> boxes when a car will suffice.
>
> From a dollar perspective, you're opting to pay $10,714/yr for
> an r6gd.8xlarge (I arbitrarily picked a standard 1-year term on the West
> Coast) versus $3,839/yr for an i3.2xlarge just because you want Arm but
> will end up using just a quarter (maybe half if I'm generous) of the
> compute power. It doesn't stack up for me. But YMMV. :)
>
>
>> (I would also add that a big chunk of the data is not read that
>> frequently, so I might be ok with putting a specific set of tables on EBS)
>>
>
> Interesting. How do you plan to configure that? Unless I'm mistaken, C*
> doesn't support tiered storage. Cheers!
>

Re: Cassandra on arm aws instances

Posted by Erick Ramirez <er...@datastax.com>.
>
> it's not the same, notice I wrote r6gd, these are the ones with nvme, i'm
> looking just at those.
>

I'm aware. I did use r6gd.2xlarge in my example. :)


> I do not need all the space that i3en gives me (and probably won't be able
> to use it all due to memory usage, or have other issues just like you
> mention), so the plan is to use big enough r6gd nodes, such as
> r6gd.8xlarge, it has 1.9tb nvme, it should be good enough for my needs
>

I feel like we have a disconnect here. :) You won't get value from the
r6gd.8xlarge. You're paying for 32 cores + 256GB RAM which are mostly
unusable to you unless you have a configuration where Spark is co-located
with C* on the servers. It's the equivalent of using a truck to transport 2
boxes when a car will suffice.

From a dollar perspective, you're opting to pay $10,714/yr for
an r6gd.8xlarge (I arbitrarily picked a standard 1-year term on the West
Coast) versus $3,839/yr for an i3.2xlarge just because you want Arm but
will end up using just a quarter (maybe half if I'm generous) of the
compute power. It doesn't stack up for me. But YMMV. :)
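
The utilization argument above can be put into numbers with a small sketch.
It uses only the two yearly prices and core counts quoted in this message and
assumes, purely for illustration, that the i3.2xlarge's 8 cores are fully
used. `cost_per_used_core` is a hypothetical helper name.

```python
def cost_per_used_core(yearly_usd: float, cores: int, utilization: float) -> float:
    """Yearly USD per core that actually does work, given a utilization in (0, 1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return yearly_usd / (cores * utilization)


R6GD_8XL = (10714, 32)  # yearly USD, cores
I3_2XL = (3839, 8)

# The "quarter" and "half" cases from the message above.
for used in (0.25, 0.5):
    print(f"r6gd.8xlarge at {used:.0%} used: "
          f"${cost_per_used_core(*R6GD_8XL, used):.2f}/core/yr")
print(f"i3.2xlarge at 100% used: ${cost_per_used_core(*I3_2XL, 1.0):.2f}/core/yr")

# Utilization at which both instance types cost the same per used core.
break_even = (R6GD_8XL[0] / R6GD_8XL[1]) / (I3_2XL[0] / I3_2XL[1])
print(f"break-even utilization for r6gd.8xlarge: {break_even:.1%}")
```

Under these assumptions the larger instance only comes out ahead per used
core once it is kept busy at roughly 70% or more, which is why the answer
depends on whether the workload is really CPU-bound.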


> (I would also add that a big chunk of the data is not read that
> frequently, so I might be ok with putting a specific set of tables on EBS)
>

Interesting. How do you plan to configure that? Unless I'm mistaken, C*
doesn't support tiered storage. Cheers!

Re: Cassandra on arm aws instances

Posted by Gil Ganz <gi...@gmail.com>.
It's not the same. Notice I wrote r6gd; these are the ones with NVMe, and
I'm looking just at those.
I do not need all the space that i3en gives me (and probably won't be able
to use it all due to memory usage, or have other issues just like you
mention), so the plan is to use big enough r6gd nodes, such as
r6gd.8xlarge. It has 1.9TB NVMe, which should be good enough for my needs (I
would also add that a big chunk of the data is not read that
frequently, so I might be OK with putting a specific set of tables on EBS).

We are currently running with i3 servers, as you mention, they are indeed
the sweet spot, but we are looking to move to bigger nodes.

On Mon, Mar 1, 2021 at 2:16 PM Erick Ramirez <er...@datastax.com>
wrote:

> The instance types you refer to are contradictory so I'm not really sure
> if this is really about Arm-based servers. The i3en-vs-r6 is not an
> apples-for-apples comparison.
>
> The R6g type is EBS-only so it will perform significantly worse than i3
> instances. R6gd instances come with NVMe SSDs but they are
> disproportionately small compared to the CPU+RAM they have. For example, an
> r6gd.2xlarge, which has 8 cores + 64GB RAM, only has a 474GB NVMe SSD so
> it's not good bang for the buck.
>
> On the other hand, i3en instances are intended for dense storage. I'd
> discourage you from choosing this type since it will be tempting to run
> dense nodes, which are problematic when it comes to operations such as
> bootstrapping, decommissions and running repairs. For example, an
> i3en.2xlarge with 8 cores + 64GB RAM can potentially have 5TB of disks (2 x
> 2.5TB NVMe SSDs).
>
> In my experience, i3 instances such as i3.2xlarge are the optimal choice.
> I think 8 cores + 61GB RAM + 1.9TB NVMe SSD is the sweet spot for price and
> performance. Cheers!
>