Posted to user@cassandra.apache.org by Eric Plowe <er...@gmail.com> on 2016/01/30 01:33:18 UTC

EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking
about going with ephemeral SSDs. The question is this: Should we put two in
RAID 0 or just go with one? We currently run a cluster in our data center
with two 250 GB Samsung 850 EVOs in RAID 0, and we are happy with the
performance we are seeing thus far.

Thanks!

Eric
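
For reference, a minimal sketch of the kind of setup being weighed here:
assembling two ephemeral devices into one RAID 0 array before pointing
Cassandra's data_file_directories at it. The device names (/dev/xvdb,
/dev/xvdc), filesystem, and mount point below are assumptions that vary by
instance type; this is illustrative, not a recipe taken from the thread.

    # Illustrative sketch (Python, run as root): stripe two ephemeral SSDs into
    # RAID 0 and mount the array for Cassandra data. Device names and the mount
    # point are assumptions; adjust for your instance type. Requires mdadm.
    import subprocess

    EPHEMERAL_DEVICES = ["/dev/xvdb", "/dev/xvdc"]  # assumed device names
    RAID_DEVICE = "/dev/md0"
    MOUNT_POINT = "/var/lib/cassandra"              # assumed data directory parent

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    # RAID 0 trades redundancy for throughput; Cassandra's own replication is
    # what is expected to provide the redundancy.
    run(["mdadm", "--create", RAID_DEVICE, "--level=0",
         "--raid-devices=%d" % len(EPHEMERAL_DEVICES)] + EPHEMERAL_DEVICES)
    run(["mkfs.ext4", RAID_DEVICE])
    run(["mkdir", "-p", MOUNT_POINT])
    run(["mount", "-o", "noatime", RAID_DEVICE, MOUNT_POINT])

With a single ephemeral disk the mdadm step simply disappears; the tradeoff
being discussed is the extra throughput of the stripe versus the operational
overhead of managing the array.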

Re: EC2 storage options for C*

Posted by Eric Plowe <er...@gmail.com>.
RAID 0 regardless of instance type*

On Friday, January 29, 2016, Eric Plowe <er...@gmail.com> wrote:

> Bryan,
>
> Correct, I should have clarified that. I'm evaluating instance types based
> on one SSD or two in RAID 0. I'm thinking it's going to be two in RAID 0,
> but as I've had no experience running a production C* cluster in EC2, I
> wanted to reach out to the list.
>
> Sorry for the half-baked question :)
>
> Eric
>
> On Friday, January 29, 2016, Bryan Cheng <bryan@blockcypher.com> wrote:
>
>> Do you have any idea what kind of disk performance you need?
>>
>> Cassandra with RAID 0 is a fairly common configuration (Al's awesome
>> tuning guide has a blurb on it
>> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html), so
>> if you feel comfortable with the operational overhead it seems like a solid
>> choice.
>>
>> To clarify, though,  by "just one", do you mean just using one of two
>> available ephemeral disks available to the instance, or are you evaluating
>> different instance types based on one disk vs two?
>>
>> On Fri, Jan 29, 2016 at 4:33 PM, Eric Plowe <er...@gmail.com> wrote:
>>
>>> My company is planning on rolling out a C* cluster in EC2. We are
>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>> the performance we are seeing thus far.
>>>
>>> Thanks!
>>>
>>> Eric
>>>
>>
>>

Re: EC2 storage options for C*

Posted by Eric Plowe <er...@gmail.com>.
Bryan,

Correct, I should have clarified that. I'm evaluating instance types based
on one SSD or two in RAID 0. I'm thinking it's going to be two in RAID 0, but
as I've had no experience running a production C* cluster in EC2, I wanted
to reach out to the list.

Sorry for the half-baked question :)

Eric

On Friday, January 29, 2016, Bryan Cheng <br...@blockcypher.com> wrote:

> Do you have any idea what kind of disk performance you need?
>
> Cassandra with RAID 0 is a fairly common configuration (Al's awesome
> tuning guide has a blurb on it
> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html), so if
> you feel comfortable with the operational overhead it seems like a solid
> choice.
>
> To clarify, though,  by "just one", do you mean just using one of two
> available ephemeral disks available to the instance, or are you evaluating
> different instance types based on one disk vs two?
>
> On Fri, Jan 29, 2016 at 4:33 PM, Eric Plowe <eric.plowe@gmail.com> wrote:
>
>> My company is planning on rolling out a C* cluster in EC2. We are
>> thinking about going with ephemeral SSDs. The question is this: Should we
>> put two in RAID 0 or just go with one? We currently run a cluster in our
>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>> the performance we are seeing thus far.
>>
>> Thanks!
>>
>> Eric
>>
>
>

Re: EC2 storage options for C*

Posted by Bryan Cheng <br...@blockcypher.com>.
Do you have any idea what kind of disk performance you need?

Cassandra with RAID 0 is a fairly common configuration (Al's awesome tuning
guide has a blurb on it
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html), so if
you feel comfortable with the operational overhead it seems like a solid
choice.

To clarify, though, by "just one", do you mean just using one of the two
ephemeral disks available to the instance, or are you evaluating
different instance types based on one disk vs. two?

On Fri, Jan 29, 2016 at 4:33 PM, Eric Plowe <er...@gmail.com> wrote:

> My company is planning on rolling out a C* cluster in EC2. We are thinking
> about going with ephemeral SSDs. The question is this: Should we put two in
> RAID 0 or just go with one? We currently run a cluster in our data center
> with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the
> performance we are seeing thus far.
>
> Thanks!
>
> Eric
>

Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
Oops... that was supposed to be "not a fan of video"! I have no problem
with the guys in the video!

-- Jack Krupansky

On Mon, Feb 1, 2016 at 8:51 AM, Jack Krupansky <ja...@gmail.com>
wrote:

> I'm not a fan of guy - this appears to be the slideshare corresponding to
> the video:
>
> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>
> My apologies if my questions are actually answered on the video or slides,
> I just did a quick scan of the slide text.
>
> I'm curious where the EBS physical devices actually reside - are they in
> the same rack, the same data center, same availability zone? I mean, people
> try to minimize network latency between nodes, so how exactly is EBS able
> to avoid network latency?
>
> Did your test use Amazon EBS–Optimized Instances?
>
> SSD or magnetic or does it make any difference?
>
> What info is available on EBS performance at peak times, when multiple AWS
> customers have spikes of demand?
>
> Is RAID much of a factor or help at all using EBS?
>
> How exactly is EBS provisioned in terms of its own HA - I mean, with a
> properly configured Cassandra cluster RF provides HA, so what is the
> equivalent for EBS? If I have RF=3, what assurance is there that those
> three EBS volumes aren't all in the same physical rack?
>
> For multi-data center operation, what configuration options assure that
> the EBS volumes for each DC are truly physically separated?
>
> In terms of syncing data for the commit log, if the OS call to sync an EBS
> volume returns, is the commit log data absolutely 100% synced at the
> hardware level on the EBS end, such that a power failure of the systems on
> which the EBS volumes reside will still guarantee availability of the
> fsynced data. As well, is return from fsync an absolute guarantee of
> sstable durability when Cassandra is about to delete the commit log,
> including when the two are on different volumes? In practice, we would like
> some significant degree of pipelining of data, such as during the full
> processing of flushing memtables, but for the fsync at the end a solid
> guarantee is needed.
>
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:
>
>> Jeff,
>>
>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>> discounting EBS, but prior outages are worrisome.
>>
>>
>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> Free to choose what you'd like, but EBS outages were also addressed in
>>> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
>>> same as 2011 EBS.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>
>>> Thank you all for the suggestions. I'm torn between GP2 and ephemeral.
>>> After testing, GP2 is a viable contender for our workload. The only worry I
>>> have is EBS outages, which have happened.
>>>
>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> Also in that video - it's long but worth watching
>>>>
>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>> ensure we weren't "just" reading from memory
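
One common way to blow out the page cache on Linux between benchmark runs, so
that reads hit the storage layer instead of memory, is to drop it explicitly.
A rough sketch follows; it is not necessarily the method used in the talk, and
it requires root.

    # Drop the Linux page cache so subsequent reads hit the disks, not memory.
    # Not necessarily the method used in the talk; requires root.
    import subprocess

    subprocess.check_call(["sync"])          # flush dirty pages first
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")                       # 3 = pagecache + dentries + inodes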
>>>>
>>>>
>>>>
>>>> --
>>>> Jeff Jirsa
>>>>
>>>>
>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>>>> wrote:
>>>>
>>>> How about reads? Any differences between read-intensive and
>>>> write-intensive workloads?
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>> > wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>> necessary.
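
One simple way to check how far a node sits below a volume's IOPS ceiling is to
sample /proc/diskstats (or just watch iostat). A rough sketch of the former; the
device name is an assumption and will differ per instance:

    # Rough sketch: approximate read+write IOPS for one block device by sampling
    # /proc/diskstats twice. The device name is an assumption (EBS volumes often
    # show up as xvdf, nvme1n1, etc.).
    import time

    DEVICE = "xvdf"  # assumed device name

    def completed_ios(device):
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == device:
                    # field 4 = reads completed, field 8 = writes completed
                    return int(fields[3]) + int(fields[7])
        raise ValueError("device not found: %s" % device)

    before = completed_ios(DEVICE)
    time.sleep(10)
    after = completed_ios(DEVICE)
    print("approx IOPS over a 10s window:", (after - before) / 10.0)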
>>>>>
>>>>>
>>>>>
>>>>> From: John Wong
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: Re: EC2 storage options for C*
>>>>>
>>>>> For production I'd stick with ephemeral disks (aka instance storage)
>>>>> if you are running a lot of transactions.
>>>>> However, for a regular small testing/QA cluster, or something you know
>>>>> you want to reload often, EBS is definitely good enough, and we haven't had
>>>>> issues 99% of the time. The 1% is a kind of anomaly where we had flushes blocked.
>>>>>
>>>>> But Jeff, kudos that you are able to use EBS. I didn't go through the
>>>>> video; do you actually use PIOPS or just standard GP2 in your production
>>>>> cluster?
>>>>>
>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>>> wrote:
>>>>>
>>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>>> disk performance you need?". If you need the performance, it's hard to beat
>>>>>> ephemeral SSD in RAID 0 on EC2, and it's a solid, battle-tested
>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>
>>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>
>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>
>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>
>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Eric Plowe
>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>> To: "user@cassandra.apache.org"
>>>>>>> Subject: EC2 storage options for C*
>>>>>>>
>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>> the performance we are seeing thus far.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>

Re: EC2 storage options for C*

Posted by Steve Robenalt <sr...@highwire.org>.
Yeah, I would only start a previous-generation instance for non-production
purposes at this point and I suspect that many of them will be retired
sooner or later.

Given a choice, I'd use i2 instances for most purposes, and use the d2
instances where storage volume and access patterns demanded it.

I have seen reports of other instances running SSD-backed EBS. Not sure if
the network lag is a significant issue in that case or not, but it would
open up other options for Cassandra clusters.

On Mon, Feb 1, 2016 at 1:55 PM, Jack Krupansky <ja...@gmail.com>
wrote:

> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
> Dense Storage".
>
> The remaining question is whether any of the "Previous Generation
> Instances" should be publicly recommended going forward.
>
> And whether non-SSD instances should be recommended going forward as well.
> Sure, technically, someone could use the legacy instances, but the question
> is what we should be recommending as best practice going forward.
>
> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
> wrote:
>
>> Hi Jack,
>>
>> At the bottom of the instance-types page, there is a link to the previous
>> generations, which includes the older series (m1, m2, etc), many of which
>> have HDD options.
>>
>> There are also the d2 (Dense Storage) instances in the current generation
>> that include various combos of local HDDs.
>>
>> The i2 series has good sized SSDs available, and has the advanced
>> networking option, which is also useful for Cassandra. The enhanced
>> networking is available with other instance types as well, as you'll see on
>> the feature list under each type.
>>
>> Steve
>>
>>
>>
>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com>
>> wrote:
>>
>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>> instances have local magnetic storage - all the other instance types are
>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>> Access."
>>>
>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>> instance types.
>>>
>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2
>>> only for the "small to medium databases" use case.
>>>
>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
>>> the doc simply for any newly started instances?
>>>
>>> See:
>>> https://aws.amazon.com/ec2/instance-types/
>>> http://aws.amazon.com/ebs/details/
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> > My apologies if my questions are actually answered on the video or
>>>> slides, I just did a quick scan of the slide text.
>>>>
>>>> Virtually all of them are covered.
>>>>
>>>> > I'm curious where the EBS physical devices actually reside - are they
>>>> in the same rack, the same data center, same availability zone? I mean,
>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>> able to avoid network latency?
>>>>
>>>> Not published, and probably not a straightforward answer (they probably have
>>>> redundancy cross-AZ, if it matches some of their other published
>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>> Some instance types are optimized for dedicated, ebs-only network
>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>
>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>
>>>> We tested dozens of instance type/size combinations (literally). The
>>>> best performance was clearly with ebs-optimized instances that also have
>>>> enhanced networking (c4, m4, etc) - slide 43
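
A quick way to sanity-check this on an existing cluster is to ask the EC2 API
whether each node is actually EBS-optimized. A rough boto3 sketch; the region
and instance ID are placeholders, and AWS credentials are assumed to be
configured:

    # Rough sketch: report instance type and the EBS-optimized flag for a node.
    # Region and instance ID are placeholders; assumes configured AWS credentials.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.describe_instances(InstanceIds=["i-0123456789abcdef0"])

    for reservation in resp["Reservations"]:
        for inst in reservation["Instances"]:
            print(inst["InstanceId"], inst["InstanceType"],
                  "ebs_optimized=%s" % inst["EbsOptimized"])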
>>>>
>>>> > SSD or magnetic or does it make any difference?
>>>>
>>>> SSD, GP2 (slide 64)
>>>>
>>>> > What info is available on EBS performance at peak times, when
>>>> multiple AWS customers have spikes of demand?
>>>>
>>>> Not published, but experiments show that we can hit 10k iops all day
>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>> real cluster (slide 58)
>>>>
>>>> > Is RAID much of a factor or help at all using EBS?
>>>>
>>>> You can use RAID to get higher IOPS than you’d normally get by default
>>>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
>>>> than 10k, you can stripe volumes together up to the ebs network link max)
>>>> (hinted at in slide 64)
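
To make the arithmetic behind that concrete: gp2 at the time of this thread
delivered a baseline of 3 IOPS per GiB, capped at 10,000 IOPS per volume, so a
roughly 3,334 GiB volume (the "3.333T" above) hits the cap, and striping N such
volumes in RAID 0 multiplies the ceiling until the instance's EBS network link
becomes the limit. A small sketch of that math (check current AWS documentation
before relying on these numbers):

    # Back-of-the-envelope gp2 math using the limits discussed in this thread:
    # 3 IOPS per GiB baseline, 100 IOPS floor, 10,000 IOPS per-volume cap.
    # These were the gp2 limits circa 2016; verify against current AWS docs.
    def gp2_baseline_iops(size_gib):
        return min(10000, max(100, 3 * size_gib))

    def volumes_for_target_iops(target_iops, size_gib):
        """Number of gp2 volumes of a given size to stripe (RAID 0) for a target
        IOPS, ignoring the instance's EBS network-link ceiling."""
        per_volume = gp2_baseline_iops(size_gib)
        return -(-target_iops // per_volume)  # ceiling division

    print(gp2_baseline_iops(3334))                 # 10000 -> the "3.333T" volume
    print(volumes_for_target_iops(30000, 3334))    # 3 volumes striped together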
>>>>
>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>> three EBS volumes aren't all in the same physical rack?
>>>>
>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>> The volume-specific issues seem to be less common than the instance-store
>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>> AWS region or cloud vendor.
>>>>
>>>> > For multi-data center operation, what configuration options assure
>>>> that the EBS volumes for each DC are truly physically separated?
>>>>
>>>> It used to be true that the EBS control plane for a given region spanned
>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>
>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>> which the EBS volumes reside will still guarantee availability of the
>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>> sstable durability when Cassandra is about to delete the commit log,
>>>> including when the two are on different volumes? In practice, we would like
>>>> some significant degree of pipelining of data, such as during the full
>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>> guarantee is needed.
>>>>
>>>> Most of the answers in this block are “probably not 100%, you should be
>>>> writing to more than one host/AZ/DC/vendor to protect your organization
>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>> those goals (at least based on the petabytes of data we have on gp2
>>>> volumes).
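
For readers less familiar with what "the OS call to sync" means here: at the OS
level, durability is a successful fsync() on the file (and, for a newly created
file, on its parent directory). The sketch below only illustrates that OS-level
step; it is not Cassandra's commit log code, and whether the underlying device
(local SSD or EBS) has truly persisted the bytes once fsync returns is exactly
the question being raised above.

    # Illustration of the fsync step under discussion; not Cassandra's code.
    import os

    def write_segment_durably(path, payload):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            os.write(fd, payload)
            os.fsync(fd)        # ask the OS/device to persist the file's data
        finally:
            os.close(fd)
        # fsync the parent directory so the new directory entry is durable too
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)

    write_segment_durably("/tmp/commitlog-segment-demo.log", b"mutation bytes\n")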
>>>>
>>>>
>>>>
>>>> From: Jack Krupansky
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>>
>>>> I'm not a fan of guy - this appears to be the slideshare corresponding
>>>> to the video:
>>>>
>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>
>>>> My apologies if my questions are actually answered on the video or
>>>> slides, I just did a quick scan of the slide text.
>>>>
>>>> I'm curious where the EBS physical devices actually reside - are they
>>>> in the same rack, the same data center, same availability zone? I mean,
>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>> able to avoid network latency?
>>>>
>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>
>>>> SSD or magnetic or does it make any difference?
>>>>
>>>> What info is available on EBS performance at peak times, when multiple
>>>> AWS customers have spikes of demand?
>>>>
>>>> Is RAID much of a factor or help at all using EBS?
>>>>
>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>>> properly configured Cassandra cluster RF provides HA, so what is the
>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>> three EBS volumes aren't all in the same physical rack?
>>>>
>>>> For multi-data center operation, what configuration options assure that
>>>> the EBS volumes for each DC are truly physically separated?
>>>>
>>>> In terms of syncing data for the commit log, if the OS call to sync an
>>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>> which the EBS volumes reside will still guarantee availability of the
>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>> sstable durability when Cassandra is about to delete the commit log,
>>>> including when the two are on different volumes? In practice, we would like
>>>> some significant degree of pipelining of data, such as during the full
>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>> guarantee is needed.
>>>>
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>> wrote:
>>>>
>>>>> Jeff,
>>>>>
>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>> discounting EBS, but prior outages are worrisome.
>>>>>
>>>>>
>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>> wrote:
>>>>>
>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>> the same as 2011 EBS.
>>>>>>
>>>>>> --
>>>>>> Jeff Jirsa
>>>>>>
>>>>>>
>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>>>>
>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral.
>>>>>> GP2 after testing is a viable contender for our workload. The only worry I
>>>>>> have is EBS outages, which have happened.
>>>>>>
>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Also in that video - it's long but worth watching
>>>>>>>
>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Jirsa
>>>>>>>
>>>>>>>
>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>
>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>> write-intensive workloads?
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>> necessary.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> From: John Wong
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>
>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>
>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>> production cluster?
>>>>>>>>
>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <bryan@blockcypher.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>> of disk performance you need?". If you need the performance, its hard to
>>>>>>>>> beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>
>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>
>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>>
>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: Eric Plowe
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Eric
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>
>>
>> --
>> Steve Robenalt
>> Software Architect
>> srobenalt@highwire.org <bz...@highwire.org>
>> (office/cell): 916-505-1785
>>
>> HighWire Press, Inc.
>> 425 Broadway St, Redwood City, CA 94063
>> www.highwire.org
>>
>> Technology for Scholarly Communication
>>
>
>


-- 
Steve Robenalt
Software Architect
srobenalt@highwire.org <bz...@highwire.org>
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication

Re: EC2 storage options for C*

Posted by Steve Robenalt <sr...@highwire.org>.
Hi Jeff,

I'm going to go back and review your presentation. I missed it at Cassandra
Summit and didn't make it to re:Invent last year. The opinion I voiced was
from my own direct experience. Didn't mean to imply that there weren't
other good options available.

Thanks,
Steve

On Mon, Feb 1, 2016 at 2:12 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> A lot of people use the old gen instances (m1 in particular) because they
> came with a ton of effectively free ephemeral storage (up to 1.6TB).
> Whether or not they’re viable is a decision for each user to make. They’re
> very, very commonly used for C*, though. At a time when EBS was not
> sufficiently robust or reliable, a cluster of m1 instances was the de facto
> standard.
>
> The canonical “best practice” in 2015 was i2. We believe we’ve made a
> compelling argument to use m4 or c4 instead of i2. There is a company we
> know of that is currently testing d2 at scale, though I’m not sure they have much
> in terms of concrete results at this time.
>
> - Jeff
>
> From: Jack Krupansky
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, February 1, 2016 at 1:55 PM
>
> To: "user@cassandra.apache.org"
> Subject: Re: EC2 storage options for C*
>
> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
> Dense Storage".
>
> The remaining question is whether any of the "Previous Generation
> Instances" should be publicly recommended going forward.
>
> And whether non-SSD instances should be recommended going forward as well.
> sure, technically, someone could use the legacy instances, but the question
> is what we should be recommending as best practice going forward.
>
> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
> wrote:
>
>> Hi Jack,
>>
>> At the bottom of the instance-types page, there is a link to the previous
>> generations, which includes the older series (m1, m2, etc), many of which
>> have HDD options.
>>
>> There are also the d2 (Dense Storage) instances in the current generation
>> that include various combos of local HDDs.
>>
>> The i2 series has good sized SSDs available, and has the advanced
>> networking option, which is also useful for Cassandra. The enhanced
>> networking is available with other instance types as well, as you'll see on
>> the feature list under each type.
>>
>> Steve
>>
>>
>>
>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com>
>> wrote:
>>
>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>> instances have local magnetic storage - all the other instance types are
>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>> Access."
>>>
>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>> instance types.
>>>
>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2
>>> only for the "small to medium databases" use case.
>>>
>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
>>> the doc simply for any newly started instances?
>>>
>>> See:
>>> https://aws.amazon.com/ec2/instance-types/
>>> http://aws.amazon.com/ebs/details/
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> > My apologies if my questions are actually answered on the video or
>>>> slides, I just did a quick scan of the slide text.
>>>>
>>>> Virtually all of them are covered.
>>>>
>>>> > I'm curious where the EBS physical devices actually reside - are they
>>>> in the same rack, the same data center, same availability zone? I mean,
>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>> able to avoid network latency?
>>>>
>>>> Not published,and probably not a straight forward answer (probably have
>>>> redundancy cross-az, if it matches some of their other published
>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>> Some instance types are optimized for dedicated, ebs-only network
>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>
>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>
>>>> We tested dozens of instance type/size combinations (literally). The
>>>> best performance was clearly with ebs-optimized instances that also have
>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>
>>>> > SSD or magnetic or does it make any difference?
>>>>
>>>> SSD, GP2 (slide 64)
>>>>
>>>> > What info is available on EBS performance at peak times, when
>>>> multiple AWS customers have spikes of demand?
>>>>
>>>> Not published, but experiments show that we can hit 10k iops all day
>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>> real cluster (slide 58)
>>>>
>>>> > Is RAID much of a factor or help at all using EBS?
>>>>
>>>> You can use RAID to get higher IOPS than you’d normally get by default
>>>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
>>>> than 10k, you can stripe volumes together up to the ebs network link max)
>>>> (hinted at in slide 64)
>>>>
>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>> three EBS volumes aren't all in the same physical rack?
>>>>
>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>> The volume-specific issues seem to be less common than the instance-store
>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>> AWS region or cloud vendor.
>>>>
>>>> > For multi-data center operation, what configuration options assure
>>>> that the EBS volumes for each DC are truly physically separated?
>>>>
>>>> It used to be true that EBS control plane for a given region spanned
>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>
>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>> which the EBS volumes reside will still guarantee availability of the
>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>> sstable durability when Cassandra is about to delete the commit log,
>>>> including when the two are on different volumes? In practice, we would like
>>>> some significant degree of pipelining of data, such as during the full
>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>> guarantee is needed.
>>>>
>>>> Most of the answers in this block are “probably not 100%, you should be
>>>> writing to more than one host/AZ/DC/vendor to protect your organization
>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>> those goals (at least based with the petabytes of data we have on gp2
>>>> volumes).
>>>>
>>>>
>>>>
>>>> From: Jack Krupansky
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>>
>>>> I'm not a fan of guy - this appears to be the slideshare corresponding
>>>> to the video:
>>>>
>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>
>>>> My apologies if my questions are actually answered on the video or
>>>> slides, I just did a quick scan of the slide text.
>>>>
>>>> I'm curious where the EBS physical devices actually reside - are they
>>>> in the same rack, the same data center, same availability zone? I mean,
>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>> able to avoid network latency?
>>>>
>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>
>>>> SSD or magnetic or does it make any difference?
>>>>
>>>> What info is available on EBS performance at peak times, when multiple
>>>> AWS customers have spikes of demand?
>>>>
>>>> Is RAID much of a factor or help at all using EBS?
>>>>
>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>>> properly configured Cassandra cluster RF provides HA, so what is the
>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>> three EBS volumes aren't all in the same physical rack?
>>>>
>>>> For multi-data center operation, what configuration options assure that
>>>> the EBS volumes for each DC are truly physically separated?
>>>>
>>>> In terms of syncing data for the commit log, if the OS call to sync an
>>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>> which the EBS volumes reside will still guarantee availability of the
>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>> sstable durability when Cassandra is about to delete the commit log,
>>>> including when the two are on different volumes? In practice, we would like
>>>> some significant degree of pipelining of data, such as during the full
>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>> guarantee is needed.
>>>>
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>> wrote:
>>>>
>>>>> Jeff,
>>>>>
>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>> discounting EBS, but prior outages are worrisome.
>>>>>
>>>>>
>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>> wrote:
>>>>>
>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>> the same as 2011 EBS.
>>>>>>
>>>>>> --
>>>>>> Jeff Jirsa
>>>>>>
>>>>>>
>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>>>>
>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral.
>>>>>> GP2 after testing is a viable contender for our workload. The only worry I
>>>>>> have is EBS outages, which have happened.
>>>>>>
>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Also in that video - it's long but worth watching
>>>>>>>
>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Jirsa
>>>>>>>
>>>>>>>
>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>
>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>> write-intensive workloads?
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>> necessary.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> From: John Wong
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>
>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>
>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>> production cluster?
>>>>>>>>
>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <bryan@blockcypher.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>> of disk performance you need?". If you need the performance, its hard to
>>>>>>>>> beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>
>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>
>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>>
>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: Eric Plowe
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Eric
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>
>>
>> --
>> Steve Robenalt
>> Software Architect
>> srobenalt@highwire.org <bz...@highwire.org>
>> (office/cell): 916-505-1785
>>
>> HighWire Press, Inc.
>> 425 Broadway St, Redwood City, CA 94063
>> www.highwire.org
>>
>> Technology for Scholarly Communication
>>
>
>


-- 
Steve Robenalt
Software Architect
srobenalt@highwire.org <bz...@highwire.org>
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication

Re: EC2 storage options for C*

Posted by Will Hayworth <wh...@atlassian.com>.
We're using GP2 EBS (3 TB volumes) with m4.xlarges after originally looking
at I2 and D2 instances (thanks, Jeff, for your advice with that one). So
far, so good. (Our workload is write-heavy at the moment but reads are
steadily increasing.)

___________________________________________________________
Will Hayworth
Developer, Engagement Engine
My pronoun is "they". <http://pronoun.is/they>



On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com> wrote:

> For what it's worth, we've tried d2 instances and they encourage terrible
> things like super-dense nodes (which increases your replacement time). In terms
> of usable storage I would go with gp2 EBS on an m4-based instance.
>
> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> Ah, yes, the good old days of m1.large.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> A lot of people use the old gen instances (m1 in particular) because
>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>> Whether or not they’re viable is a decision for each user to make. They’re
>>> very, very commonly used for C*, though. At a time when EBS was not
>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>> standard.
>>>
>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>> we know currently testing d2 at scale, though I’m not sure they have much
>>> in terms of concrete results at this time.
>>>
>>> - Jeff
>>>
>>> From: Jack Krupansky
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>
>>> To: "user@cassandra.apache.org"
>>> Subject: Re: EC2 storage options for C*
>>>
>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>> Dense Storage".
>>>
>>> The remaining question is whether any of the "Previous Generation
>>> Instances" should be publicly recommended going forward.
>>>
>>> And whether non-SSD instances should be recommended going forward as
>>> well. sure, technically, someone could use the legacy instances, but the
>>> question is what we should be recommending as best practice going forward.
>>>
>>> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> At the bottom of the instance-types page, there is a link to the
>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>> of which have HDD options.
>>>>
>>>> There are also the d2 (Dense Storage) instances in the current
>>>> generation that include various combos of local HDDs.
>>>>
>>>> The i2 series has good sized SSDs available, and has the advanced
>>>> networking option, which is also useful for Cassandra. The enhanced
>>>> networking is available with other instance types as well, as you'll see on
>>>> the feature list under each type.
>>>>
>>>> Steve
>>>>
>>>>
>>>>
>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>> jack.krupansky@gmail.com> wrote:
>>>>
>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>>> instances have local magnetic storage - all the other instance types are
>>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>>>> Access."
>>>>>
>>>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>>>> instance types.
>>>>>
>>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and
>>>>> gp2 only for the "small to medium databases" use case.
>>>>>
>>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)?
>>>>> Is the doc simply for any newly started instances?
>>>>>
>>>>> See:
>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>> http://aws.amazon.com/ebs/details/
>>>>>
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>>> > wrote:
>>>>>
>>>>>> > My apologies if my questions are actually answered on the video or
>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> Virtually all of them are covered.
>>>>>>
>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>> is EBS able to avoid network latency?
>>>>>>
>>>>>> Not published,and probably not a straight forward answer (probably
>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>>
>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>
>>>>>> We tested dozens of instance type/size combinations (literally). The
>>>>>> best performance was clearly with ebs-optimized instances that also have
>>>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>>>
>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> SSD, GP2 (slide 64)
>>>>>>
>>>>>> > What info is available on EBS performance at peak times, when
>>>>>> multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Not published, but experiments show that we can hit 10k iops all day
>>>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>>>> real cluster (slide 58)
>>>>>>
>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you
>>>>>> need more than 10k, you can stripe volumes together up to the ebs network
>>>>>> link max) (hinted at in slide 64)
>>>>>>
>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>>>> The volume-specific issues seem to be less common than the instance-store
>>>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>>>> AWS region or cloud vendor.
>>>>>>
>>>>>> > For multi-data center operation, what configuration options assure
>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> It used to be true that EBS control plane for a given region spanned
>>>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>>>
>>>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>> guarantee is needed.
>>>>>>
>>>>>> Most of the answers in this block are “probably not 100%, you should
>>>>>> be writing to more than one host/AZ/DC/vendor to protect your organization
>>>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>>>> those goals (at least based with the petabytes of data we have on gp2
>>>>>> volumes).
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Jack Krupansky
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> I'm not a fan of guy - this appears to be the slideshare
>>>>>> corresponding to the video:
>>>>>>
>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>
>>>>>> My apologies if my questions are actually answered on the video or
>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> I'm curious where the EBS physical devices actually reside - are they
>>>>>> in the same rack, the same data center, same availability zone? I mean,
>>>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>>>> able to avoid network latency?
>>>>>>
>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>
>>>>>> SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> What info is available on EBS performance at peak times, when
>>>>>> multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> For multi-data center operation, what configuration options assure
>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> In terms of syncing data for the commit log, if the OS call to sync
>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>> guarantee is needed.
>>>>>>
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Jeff,
>>>>>>>
>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>
>>>>>>>
>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>>>> the same as 2011 EBS.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Jirsa
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>>
>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>
>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jeff Jirsa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>> write-intensive workloads?
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>>> necessary.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: John Wong
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>>>
>>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>> production cluster?
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>>>> of disk performance you need?". If you need the performance, its hard to
>>>>>>>>>>> beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>
>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>
>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>>> otherwise.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>
>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is this: Should
>>>>>>>>>>>> we put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Eric
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Steve Robenalt
>>>> Software Architect
>>>> srobenalt@highwire.org <bz...@highwire.org>
>>>> (office/cell): 916-505-1785
>>>>
>>>> HighWire Press, Inc.
>>>> 425 Broadway St, Redwood City, CA 94063
>>>> www.highwire.org
>>>>
>>>> Technology for Scholarly Communication
>>>>
>>>
>>>
>> --
> Ben Bromhead
> CTO | Instaclustr
> +1 650 284 9692
>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
I don’t want to be “that guy”, but there are literally almost a dozen emails in this thread answering exactly that question. Did you read the thread to which you replied?


From:  James Rothering
Reply-To:  "user@cassandra.apache.org"
Date:  Wednesday, February 3, 2016 at 4:09 PM
To:  "user@cassandra.apache.org"
Subject:  Re: EC2 storage options for C*

Just curious here ... when did EBS become OK for C*? Didn't they always push towards using ephemeral disks?

On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com> wrote:
For what it's worth, we've tried d2 instances and they encourage terrible things like super-dense nodes (which increases your replacement time). In terms of usable storage I would go with gp2 EBS on an m4-based instance.
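
As a rough illustration of the replacement-time concern (the 100 MB/s effective streaming rate below is purely an assumed figure for the arithmetic, not a measurement):

    # Back-of-the-envelope node replacement time: data per node / streaming rate.
    # The streaming rate is an assumption, not a benchmark.
    def replacement_hours(data_per_node_tb, stream_mb_per_s):
        data_mb = data_per_node_tb * 1024 * 1024  # TB -> MB, binary units
        return data_mb / stream_mb_per_s / 3600

    for tb in (1, 4, 10):  # typical nodes vs. "super dense" d2-style nodes
        print(f"{tb} TB/node at 100 MB/s ~= {replacement_hours(tb, 100):.1f} hours to restream")

Denser nodes take proportionally longer to bootstrap, repair and rebalance, which is the operational cost being alluded to here.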

On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com> wrote:
Ah, yes, the good old days of m1.large.

-- Jack Krupansky

On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
A lot of people use the old gen instances (m1 in particular) because they came with a ton of effectively free ephemeral storage (up to 1.6TB). Whether or not they’re viable is a decision for each user to make. They’re very, very commonly used for C*, though. At a time when EBS was not sufficiently robust or reliable, a cluster of m1 instances was the de facto standard. 

The canonical “best practice” in 2015 was i2. We believe we’ve made a compelling argument to use m4 or c4 instead of i2. There exists a company we know currently testing d2 at scale, though I’m not sure they have much in terms of concrete results at this time. 

- Jeff

From: Jack Krupansky
Reply-To: "user@cassandra.apache.org"
Date: Monday, February 1, 2016 at 1:55 PM 

To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2 Dense Storage". 

The remaining question is whether any of the "Previous Generation Instances" should be publicly recommended going forward.

And whether non-SSD instances should be recommended going forward as well. Sure, technically, someone could use the legacy instances, but the question is what we should be recommending as best practice going forward.

Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.

-- Jack Krupansky

On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org> wrote:
Hi Jack, 

At the bottom of the instance-types page, there is a link to the previous generations, which includes the older series (m1, m2, etc), many of which have HDD options. 

There are also the d2 (Dense Storage) instances in the current generation that include various combos of local HDDs.

The i2 series has good sized SSDs available, and has the advanced networking option, which is also useful for Cassandra. The enhanced networking is available with other instance types as well, as you'll see on the feature list under each type. 

Steve



On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com> wrote:
Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic question, it seems like magnetic (HDD) is no longer a recommended storage option for databases on AWS. In particular, only the C2 Dense Storage instances have local magnetic storage - all the other instance types are SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data Access." 

For the record, that AWS doc has Cassandra listed as a use case for i2 instance types.

Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2 only for the "small to medium databases" use case.

Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is the doc simply for any newly started instances?

See:
https://aws.amazon.com/ec2/instance-types/
http://aws.amazon.com/ebs/details/


-- Jack Krupansky

On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
> My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.

Virtually all of them are covered.

> I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?

Not published, and probably not a straightforward answer (they probably have redundancy cross-AZ, if it matches some of their other published behaviors). The promise they give you is ‘iops’, with a certain block size. Some instance types are optimized for dedicated, EBS-only network interfaces. Like most things in Cassandra / cloud, the only way to know for sure is to test it yourself and see if observed latency is acceptable (or trust our testing, if you assume we’re sufficiently smart and honest). 

> Did your test use Amazon EBS–Optimized Instances?

We tested dozens of instance type/size combinations (literally). The best performance was clearly with ebs-optimized instances that also have enhanced networking (c4, m4, etc) - slide 43

> SSD or magnetic or does it make any difference?

SSD, GP2 (slide 64)

> What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

Not published, but experiments show that we can hit 10k iops all day every day with only trivial noisy neighbor problems, not enough to impact a real cluster (slide 58)

> Is RAID much of a factor or help at all using EBS?

You can use RAID to get higher IOPS than you’d normally get by default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more than 10k, you can stripe volumes together up to the ebs network link max) (hinted at in slide 64)
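
For reference, the GP2 sizing arithmetic quoted above (the 2016-era figures of roughly 3 IOPS per GB with a 10k cap per volume) can be sketched like this; the constants come from this thread and may well have changed since:

    # GP2 IOPS math as described above (2016-era figures: ~3 IOPS/GB, 10k cap/volume).
    GP2_IOPS_PER_GB = 3
    GP2_MAX_IOPS_PER_VOLUME = 10_000

    def gp2_iops(volume_gb):
        """Baseline IOPS for a single GP2 volume of the given size."""
        return min(volume_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS_PER_VOLUME)

    def volumes_to_stripe(target_iops, volume_gb):
        """How many equal-size GP2 volumes to stripe (RAID 0) for a target IOPS."""
        return -(-target_iops // gp2_iops(volume_gb))  # ceiling division

    print(gp2_iops(3334))                   # ~10k, i.e. the ~3.333T volume mentioned above
    print(volumes_to_stripe(20_000, 3334))  # 2 striped volumes for a 20k IOPS target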

> How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

There is HA, I’m not sure that AWS publishes specifics. Occasionally specific volumes will have issues (hypervisor’s dedicated ethernet link to EBS network fails, for example). Occasionally instances will have issues. The volume-specific issues seem to be less common than the instance-store “instance retired” or “instance is running on degraded hardware” events. Stop/Start and you’ve recovered (possible with EBS, not possible with instance store). The assurances are in AWS’ SLA – if the SLA is insufficient (and it probably is insufficient), use more than one AZ and/or AWS region or cloud vendor.

> For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

It used to be true that the EBS control plane for a given region spanned AZs. That’s no longer true. AWS asserts that failure modes for each AZ are isolated (data may replicate between AZs, but a full outage in us-east-1a shouldn’t affect running EBS volumes in us-east-1b or us-east-1c). Slide 65

> In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.

Most of the answers in this block are “probably not 100%, you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures”. AWS targets something like a 0.1% annual failure rate per volume and 99.999% availability (slide 66). We believe they’re exceeding those goals (at least based on the petabytes of data we have on gp2 volumes).  
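
Taking the quoted ~0.1% annual volume failure rate at face value, and assuming failures are independent across replicas (an assumption AWS does not promise), the RF=3 arithmetic works out roughly like this:

    # Illustrative only: uses the ~0.1% annual failure rate quoted above and
    # assumes independence between volumes, which is not guaranteed.
    p_volume_fail = 0.001
    rf = 3

    p_any = 1 - (1 - p_volume_fail) ** rf   # at least one replica's volume fails
    p_all = p_volume_fail ** rf             # every replica's volume fails

    print(f"P(at least one of {rf} volumes fails in a year) ~= {p_any:.4%}")
    print(f"P(all {rf} volumes fail in the same year)       ~= {p_all:.2e}")

Which is why the practical advice stays the same: replication across hosts/AZs is what protects the data, not any single volume's durability.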



From: Jack Krupansky
Reply-To: "user@cassandra.apache.org"
Date: Monday, February 1, 2016 at 5:51 AM 

To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

I'm not a fan of video - this appears to be the slideshare corresponding to the video: 
http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second

My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.

I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency? 

Did your test use Amazon EBS–Optimized Instances?

SSD or magnetic or does it make any difference?

What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

Is RAID much of a factor or help at all using EBS?

How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.
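
For what the ordering question is getting at, here is a deliberately simplified sketch of the "flush, fsync, then drop the log" pattern; it is not Cassandra's actual flush code, just the shape of the guarantee being asked about (and whether fsync on EBS is truly durable is exactly the open question):

    import os

    # Simplified model: make the flushed data durable (file + directory entry)
    # before the log that could replay it is removed.
    def flush_then_discard_log(data_path, payload, log_path):
        with open(data_path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())              # ask the volume to persist the data
        dir_fd = os.open(os.path.dirname(os.path.abspath(data_path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)                  # persist the directory entry too
        finally:
            os.close(dir_fd)
        os.remove(log_path)                   # only now is the "commit log" dropped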


-- Jack Krupansky

On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:
Jeff, 

If EBS goes down, then EBS Gp2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome. 


On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
Free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS. 

-- 
Jeff Jirsa


On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:

Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2 after testing is a viable contender for our workload. The only worry I have is EBS outages, which have happened. 

On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
Also in that video - it's long but worth watching

We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory



-- 
Jeff Jirsa


On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com> wrote:

How about reads? Any differences between read-intensive and write-intensive workloads?

-- Jack Krupansky

On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com> wrote:
Hi John,

We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. 



From: John Wong
Reply-To: "user@cassandra.apache.org"
Date: Saturday, January 30, 2016 at 3:07 PM
To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

For production I'd stick with ephemeral disks (aka instance storage) if you are running a lot of transactions. 
However, for a regular small testing/qa cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99% of the time. The 1% is the kind of anomaly where we have seen flushes blocked.

But Jeff, kudos that you are able to use EBS. I didn't go through the video; do you actually use PIOPS or just standard GP2 in your production cluster?

On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, its hard to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.

Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.

On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.

We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.



From: Eric Plowe
Reply-To: "user@cassandra.apache.org"
Date: Friday, January 29, 2016 at 4:33 PM
To: "user@cassandra.apache.org"
Subject: EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.

Thanks!

Eric








-- 
Steve Robenalt 
Software Architect
srobenalt@highwire.org 
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication


-- 
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692



Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
I meant to reply earlier that the current DataStax doc on EC2 is actually
reasonably decent. It says this about EBS:

"SSD-backed general purpose volumes (GP2) or provisioned IOPS volumes (PIOPS)
are suitable for production workloads."

with the caveat of:

"EBS magnetic volumes are not recommended for Cassandra data storage
volumes for the following reasons:..."

as well as:

"Note: Use only ephemeral instance-store or the recommended EBS volume
types for Cassandra data storage."

See:
http://docs.datastax.com/en/cassandra/3.x/cassandra/planning/planPlanningEC2.html





-- Jack Krupansky

On Wed, Feb 3, 2016 at 7:27 PM, Sebastian Estevez <
sebastian.estevez@datastax.com> wrote:

> Good points Bryan, some more color:
>
> Regular EBS is *not* okay for C*. But AWS has some nicer EBS now that has
> performed okay recently:
>
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
>
> https://www.youtube.com/watch?v=1R-mgOcOSd4
>
>
> The cloud vendors are moving toward shared storage and we can't ignore
> that in the long term (they will push us in that direction financially).
> Fortunately their shared storage offerings are also getting better. For
> example google's elastic storage offerring provides very reliable
> latencies  <https://www.youtube.com/watch?v=qf-7IhCqCcI>which is what we
> care the most about, not iops.
>
> On the practical side, a key thing I've noticed with real deployments is
> that the size of the volume affects how fast it will perform and how stable
> it's latencies will be so make sure to get large EBS volumes > 1tb to get
> decent performance, even if your nodes aren't that dense.
>
>
>
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com
>
> On Wed, Feb 3, 2016 at 7:23 PM, Bryan Cheng <br...@blockcypher.com> wrote:
>
>> From my experience, EBS has transitioned from "stay the hell away" to
>> "OK" as the new GP2 SSD type has come out and stabilized over the last few
>> years, especially with the addition of EBS-optimized instances that have
>> dedicated EBS bandwidth. The latter has really helped to stabilize the
>> problematic 99.9-percentile latency spikes that use to plague EBS volumes.
>>
>> EBS (IMHO) has always had operational advantages, but inconsistent
>> latency and generally poor performance in the past lead many to disregard
>> it.
>>
>> On Wed, Feb 3, 2016 at 4:09 PM, James Rothering <jr...@codojo.me>
>> wrote:
>>
>>> Just curious here ... when did EBS become OK for C*? Didn't they always
>>> push towards using ephemeral disks?
>>>
>>> On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com>
>>> wrote:
>>>
>>>> For what it's worth we've tried d2 instances and they encourage
>>>> terrible things like super dense nodes (increases your replacement time).
>>>> In terms of useable storage I would go with gp2 EBS on a m4 based instance.
>>>>
>>>> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> Ah, yes, the good old days of m1.large.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>>> > wrote:
>>>>>
>>>>>> A lot of people use the old gen instances (m1 in particular) because
>>>>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>>>>> Whether or not they’re viable is a decision for each user to make. They’re
>>>>>> very, very commonly used for C*, though. At a time when EBS was not
>>>>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>>>>> standard.
>>>>>>
>>>>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>>>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>>>>> we know currently testing d2 at scale, though I’m not sure they have much
>>>>>> in terms of concrete results at this time.
>>>>>>
>>>>>> - Jeff
>>>>>>
>>>>>> From: Jack Krupansky
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>>>>
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>>>>> Dense Storage".
>>>>>>
>>>>>> The remaining question is whether any of the "Previous Generation
>>>>>> Instances" should be publicly recommended going forward.
>>>>>>
>>>>>> And whether non-SSD instances should be recommended going forward as
>>>>>> well. sure, technically, someone could use the legacy instances, but the
>>>>>> question is what we should be recommending as best practice going forward.
>>>>>>
>>>>>> Yeah, the i2 instances look like the sweet spot for any non-EBS
>>>>>> clusters.
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <
>>>>>> srobenalt@highwire.org> wrote:
>>>>>>
>>>>>>> Hi Jack,
>>>>>>>
>>>>>>> At the bottom of the instance-types page, there is a link to the
>>>>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>>>>> of which have HDD options.
>>>>>>>
>>>>>>> There are also the d2 (Dense Storage) instances in the current
>>>>>>> generation that include various combos of local HDDs.
>>>>>>>
>>>>>>> The i2 series has good sized SSDs available, and has the advanced
>>>>>>> networking option, which is also useful for Cassandra. The enhanced
>>>>>>> networking is available with other instance types as well, as you'll see on
>>>>>>> the feature list under each type.
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs.
>>>>>>>> magnetic question, it seems like magnetic (HDD) is no longer a recommended
>>>>>>>> storage option for databases on AWS. In particular, only the C2 Dense
>>>>>>>> Storage instances have local magnetic storage - all the other instance
>>>>>>>> types are SSD or EBS-only - and EBS Magnetic is only recommended for
>>>>>>>> "Infrequent Data Access."
>>>>>>>>
>>>>>>>> For the record, that AWS doc has Cassandra listed as a use case for
>>>>>>>> i2 instance types.
>>>>>>>>
>>>>>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and
>>>>>>>> gp2 only for the "small to medium databases" use case.
>>>>>>>>
>>>>>>>> Do older instances with local HDD still exist on AWS (m1, m2,
>>>>>>>> etc.)? Is the doc simply for any newly started instances?
>>>>>>>>
>>>>>>>> See:
>>>>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>>>>> http://aws.amazon.com/ebs/details/
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <
>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>
>>>>>>>>> > My apologies if my questions are actually answered on the video
>>>>>>>>> or slides, I just did a quick scan of the slide text.
>>>>>>>>>
>>>>>>>>> Virtually all of them are covered.
>>>>>>>>>
>>>>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>>>> is EBS able to avoid network latency?
>>>>>>>>>
>>>>>>>>> Not published,and probably not a straight forward answer (probably
>>>>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>>>>>
>>>>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>>>>
>>>>>>>>> We tested dozens of instance type/size combinations (literally).
>>>>>>>>> The best performance was clearly with ebs-optimized instances that also
>>>>>>>>> have enhanced networking (c4, m4, etc) - slide 43
>>>>>>>>>
>>>>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>>>>
>>>>>>>>> SSD, GP2 (slide 64)
>>>>>>>>>
>>>>>>>>> > What info is available on EBS performance at peak times, when
>>>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>>>
>>>>>>>>> Not published, but experiments show that we can hit 10k iops all
>>>>>>>>> day every day with only trivial noisy neighbor problems, not enough to
>>>>>>>>> impact a real cluster (slide 58)
>>>>>>>>>
>>>>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>>>>
>>>>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you
>>>>>>>>> need more than 10k, you can stripe volumes together up to the ebs network
>>>>>>>>> link max) (hinted at in slide 64)
>>>>>>>>>
>>>>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>>>
>>>>>>>>> There is HA, I’m not sure that AWS publishes specifics.
>>>>>>>>> Occasionally specific volumes will have issues (hypervisor’s dedicated
>>>>>>>>> ethernet link to EBS network fails, for example). Occasionally instances
>>>>>>>>> will have issues. The volume-specific issues seem to be less common than
>>>>>>>>> the instance-store “instance retired” or “instance is running on degraded
>>>>>>>>> hardware” events. Stop/Start and you’ve recovered (possible with EBS, not
>>>>>>>>> possible with instance store). The assurances are in AWS’ SLA – if the SLA
>>>>>>>>> is insufficient (and it probably is insufficient), use more than one AZ
>>>>>>>>> and/or AWS region or cloud vendor.
>>>>>>>>>
>>>>>>>>> > For multi-data center operation, what configuration options
>>>>>>>>> assure that the EBS volumes for each DC are truly physically separated?
>>>>>>>>>
>>>>>>>>> It used to be true that EBS control plane for a given region
>>>>>>>>> spanned AZs. That’s no longer true. AWS asserts that failure modes for each
>>>>>>>>> AZ are isolated (data may replicate between AZs, but a full outage in
>>>>>>>>> us-east-1a shouldn’t affect running ebs volumes in us-east-1b or
>>>>>>>>> us-east-1c). Slide 65
>>>>>>>>>
>>>>>>>>> > In terms of syncing data for the commit log, if the OS call to
>>>>>>>>> sync an EBS volume returns, is the commit log data absolutely 100% synced
>>>>>>>>> at the hardware level on the EBS end, such that a power failure of the
>>>>>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>>>>>> of the fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>>>> guarantee is needed.
>>>>>>>>>
>>>>>>>>> Most of the answers in this block are “probably not 100%, you
>>>>>>>>> should be writing to more than one host/AZ/DC/vendor to protect your
>>>>>>>>> organization from failures”. AWS targets something like 0.1% annual failure
>>>>>>>>> rate per volume and 99.999% availability (slide 66). We believe they’re
>>>>>>>>> exceeding those goals (at least based with the petabytes of data we have on
>>>>>>>>> gp2 volumes).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: Jack Krupansky
>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>>>>
>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>
>>>>>>>>> I'm not a fan of guy - this appears to be the slideshare
>>>>>>>>> corresponding to the video:
>>>>>>>>>
>>>>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>>>>
>>>>>>>>> My apologies if my questions are actually answered on the video or
>>>>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>>>>
>>>>>>>>> I'm curious where the EBS physical devices actually reside - are
>>>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>>>> is EBS able to avoid network latency?
>>>>>>>>>
>>>>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>>>>
>>>>>>>>> SSD or magnetic or does it make any difference?
>>>>>>>>>
>>>>>>>>> What info is available on EBS performance at peak times, when
>>>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>>>
>>>>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>>>>
>>>>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>>>
>>>>>>>>> For multi-data center operation, what configuration options assure
>>>>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>>>>
>>>>>>>>> In terms of syncing data for the commit log, if the OS call to
>>>>>>>>> sync an EBS volume returns, is the commit log data absolutely 100% synced
>>>>>>>>> at the hardware level on the EBS end, such that a power failure of the
>>>>>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>>>>>> of the fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>>>> guarantee is needed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Jeff,
>>>>>>>>>>
>>>>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Free to choose what you'd like, but EBS outages were also
>>>>>>>>>>> addressed in that video (second half, discussion by Dennis Opacki). 2016
>>>>>>>>>>> EBS isn't the same as 2011 EBS.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>>>>>
>>>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>>>>
>>>>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache
>>>>>>>>>>>> to ensure we weren't "just" reading from memory
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>>>>> write-intensive workloads?
>>>>>>>>>>>>
>>>>>>>>>>>> -- Jack Krupansky
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at
>>>>>>>>>>>>> 1M writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> From: John Wong
>>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>>>>
>>>>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>>>>>> However, for regular small testing/qa cluster, or something
>>>>>>>>>>>>> you know you want to reload often, EBS is definitely good enough and we
>>>>>>>>>>>>> haven't had issues 99%. The 1% is kind of anomaly where we have flush
>>>>>>>>>>>>> blocked.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go
>>>>>>>>>>>>> through the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>>>>> production cluster?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what
>>>>>>>>>>>>>> kind of disk performance you need?". If you need the performance, its hard
>>>>>>>>>>>>>> to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've
>>>>>>>>>>>>>> found our choice of instance dictated much more by the balance of price,
>>>>>>>>>>>>>> CPU, and memory. We're using GP2 SSD and we find that for our patterns the
>>>>>>>>>>>>>> disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or
>>>>>>>>>>>>>>> c4 instances with GP2 EBS.  When you don’t care about replacing a node
>>>>>>>>>>>>>>> because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS
>>>>>>>>>>>>>>> is capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and
>>>>>>>>>>>>>>> AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>>>>>> otherwise.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2.
>>>>>>>>>>>>>>> We are thinking about going with ephemeral SSDs. The question is
>>>>>>>>>>>>>>> this: Should we put two in RAID 0 or just go with one? We currently run a
>>>>>>>>>>>>>>> cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we
>>>>>>>>>>>>>>> are happy with the performance we are seeing thus far.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Eric
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Steve Robenalt
>>>>>>> Software Architect
>>>>>>> srobenalt@highwire.org <bz...@highwire.org>
>>>>>>> (office/cell): 916-505-1785
>>>>>>>
>>>>>>> HighWire Press, Inc.
>>>>>>> 425 Broadway St, Redwood City, CA 94063
>>>>>>> www.highwire.org
>>>>>>>
>>>>>>> Technology for Scholarly Communication
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Ben Bromhead
>>>> CTO | Instaclustr
>>>> +1 650 284 9692
>>>>
>>>
>>>
>>
>

Re: EC2 storage options for C*

Posted by Sebastian Estevez <se...@datastax.com>.
By the way, if someone wants to do some hard-core testing like Al's, I wrote
a guide on how to use his tool:

http://www.sestevez.com/how-to-use-toberts-effio/

I'm sure folks on this list would like to see more stats : )
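
Not a replacement for fio/effio, but for anyone who just wants a quick feel for a volume's sync behavior before running the full tool, a trivial probe along these lines can help (the path below is only an example; point it at the volume you care about):

    import os, statistics, time

    # Measure write+fsync latency for small blocks on the target volume.
    def fsync_latencies_ms(path, samples=200, block=4096):
        payload = os.urandom(block)
        times = []
        with open(path, "wb") as f:
            for _ in range(samples):
                start = time.perf_counter()
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())
                times.append((time.perf_counter() - start) * 1000.0)
        os.remove(path)
        return sorted(times)

    lat = fsync_latencies_ms("/var/lib/cassandra/fsync_probe.tmp")  # example path
    print(f"p50={statistics.median(lat):.2f} ms  p99={lat[int(len(lat) * 0.99)]:.2f} ms")

The tail (p99 and up) is usually the number worth watching on shared storage, which is the point about latency stability made elsewhere in this thread.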

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com

On Wed, Feb 3, 2016 at 7:27 PM, Sebastian Estevez <
sebastian.estevez@datastax.com> wrote:

> Good points Bryan, some more color:
>
> Regular EBS is *not* okay for C*. But AWS has some nicer EBS now that has
> performed okay recently:
>
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
>
> https://www.youtube.com/watch?v=1R-mgOcOSd4
>
>
> The cloud vendors are moving toward shared storage and we can't ignore
> that in the long term (they will push us in that direction financially).
> Fortunately their shared storage offerings are also getting better. For
> example google's elastic storage offerring provides very reliable
> latencies  <https://www.youtube.com/watch?v=qf-7IhCqCcI>which is what we
> care the most about, not iops.
>
> On the practical side, a key thing I've noticed with real deployments is
> that the size of the volume affects how fast it will perform and how stable
> it's latencies will be so make sure to get large EBS volumes > 1tb to get
> decent performance, even if your nodes aren't that dense.
>
>
>
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com
>
> On Wed, Feb 3, 2016 at 7:23 PM, Bryan Cheng <br...@blockcypher.com> wrote:
>
>> From my experience, EBS has transitioned from "stay the hell away" to
>> "OK" as the new GP2 SSD type has come out and stabilized over the last few
>> years, especially with the addition of EBS-optimized instances that have
>> dedicated EBS bandwidth. The latter has really helped to stabilize the
>> problematic 99.9-percentile latency spikes that use to plague EBS volumes.
>>
>> EBS (IMHO) has always had operational advantages, but inconsistent
>> latency and generally poor performance in the past lead many to disregard
>> it.
>>
>> On Wed, Feb 3, 2016 at 4:09 PM, James Rothering <jr...@codojo.me>
>> wrote:
>>
>>> Just curious here ... when did EBS become OK for C*? Didn't they always
>>> push towards using ephemeral disks?
>>>
>>> On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com>
>>> wrote:
>>>
>>>> For what it's worth we've tried d2 instances and they encourage
>>>> terrible things like super dense nodes (increases your replacement time).
>>>> In terms of useable storage I would go with gp2 EBS on a m4 based instance.
>>>>
>>>> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> Ah, yes, the good old days of m1.large.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>>> > wrote:
>>>>>
>>>>>> A lot of people use the old gen instances (m1 in particular) because
>>>>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>>>>> Whether or not they’re viable is a decision for each user to make. They’re
>>>>>> very, very commonly used for C*, though. At a time when EBS was not
>>>>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>>>>> standard.
>>>>>>
>>>>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>>>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>>>>> we know currently testing d2 at scale, though I’m not sure they have much
>>>>>> in terms of concrete results at this time.
>>>>>>
>>>>>> - Jeff
>>>>>>
>>>>>> From: Jack Krupansky
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>>>>
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>>>>> Dense Storage".
>>>>>>
>>>>>> The remaining question is whether any of the "Previous Generation
>>>>>> Instances" should be publicly recommended going forward.
>>>>>>
>>>>>> And whether non-SSD instances should be recommended going forward as
>>>>>> well. sure, technically, someone could use the legacy instances, but the
>>>>>> question is what we should be recommending as best practice going forward.
>>>>>>
>>>>>> Yeah, the i2 instances look like the sweet spot for any non-EBS
>>>>>> clusters.
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <
>>>>>> srobenalt@highwire.org> wrote:
>>>>>>
>>>>>>> Hi Jack,
>>>>>>>
>>>>>>> At the bottom of the instance-types page, there is a link to the
>>>>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>>>>> of which have HDD options.
>>>>>>>
>>>>>>> There are also the d2 (Dense Storage) instances in the current
>>>>>>> generation that include various combos of local HDDs.
>>>>>>>
>>>>>>> The i2 series has good sized SSDs available, and has the advanced
>>>>>>> networking option, which is also useful for Cassandra. The enhanced
>>>>>>> networking is available with other instance types as well, as you'll see on
>>>>>>> the feature list under each type.
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs.
>>>>>>>> magnetic question, it seems like magnetic (HDD) is no longer a recommended
>>>>>>>> storage option for databases on AWS. In particular, only the C2 Dense
>>>>>>>> Storage instances have local magnetic storage - all the other instance
>>>>>>>> types are SSD or EBS-only - and EBS Magnetic is only recommended for
>>>>>>>> "Infrequent Data Access."
>>>>>>>>
>>>>>>>> For the record, that AWS doc has Cassandra listed as a use case for
>>>>>>>> i2 instance types.
>>>>>>>>
>>>>>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and
>>>>>>>> gp2 only for the "small to medium databases" use case.
>>>>>>>>
>>>>>>>> Do older instances with local HDD still exist on AWS (m1, m2,
>>>>>>>> etc.)? Is the doc simply for any newly started instances?
>>>>>>>>
>>>>>>>> See:
>>>>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>>>>> http://aws.amazon.com/ebs/details/
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <
>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>
>>>>>>>>> > My apologies if my questions are actually answered on the video
>>>>>>>>> or slides, I just did a quick scan of the slide text.
>>>>>>>>>
>>>>>>>>> Virtually all of them are covered.
>>>>>>>>>
>>>>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>>>> is EBS able to avoid network latency?
>>>>>>>>>
>>>>>>>>> Not published,and probably not a straight forward answer (probably
>>>>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>>>>>
>>>>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>>>>
>>>>>>>>> We tested dozens of instance type/size combinations (literally).
>>>>>>>>> The best performance was clearly with ebs-optimized instances that also
>>>>>>>>> have enhanced networking (c4, m4, etc) - slide 43
>>>>>>>>>
>>>>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>>>>
>>>>>>>>> SSD, GP2 (slide 64)
>>>>>>>>>
>>>>>>>>> > What info is available on EBS performance at peak times, when
>>>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>>>
>>>>>>>>> Not published, but experiments show that we can hit 10k iops all
>>>>>>>>> day every day with only trivial noisy neighbor problems, not enough to
>>>>>>>>> impact a real cluster (slide 58)
>>>>>>>>>
>>>>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>>>>
>>>>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you
>>>>>>>>> need more than 10k, you can stripe volumes together up to the ebs network
>>>>>>>>> link max) (hinted at in slide 64)
>>>>>>>>>
>>>>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>>>
>>>>>>>>> There is HA, I’m not sure that AWS publishes specifics.
>>>>>>>>> Occasionally specific volumes will have issues (hypervisor’s dedicated
>>>>>>>>> ethernet link to EBS network fails, for example). Occasionally instances
>>>>>>>>> will have issues. The volume-specific issues seem to be less common than
>>>>>>>>> the instance-store “instance retired” or “instance is running on degraded
>>>>>>>>> hardware” events. Stop/Start and you’ve recovered (possible with EBS, not
>>>>>>>>> possible with instance store). The assurances are in AWS’ SLA – if the SLA
>>>>>>>>> is insufficient (and it probably is insufficient), use more than one AZ
>>>>>>>>> and/or AWS region or cloud vendor.
>>>>>>>>>
>>>>>>>>> > For multi-data center operation, what configuration options
>>>>>>>>> assure that the EBS volumes for each DC are truly physically separated?
>>>>>>>>>
>>>>>>>>> It used to be true that EBS control plane for a given region
>>>>>>>>> spanned AZs. That’s no longer true. AWS asserts that failure modes for each
>>>>>>>>> AZ are isolated (data may replicate between AZs, but a full outage in
>>>>>>>>> us-east-1a shouldn’t affect running ebs volumes in us-east-1b or
>>>>>>>>> us-east-1c). Slide 65
>>>>>>>>>
>>>>>>>>> > In terms of syncing data for the commit log, if the OS call to
>>>>>>>>> sync an EBS volume returns, is the commit log data absolutely 100% synced
>>>>>>>>> at the hardware level on the EBS end, such that a power failure of the
>>>>>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>>>>>> of the fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>>>> guarantee is needed.
>>>>>>>>>
>>>>>>>>> Most of the answers in this block are “probably not 100%, you
>>>>>>>>> should be writing to more than one host/AZ/DC/vendor to protect your
>>>>>>>>> organization from failures”. AWS targets something like 0.1% annual failure
>>>>>>>>> rate per volume and 99.999% availability (slide 66). We believe they’re
>>>>>>>>> exceeding those goals (at least based with the petabytes of data we have on
>>>>>>>>> gp2 volumes).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: Jack Krupansky
>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>>>>
>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>
>>>>>>>>> I'm not a fan of guy - this appears to be the slideshare
>>>>>>>>> corresponding to the video:
>>>>>>>>>
>>>>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>>>>
>>>>>>>>> My apologies if my questions are actually answered on the video or
>>>>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>>>>
>>>>>>>>> I'm curious where the EBS physical devices actually reside - are
>>>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>>>> is EBS able to avoid network latency?
>>>>>>>>>
>>>>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>>>>
>>>>>>>>> SSD or magnetic or does it make any difference?
>>>>>>>>>
>>>>>>>>> What info is available on EBS performance at peak times, when
>>>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>>>
>>>>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>>>>
>>>>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>>>
>>>>>>>>> For multi-data center operation, what configuration options assure
>>>>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>>>>
>>>>>>>>> In terms of syncing data for the commit log, if the OS call to
>>>>>>>>> sync an EBS volume returns, is the commit log data absolutely 100% synced
>>>>>>>>> at the hardware level on the EBS end, such that a power failure of the
>>>>>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>>>>>> of the fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>>>> guarantee is needed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Jeff,
>>>>>>>>>>
>>>>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Free to choose what you'd like, but EBS outages were also
>>>>>>>>>>> addressed in that video (second half, discussion by Dennis Opacki). 2016
>>>>>>>>>>> EBS isn't the same as 2011 EBS.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>>>>>
>>>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>>>>
>>>>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache
>>>>>>>>>>>> to ensure we weren't "just" reading from memory
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>>>>> write-intensive workloads?
>>>>>>>>>>>>
>>>>>>>>>>>> -- Jack Krupansky
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at
>>>>>>>>>>>>> 1M writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> From: John Wong
>>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>>>>
>>>>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>>>>>> However, for regular small testing/qa cluster, or something
>>>>>>>>>>>>> you know you want to reload often, EBS is definitely good enough and we
>>>>>>>>>>>>> haven't had issues 99%. The 1% is kind of anomaly where we have flush
>>>>>>>>>>>>> blocked.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go
>>>>>>>>>>>>> through the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>>>>> production cluster?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what
>>>>>>>>>>>>>> kind of disk performance you need?". If you need the performance, its hard
>>>>>>>>>>>>>> to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've
>>>>>>>>>>>>>> found our choice of instance dictated much more by the balance of price,
>>>>>>>>>>>>>> CPU, and memory. We're using GP2 SSD and we find that for our patterns the
>>>>>>>>>>>>>> disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or
>>>>>>>>>>>>>>> c4 instances with GP2 EBS.  When you don’t care about replacing a node
>>>>>>>>>>>>>>> because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS
>>>>>>>>>>>>>>> is capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and
>>>>>>>>>>>>>>> AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>>>>>> otherwise.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2.
>>>>>>>>>>>>>>> We are thinking about going with ephemeral SSDs. The question is
>>>>>>>>>>>>>>> this: Should we put two in RAID 0 or just go with one? We currently run a
>>>>>>>>>>>>>>> cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we
>>>>>>>>>>>>>>> are happy with the performance we are seeing thus far.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Eric
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Steve Robenalt
>>>>>>> Software Architect
>>>>>>> srobenalt@highwire.org <bz...@highwire.org>
>>>>>>> (office/cell): 916-505-1785
>>>>>>>
>>>>>>> HighWire Press, Inc.
>>>>>>> 425 Broadway St, Redwood City, CA 94063
>>>>>>> www.highwire.org
>>>>>>>
>>>>>>> Technology for Scholarly Communication
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Ben Bromhead
>>>> CTO | Instaclustr
>>>> +1 650 284 9692
>>>>
>>>
>>>
>>
>

Re: EC2 storage options for C*

Posted by Sebastian Estevez <se...@datastax.com>.
Good points Bryan, some more color:

Regular EBS is *not* okay for C*, but AWS now has some nicer EBS offerings
that have performed well recently:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html

https://www.youtube.com/watch?v=1R-mgOcOSd4


The cloud vendors are moving toward shared storage and we can't ignore that
in the long term (they will push us in that direction financially).
Fortunately, their shared storage offerings are also getting better. For
example, Google's elastic storage offering provides very reliable latencies
<https://www.youtube.com/watch?v=qf-7IhCqCcI>, which is what we care most
about, not iops.

On the practical side, a key thing I've noticed with real deployments is
that the size of the volume affects how fast it will perform and how stable
its latencies will be, so make sure to get large EBS volumes (> 1 TB) to get
decent performance, even if your nodes aren't that dense.
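
To make that size/performance relationship concrete, here is a rough
back-of-the-envelope sketch in Python (illustrative only; the 3 IOPS/GiB
baseline, 100 IOPS floor, and 10k per-volume cap are the gp2 numbers AWS
published at the time):

def gp2_baseline_iops(size_gib):
    """gp2 baseline (as published at the time): 3 IOPS per GiB,
    with a floor of 100 IOPS and a per-volume cap of 10,000 IOPS."""
    return min(max(3 * size_gib, 100), 10000)

def gp2_volumes_to_stripe(target_iops, size_gib):
    """How many gp2 volumes of a given size to put in RAID 0 to reach a
    target aggregate IOPS (ignoring the instance's EBS link ceiling,
    which is the real upper bound)."""
    per_volume = gp2_baseline_iops(size_gib)
    return -(-target_iops // per_volume)  # ceiling division

for size_gib in (250, 1000, 3334):
    print(size_gib, "GiB ->", gp2_baseline_iops(size_gib), "baseline IOPS")
print("1 TiB volumes to stripe for 20k IOPS:",
      gp2_volumes_to_stripe(20000, 1000))

A 250 GiB volume only gets 750 baseline IOPS, while anything at or above
~3.3 TiB hits the 10k cap, which is why small gp2 volumes feel so much
slower.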




All the best,



Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com


DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world's most innovative enterprises.
DataStax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.

On Wed, Feb 3, 2016 at 7:23 PM, Bryan Cheng <br...@blockcypher.com> wrote:

> From my experience, EBS has transitioned from "stay the hell away" to "OK"
> as the new GP2 SSD type has come out and stabilized over the last few
> years, especially with the addition of EBS-optimized instances that have
> dedicated EBS bandwidth. The latter has really helped to stabilize the
> problematic 99.9-percentile latency spikes that used to plague EBS volumes.
>
> EBS (IMHO) has always had operational advantages, but inconsistent latency
> and generally poor performance in the past led many to disregard it.
>
> On Wed, Feb 3, 2016 at 4:09 PM, James Rothering <jr...@codojo.me>
> wrote:
>
>> Just curious here ... when did EBS become OK for C*? Didn't they always
>> push towards using ephemeral disks?
>>
>> On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com>
>> wrote:
>>
>>> For what it's worth we've tried d2 instances and they encourage terrible
>>> things like super dense nodes (increases your replacement time). In terms
>>> of useable storage I would go with gp2 EBS on a m4 based instance.
>>>
>>> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com>
>>> wrote:
>>>
>>>> Ah, yes, the good old days of m1.large.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com>
>>>> wrote:
>>>>
>>>>> A lot of people use the old gen instances (m1 in particular) because
>>>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>>>> Whether or not they’re viable is a decision for each user to make. They’re
>>>>> very, very commonly used for C*, though. At a time when EBS was not
>>>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>>>> standard.
>>>>>
>>>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>>>> we know currently testing d2 at scale, though I’m not sure they have much
>>>>> in terms of concrete results at this time.
>>>>>
>>>>> - Jeff
>>>>>
>>>>> From: Jack Krupansky
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>>>
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: Re: EC2 storage options for C*
>>>>>
>>>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>>>> Dense Storage".
>>>>>
>>>>> The remaining question is whether any of the "Previous Generation
>>>>> Instances" should be publicly recommended going forward.
>>>>>
>>>>> And whether non-SSD instances should be recommended going forward as
>>>>> well. sure, technically, someone could use the legacy instances, but the
>>>>> question is what we should be recommending as best practice going forward.
>>>>>
>>>>> Yeah, the i2 instances look like the sweet spot for any non-EBS
>>>>> clusters.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <srobenalt@highwire.org
>>>>> > wrote:
>>>>>
>>>>>> Hi Jack,
>>>>>>
>>>>>> At the bottom of the instance-types page, there is a link to the
>>>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>>>> of which have HDD options.
>>>>>>
>>>>>> There are also the d2 (Dense Storage) instances in the current
>>>>>> generation that include various combos of local HDDs.
>>>>>>
>>>>>> The i2 series has good sized SSDs available, and has the advanced
>>>>>> networking option, which is also useful for Cassandra. The enhanced
>>>>>> networking is available with other instance types as well, as you'll see on
>>>>>> the feature list under each type.
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>>>>> instances have local magnetic storage - all the other instance types are
>>>>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>>>>>> Access."
>>>>>>>
>>>>>>> For the record, that AWS doc has Cassandra listed as a use case for
>>>>>>> i2 instance types.
>>>>>>>
>>>>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and
>>>>>>> gp2 only for the "small to medium databases" use case.
>>>>>>>
>>>>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)?
>>>>>>> Is the doc simply for any newly started instances?
>>>>>>>
>>>>>>> See:
>>>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>>>> http://aws.amazon.com/ebs/details/
>>>>>>>
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <
>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>
>>>>>>>> > My apologies if my questions are actually answered on the video
>>>>>>>> or slides, I just did a quick scan of the slide text.
>>>>>>>>
>>>>>>>> Virtually all of them are covered.
>>>>>>>>
>>>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>>> is EBS able to avoid network latency?
>>>>>>>>
>>>>>>>> Not published, and probably not a straightforward answer (probably
>>>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>>>>
>>>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>>>
>>>>>>>> We tested dozens of instance type/size combinations (literally).
>>>>>>>> The best performance was clearly with ebs-optimized instances that also
>>>>>>>> have enhanced networking (c4, m4, etc) - slide 43
>>>>>>>>
>>>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>>>
>>>>>>>> SSD, GP2 (slide 64)
>>>>>>>>
>>>>>>>> > What info is available on EBS performance at peak times, when
>>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>>
>>>>>>>> Not published, but experiments show that we can hit 10k iops all
>>>>>>>> day every day with only trivial noisy neighbor problems, not enough to
>>>>>>>> impact a real cluster (slide 58)
>>>>>>>>
>>>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>>>
>>>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you
>>>>>>>> need more than 10k, you can stripe volumes together up to the ebs network
>>>>>>>> link max) (hinted at in slide 64)
>>>>>>>>
>>>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>>
>>>>>>>> There is HA, I’m not sure that AWS publishes specifics.
>>>>>>>> Occasionally specific volumes will have issues (hypervisor’s dedicated
>>>>>>>> ethernet link to EBS network fails, for example). Occasionally instances
>>>>>>>> will have issues. The volume-specific issues seem to be less common than
>>>>>>>> the instance-store “instance retired” or “instance is running on degraded
>>>>>>>> hardware” events. Stop/Start and you’ve recovered (possible with EBS, not
>>>>>>>> possible with instance store). The assurances are in AWS’ SLA – if the SLA
>>>>>>>> is insufficient (and it probably is insufficient), use more than one AZ
>>>>>>>> and/or AWS region or cloud vendor.
>>>>>>>>
>>>>>>>> > For multi-data center operation, what configuration options
>>>>>>>> assure that the EBS volumes for each DC are truly physically separated?
>>>>>>>>
>>>>>>>> It used to be true that EBS control plane for a given region
>>>>>>>> spanned AZs. That’s no longer true. AWS asserts that failure modes for each
>>>>>>>> AZ are isolated (data may replicate between AZs, but a full outage in
>>>>>>>> us-east-1a shouldn’t affect running ebs volumes in us-east-1b or
>>>>>>>> us-east-1c). Slide 65
>>>>>>>>
>>>>>>>> > In terms of syncing data for the commit log, if the OS call to
>>>>>>>> sync an EBS volume returns, is the commit log data absolutely 100% synced
>>>>>>>> at the hardware level on the EBS end, such that a power failure of the
>>>>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>>>>> of the fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>>> guarantee is needed.
>>>>>>>>
>>>>>>>> Most of the answers in this block are “probably not 100%, you
>>>>>>>> should be writing to more than one host/AZ/DC/vendor to protect your
>>>>>>>> organization from failures”. AWS targets something like 0.1% annual failure
>>>>>>>> rate per volume and 99.999% availability (slide 66). We believe they’re
>>>>>>>> exceeding those goals (at least based with the petabytes of data we have on
>>>>>>>> gp2 volumes).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> From: Jack Krupansky
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>>>
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>
>>>>>>>> I'm not a fan of guy - this appears to be the slideshare
>>>>>>>> corresponding to the video:
>>>>>>>>
>>>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>>>
>>>>>>>> My apologies if my questions are actually answered on the video or
>>>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>>>
>>>>>>>> I'm curious where the EBS physical devices actually reside - are
>>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>>> is EBS able to avoid network latency?
>>>>>>>>
>>>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>>>
>>>>>>>> SSD or magnetic or does it make any difference?
>>>>>>>>
>>>>>>>> What info is available on EBS performance at peak times, when
>>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>>
>>>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>>>
>>>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>>
>>>>>>>> For multi-data center operation, what configuration options assure
>>>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>>>
>>>>>>>> In terms of syncing data for the commit log, if the OS call to sync
>>>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>>> guarantee is needed.
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Jeff,
>>>>>>>>>
>>>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> Free to choose what you'd like, but EBS outages were also
>>>>>>>>>> addressed in that video (second half, discussion by Dennis Opacki). 2016
>>>>>>>>>> EBS isn't the same as 2011 EBS.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>>>>
>>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>>>
>>>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache
>>>>>>>>>>> to ensure we weren't "just" reading from memory
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>>>> write-intensive workloads?
>>>>>>>>>>>
>>>>>>>>>>> -- Jack Krupansky
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>
>>>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at
>>>>>>>>>>>> 1M writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>>>>> necessary.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> From: John Wong
>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>>>
>>>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>>>>>
>>>>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go
>>>>>>>>>>>> through the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>>>> production cluster?
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what
>>>>>>>>>>>>> kind of disk performance you need?". If you need the performance, its hard
>>>>>>>>>>>>> to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've
>>>>>>>>>>>>> found our choice of instance dictated much more by the balance of price,
>>>>>>>>>>>>> CPU, and memory. We're using GP2 SSD and we find that for our patterns the
>>>>>>>>>>>>> disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or
>>>>>>>>>>>>>> c4 instances with GP2 EBS.  When you don’t care about replacing a node
>>>>>>>>>>>>>> because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS
>>>>>>>>>>>>>> is capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>>>>> otherwise.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is this: Should
>>>>>>>>>>>>>> we put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Eric
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Steve Robenalt
>>>>>> Software Architect
>>>>>> srobenalt@highwire.org <bz...@highwire.org>
>>>>>> (office/cell): 916-505-1785
>>>>>>
>>>>>> HighWire Press, Inc.
>>>>>> 425 Broadway St, Redwood City, CA 94063
>>>>>> www.highwire.org
>>>>>>
>>>>>> Technology for Scholarly Communication
>>>>>>
>>>>>
>>>>>
>>>> --
>>> Ben Bromhead
>>> CTO | Instaclustr
>>> +1 650 284 9692
>>>
>>
>>
>

Re: EC2 storage options for C*

Posted by Bryan Cheng <br...@blockcypher.com>.
From my experience, EBS has transitioned from "stay the hell away" to "OK"
as the new GP2 SSD type has come out and stabilized over the last few
years, especially with the addition of EBS-optimized instances that have
dedicated EBS bandwidth. The latter has really helped to stabilize the
problematic 99.9-percentile latency spikes that used to plague EBS volumes.

EBS (IMHO) has always had operational advantages, but inconsistent latency
and generally poor performance in the past led many to disregard it.
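
For anyone scripting this up, a minimal boto3 sketch of what an
"EBS-optimized instance with a gp2 data volume" looks like in practice (the
AMI ID and key name are placeholders, not values from this thread; on m4
and c4 the EBS optimization is on by default, so the explicit flag matters
more on older families):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch an EBS-optimized instance with a dedicated gp2 data volume for C*.
resp = ec2.run_instances(
    ImageId="ami-XXXXXXXX",        # placeholder - use your own AMI
    InstanceType="m4.xlarge",      # m4/c4 get dedicated EBS bandwidth
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",          # placeholder
    EbsOptimized=True,             # dedicated network path to EBS
    BlockDeviceMappings=[
        {
            "DeviceName": "/dev/xvdf",
            "Ebs": {
                "VolumeSize": 1000,        # GiB; larger gp2 = more baseline IOPS
                "VolumeType": "gp2",
                "DeleteOnTermination": False,
            },
        }
    ],
)
print(resp["Instances"][0]["InstanceId"])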

On Wed, Feb 3, 2016 at 4:09 PM, James Rothering <jr...@codojo.me>
wrote:

> Just curious here ... when did EBS become OK for C*? Didn't they always
> push towards using ephemeral disks?
>
> On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com> wrote:
>
>> For what it's worth we've tried d2 instances and they encourage terrible
>> things like super dense nodes (increases your replacement time). In terms
>> of useable storage I would go with gp2 EBS on a m4 based instance.
>>
>> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com>
>> wrote:
>>
>>> Ah, yes, the good old days of m1.large.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> A lot of people use the old gen instances (m1 in particular) because
>>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>>> Whether or not they’re viable is a decision for each user to make. They’re
>>>> very, very commonly used for C*, though. At a time when EBS was not
>>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>>> standard.
>>>>
>>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>>> we know currently testing d2 at scale, though I’m not sure they have much
>>>> in terms of concrete results at this time.
>>>>
>>>> - Jeff
>>>>
>>>> From: Jack Krupansky
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>>
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>>
>>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>>> Dense Storage".
>>>>
>>>> The remaining question is whether any of the "Previous Generation
>>>> Instances" should be publicly recommended going forward.
>>>>
>>>> And whether non-SSD instances should be recommended going forward as
>>>> well. sure, technically, someone could use the legacy instances, but the
>>>> question is what we should be recommending as best practice going forward.
>>>>
>>>> Yeah, the i2 instances look like the sweet spot for any non-EBS
>>>> clusters.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
>>>> wrote:
>>>>
>>>>> Hi Jack,
>>>>>
>>>>> At the bottom of the instance-types page, there is a link to the
>>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>>> of which have HDD options.
>>>>>
>>>>> There are also the d2 (Dense Storage) instances in the current
>>>>> generation that include various combos of local HDDs.
>>>>>
>>>>> The i2 series has good sized SSDs available, and has the advanced
>>>>> networking option, which is also useful for Cassandra. The enhanced
>>>>> networking is available with other instance types as well, as you'll see on
>>>>> the feature list under each type.
>>>>>
>>>>> Steve
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>>> jack.krupansky@gmail.com> wrote:
>>>>>
>>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>>>> instances have local magnetic storage - all the other instance types are
>>>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>>>>> Access."
>>>>>>
>>>>>> For the record, that AWS doc has Cassandra listed as a use case for
>>>>>> i2 instance types.
>>>>>>
>>>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and
>>>>>> gp2 only for the "small to medium databases" use case.
>>>>>>
>>>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)?
>>>>>> Is the doc simply for any newly started instances?
>>>>>>
>>>>>> See:
>>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>>> http://aws.amazon.com/ebs/details/
>>>>>>
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <
>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>
>>>>>>> > My apologies if my questions are actually answered on the video or
>>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>>
>>>>>>> Virtually all of them are covered.
>>>>>>>
>>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>> is EBS able to avoid network latency?
>>>>>>>
>>>>>>> Not published, and probably not a straightforward answer (probably
>>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>>>
>>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>>
>>>>>>> We tested dozens of instance type/size combinations (literally). The
>>>>>>> best performance was clearly with ebs-optimized instances that also have
>>>>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>>>>
>>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>>
>>>>>>> SSD, GP2 (slide 64)
>>>>>>>
>>>>>>> > What info is available on EBS performance at peak times, when
>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>
>>>>>>> Not published, but experiments show that we can hit 10k iops all day
>>>>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>>>>> real cluster (slide 58)
>>>>>>>
>>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>>
>>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you
>>>>>>> need more than 10k, you can stripe volumes together up to the ebs network
>>>>>>> link max) (hinted at in slide 64)
>>>>>>>
>>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>
>>>>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>>>>> The volume-specific issues seem to be less common than the instance-store
>>>>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>>>>> AWS region or cloud vendor.
>>>>>>>
>>>>>>> > For multi-data center operation, what configuration options assure
>>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>>
>>>>>>> It used to be true that EBS control plane for a given region spanned
>>>>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>>>>
>>>>>>> > In terms of syncing data for the commit log, if the OS call to
>>>>>>> sync an EBS volume returns, is the commit log data absolutely 100% synced
>>>>>>> at the hardware level on the EBS end, such that a power failure of the
>>>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>>>> of the fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>> guarantee is needed.
>>>>>>>
>>>>>>> Most of the answers in this block are “probably not 100%, you should
>>>>>>> be writing to more than one host/AZ/DC/vendor to protect your organization
>>>>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>>>>> those goals (at least based with the petabytes of data we have on gp2
>>>>>>> volumes).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Jack Krupansky
>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>>
>>>>>>> To: "user@cassandra.apache.org"
>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>
>>>>>>> I'm not a fan of guy - this appears to be the slideshare
>>>>>>> corresponding to the video:
>>>>>>>
>>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>>
>>>>>>> My apologies if my questions are actually answered on the video or
>>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>>
>>>>>>> I'm curious where the EBS physical devices actually reside - are
>>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>>> is EBS able to avoid network latency?
>>>>>>>
>>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>>
>>>>>>> SSD or magnetic or does it make any difference?
>>>>>>>
>>>>>>> What info is available on EBS performance at peak times, when
>>>>>>> multiple AWS customers have spikes of demand?
>>>>>>>
>>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>>
>>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>>
>>>>>>> For multi-data center operation, what configuration options assure
>>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>>
>>>>>>> In terms of syncing data for the commit log, if the OS call to sync
>>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>>> guarantee is needed.
>>>>>>>
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Jeff,
>>>>>>>>
>>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Free to choose what you'd like, but EBS outages were also
>>>>>>>>> addressed in that video (second half, discussion by Dennis Opacki). 2016
>>>>>>>>> EBS isn't the same as 2011 EBS.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jeff Jirsa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>>>
>>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>>
>>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache
>>>>>>>>>> to ensure we weren't "just" reading from memory
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Jeff Jirsa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>>> write-intensive workloads?
>>>>>>>>>>
>>>>>>>>>> -- Jack Krupansky
>>>>>>>>>>
>>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi John,
>>>>>>>>>>>
>>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at
>>>>>>>>>>> 1M writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>>>> necessary.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> From: John Wong
>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>>
>>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>>>>
>>>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>>> production cluster?
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what
>>>>>>>>>>>> kind of disk performance you need?". If you need the performance, its hard
>>>>>>>>>>>> to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>>
>>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or
>>>>>>>>>>>>> c4 instances with GP2 EBS.  When you don’t care about replacing a node
>>>>>>>>>>>>> because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS
>>>>>>>>>>>>> is capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>>>> otherwise.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>>
>>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is this: Should
>>>>>>>>>>>>> we put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Eric
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Steve Robenalt
>>>>> Software Architect
>>>>> srobenalt@highwire.org <bz...@highwire.org>
>>>>> (office/cell): 916-505-1785
>>>>>
>>>>> HighWire Press, Inc.
>>>>> 425 Broadway St, Redwood City, CA 94063
>>>>> www.highwire.org
>>>>>
>>>>> Technology for Scholarly Communication
>>>>>
>>>>
>>>>
>>> --
>> Ben Bromhead
>> CTO | Instaclustr
>> +1 650 284 9692
>>
>
>

Re: EC2 storage options for C*

Posted by James Rothering <jr...@codojo.me>.
Just curious here ... when did EBS become OK for C*? Didn't they always
push towards using ephemeral disks?

On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <be...@instaclustr.com> wrote:

> For what it's worth we've tried d2 instances and they encourage terrible
> things like super dense nodes (increases your replacement time). In terms
> of useable storage I would go with gp2 EBS on a m4 based instance.
>
> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> Ah, yes, the good old days of m1.large.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> A lot of people use the old gen instances (m1 in particular) because
>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>> Whether or not they’re viable is a decision for each user to make. They’re
>>> very, very commonly used for C*, though. At a time when EBS was not
>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>> standard.
>>>
>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>> we know currently testing d2 at scale, though I’m not sure they have much
>>> in terms of concrete results at this time.
>>>
>>> - Jeff
>>>
>>> From: Jack Krupansky
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>
>>> To: "user@cassandra.apache.org"
>>> Subject: Re: EC2 storage options for C*
>>>
>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>> Dense Storage".
>>>
>>> The remaining question is whether any of the "Previous Generation
>>> Instances" should be publicly recommended going forward.
>>>
>>> And whether non-SSD instances should be recommended going forward as
>>> well. sure, technically, someone could use the legacy instances, but the
>>> question is what we should be recommending as best practice going forward.
>>>
>>> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> At the bottom of the instance-types page, there is a link to the
>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>> of which have HDD options.
>>>>
>>>> There are also the d2 (Dense Storage) instances in the current
>>>> generation that include various combos of local HDDs.
>>>>
>>>> The i2 series has good sized SSDs available, and has the advanced
>>>> networking option, which is also useful for Cassandra. The enhanced
>>>> networking is available with other instance types as well, as you'll see on
>>>> the feature list under each type.
>>>>
>>>> Steve
>>>>
>>>>
>>>>
>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>> jack.krupansky@gmail.com> wrote:
>>>>
>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>>> instances have local magnetic storage - all the other instance types are
>>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>>>> Access."
>>>>>
>>>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>>>> instance types.
>>>>>
>>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and
>>>>> gp2 only for the "small to medium databases" use case.
>>>>>
>>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)?
>>>>> Is the doc simply for any newly started instances?
>>>>>
>>>>> See:
>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>> http://aws.amazon.com/ebs/details/
>>>>>
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>>> > wrote:
>>>>>
>>>>>> > My apologies if my questions are actually answered on the video or
>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> Virtually all of them are covered.
>>>>>>
>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>>> is EBS able to avoid network latency?
>>>>>>
>>>>>> Not published, and probably not a straightforward answer (probably
>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>>
>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>
>>>>>> We tested dozens of instance type/size combinations (literally). The
>>>>>> best performance was clearly with ebs-optimized instances that also have
>>>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>>>
>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> SSD, GP2 (slide 64)
>>>>>>
>>>>>> > What info is available on EBS performance at peak times, when
>>>>>> multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Not published, but experiments show that we can hit 10k iops all day
>>>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>>>> real cluster (slide 58)
>>>>>>
>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you
>>>>>> need more than 10k, you can stripe volumes together up to the ebs network
>>>>>> link max) (hinted at in slide 64)
>>>>>>
>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>>>> The volume-specific issues seem to be less common than the instance-store
>>>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>>>> AWS region or cloud vendor.
>>>>>>
>>>>>> > For multi-data center operation, what configuration options assure
>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> It used to be true that EBS control plane for a given region spanned
>>>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>>>
>>>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>> guarantee is needed.
>>>>>>
>>>>>> Most of the answers in this block are “probably not 100%, you should
>>>>>> be writing to more than one host/AZ/DC/vendor to protect your organization
>>>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>>>> those goals (at least based with the petabytes of data we have on gp2
>>>>>> volumes).
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Jack Krupansky
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> I'm not a fan of guy - this appears to be the slideshare
>>>>>> corresponding to the video:
>>>>>>
>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>
>>>>>> My apologies if my questions are actually answered on the video or
>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> I'm curious where the EBS physical devices actually reside - are they
>>>>>> in the same rack, the same data center, same availability zone? I mean,
>>>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>>>> able to avoid network latency?
>>>>>>
>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>
>>>>>> SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> What info is available on EBS performance at peak times, when
>>>>>> multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> For multi-data center operation, what configuration options assure
>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> In terms of syncing data for the commit log, if the OS call to sync
>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>>> including when the two are on different volumes? In practice, we would like
>>>>>> some significant degree of pipelining of data, such as during the full
>>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>>> guarantee is needed.
>>>>>>
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Jeff,
>>>>>>>
>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>
>>>>>>>
>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>>>> the same as 2011 EBS.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Jirsa
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>>
>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>
>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jeff Jirsa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>> write-intensive workloads?
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>>> necessary.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: John Wong
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>>>
>>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>> production cluster?
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>>>> of disk performance you need?". If you need the performance, its hard to
>>>>>>>>>>> beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>
>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>
>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>>> otherwise.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>
>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is this: Should
>>>>>>>>>>>> we put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Eric
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Steve Robenalt
>>>> Software Architect
>>>> srobenalt@highwire.org <bz...@highwire.org>
>>>> (office/cell): 916-505-1785
>>>>
>>>> HighWire Press, Inc.
>>>> 425 Broadway St, Redwood City, CA 94063
>>>> www.highwire.org
>>>>
>>>> Technology for Scholarly Communication
>>>>
>>>
>>>
>> --
> Ben Bromhead
> CTO | Instaclustr
> +1 650 284 9692
>

Re: EC2 storage options for C*

Posted by Ben Bromhead <be...@instaclustr.com>.
For what it's worth, we've tried d2 instances and they encourage terrible
things like super-dense nodes (which increases your replacement time). In
terms of usable storage I would go with gp2 EBS on an m4-based instance.
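
A rough boto3 sketch of that kind of node, purely for illustration (the
AMI, key pair, subnet and sizes below are placeholders, not values from
this thread):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch an EBS-optimized m4 node with a separate gp2 data volume for
    # the Cassandra data directory. Every ID below is a placeholder.
    resp = ec2.run_instances(
        ImageId="ami-xxxxxxxx",            # placeholder AMI
        InstanceType="m4.2xlarge",
        KeyName="my-keypair",              # placeholder key pair
        SubnetId="subnet-xxxxxxxx",        # placeholder subnet
        MinCount=1,
        MaxCount=1,
        EbsOptimized=True,
        BlockDeviceMappings=[{
            "DeviceName": "/dev/xvdb",
            "Ebs": {
                "VolumeSize": 1000,        # GiB; size to your data set
                "VolumeType": "gp2",
                "DeleteOnTermination": False,
            },
        }],
    )
    print(resp["Instances"][0]["InstanceId"])

You would still format and mount /dev/xvdb and point data_file_directories
at it as usual.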

On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <ja...@gmail.com> wrote:

> Ah, yes, the good old days of m1.large.
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com>
> wrote:
>
>> A lot of people use the old gen instances (m1 in particular) because they
>> came with a ton of effectively free ephemeral storage (up to 1.6TB).
>> Whether or not they’re viable is a decision for each user to make. They’re
>> very, very commonly used for C*, though. At a time when EBS was not
>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>> standard.
>>
>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>> compelling argument to use m4 or c4 instead of i2. There exists a company
>> we know currently testing d2 at scale, though I’m not sure they have much
>> in terms of concrete results at this time.
>>
>> - Jeff
>>
>> From: Jack Krupansky
>> Reply-To: "user@cassandra.apache.org"
>> Date: Monday, February 1, 2016 at 1:55 PM
>>
>> To: "user@cassandra.apache.org"
>> Subject: Re: EC2 storage options for C*
>>
>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>> Dense Storage".
>>
>> The remaining question is whether any of the "Previous Generation
>> Instances" should be publicly recommended going forward.
>>
>> And whether non-SSD instances should be recommended going forward as
>> well. sure, technically, someone could use the legacy instances, but the
>> question is what we should be recommending as best practice going forward.
>>
>> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
>> wrote:
>>
>>> Hi Jack,
>>>
>>> At the bottom of the instance-types page, there is a link to the
>>> previous generations, which includes the older series (m1, m2, etc), many
>>> of which have HDD options.
>>>
>>> There are also the d2 (Dense Storage) instances in the current
>>> generation that include various combos of local HDDs.
>>>
>>> The i2 series has good sized SSDs available, and has the advanced
>>> networking option, which is also useful for Cassandra. The enhanced
>>> networking is available with other instance types as well, as you'll see on
>>> the feature list under each type.
>>>
>>> Steve
>>>
>>>
>>>
>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <jack.krupansky@gmail.com
>>> > wrote:
>>>
>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>> instances have local magnetic storage - all the other instance types are
>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>>> Access."
>>>>
>>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>>> instance types.
>>>>
>>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2
>>>> only for the "small to medium databases" use case.
>>>>
>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
>>>> the doc simply for any newly started instances?
>>>>
>>>> See:
>>>> https://aws.amazon.com/ec2/instance-types/
>>>> http://aws.amazon.com/ebs/details/
>>>>
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
>>>> wrote:
>>>>
>>>>> > My apologies if my questions are actually answered on the video or
>>>>> slides, I just did a quick scan of the slide text.
>>>>>
>>>>> Virtually all of them are covered.
>>>>>
>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>> they in the same rack, the same data center, same availability zone? I
>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>> is EBS able to avoid network latency?
>>>>>
>>>>> Not published,and probably not a straight forward answer (probably
>>>>> have redundancy cross-az, if it matches some of their other published
>>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>>> Some instance types are optimized for dedicated, ebs-only network
>>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>
>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>
>>>>> We tested dozens of instance type/size combinations (literally). The
>>>>> best performance was clearly with ebs-optimized instances that also have
>>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>>
>>>>> > SSD or magnetic or does it make any difference?
>>>>>
>>>>> SSD, GP2 (slide 64)
>>>>>
>>>>> > What info is available on EBS performance at peak times, when
>>>>> multiple AWS customers have spikes of demand?
>>>>>
>>>>> Not published, but experiments show that we can hit 10k iops all day
>>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>>> real cluster (slide 58)
>>>>>
>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>
>>>>> You can use RAID to get higher IOPS than you’d normally get by default
>>>>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
>>>>> than 10k, you can stripe volumes together up to the ebs network link max)
>>>>> (hinted at in slide 64)
>>>>>
>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>
>>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>>> The volume-specific issues seem to be less common than the instance-store
>>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>>> AWS region or cloud vendor.
>>>>>
>>>>> > For multi-data center operation, what configuration options assure
>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>
>>>>> It used to be true that EBS control plane for a given region spanned
>>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>>
>>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>> including when the two are on different volumes? In practice, we would like
>>>>> some significant degree of pipelining of data, such as during the full
>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>> guarantee is needed.
>>>>>
>>>>> Most of the answers in this block are “probably not 100%, you should
>>>>> be writing to more than one host/AZ/DC/vendor to protect your organization
>>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>>> those goals (at least based with the petabytes of data we have on gp2
>>>>> volumes).
>>>>>
>>>>>
>>>>>
>>>>> From: Jack Krupansky
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: Re: EC2 storage options for C*
>>>>>
>>>>> I'm not a fan of guy - this appears to be the slideshare corresponding
>>>>> to the video:
>>>>>
>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>
>>>>> My apologies if my questions are actually answered on the video or
>>>>> slides, I just did a quick scan of the slide text.
>>>>>
>>>>> I'm curious where the EBS physical devices actually reside - are they
>>>>> in the same rack, the same data center, same availability zone? I mean,
>>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>>> able to avoid network latency?
>>>>>
>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>
>>>>> SSD or magnetic or does it make any difference?
>>>>>
>>>>> What info is available on EBS performance at peak times, when multiple
>>>>> AWS customers have spikes of demand?
>>>>>
>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>
>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>>>> properly configured Cassandra cluster RF provides HA, so what is the
>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>
>>>>> For multi-data center operation, what configuration options assure
>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>
>>>>> In terms of syncing data for the commit log, if the OS call to sync an
>>>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>>> which the EBS volumes reside will still guarantee availability of the
>>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>> including when the two are on different volumes? In practice, we would like
>>>>> some significant degree of pipelining of data, such as during the full
>>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>>> guarantee is needed.
>>>>>
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Jeff,
>>>>>>
>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>
>>>>>>
>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>>> the same as 2011 EBS.
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Jirsa
>>>>>>>
>>>>>>>
>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>
>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>
>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Jirsa
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>
>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>> write-intensive workloads?
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>
>>>>>>>>> Hi John,
>>>>>>>>>
>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>>> necessary.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: John Wong
>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>
>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>>
>>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>> production cluster?
>>>>>>>>>
>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>>> of disk performance you need?". If you need the performance, its hard to
>>>>>>>>>> beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>
>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>
>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>> otherwise.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>
>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is this: Should
>>>>>>>>>>> we put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Eric
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Steve Robenalt
>>> Software Architect
>>> srobenalt@highwire.org <bz...@highwire.org>
>>> (office/cell): 916-505-1785
>>>
>>> HighWire Press, Inc.
>>> 425 Broadway St, Redwood City, CA 94063
>>> www.highwire.org
>>>
>>> Technology for Scholarly Communication
>>>
>>
>>
> --
Ben Bromhead
CTO | Instaclustr
+1 650 284 9692

Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
Ah, yes, the good old days of m1.large.

-- Jack Krupansky

On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> A lot of people use the old gen instances (m1 in particular) because they
> came with a ton of effectively free ephemeral storage (up to 1.6TB).
> Whether or not they’re viable is a decision for each user to make. They’re
> very, very commonly used for C*, though. At a time when EBS was not
> sufficiently robust or reliable, a cluster of m1 instances was the de facto
> standard.
>
> The canonical “best practice” in 2015 was i2. We believe we’ve made a
> compelling argument to use m4 or c4 instead of i2. There exists a company
> we know currently testing d2 at scale, though I’m not sure they have much
> in terms of concrete results at this time.
>
> - Jeff
>
> From: Jack Krupansky
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, February 1, 2016 at 1:55 PM
>
> To: "user@cassandra.apache.org"
> Subject: Re: EC2 storage options for C*
>
> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
> Dense Storage".
>
> The remaining question is whether any of the "Previous Generation
> Instances" should be publicly recommended going forward.
>
> And whether non-SSD instances should be recommended going forward as well.
> sure, technically, someone could use the legacy instances, but the question
> is what we should be recommending as best practice going forward.
>
> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
> wrote:
>
>> Hi Jack,
>>
>> At the bottom of the instance-types page, there is a link to the previous
>> generations, which includes the older series (m1, m2, etc), many of which
>> have HDD options.
>>
>> There are also the d2 (Dense Storage) instances in the current generation
>> that include various combos of local HDDs.
>>
>> The i2 series has good sized SSDs available, and has the advanced
>> networking option, which is also useful for Cassandra. The enhanced
>> networking is available with other instance types as well, as you'll see on
>> the feature list under each type.
>>
>> Steve
>>
>>
>>
>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com>
>> wrote:
>>
>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>> instances have local magnetic storage - all the other instance types are
>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>> Access."
>>>
>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>> instance types.
>>>
>>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2
>>> only for the "small to medium databases" use case.
>>>
>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
>>> the doc simply for any newly started instances?
>>>
>>> See:
>>> https://aws.amazon.com/ec2/instance-types/
>>> http://aws.amazon.com/ebs/details/
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> > My apologies if my questions are actually answered on the video or
>>>> slides, I just did a quick scan of the slide text.
>>>>
>>>> Virtually all of them are covered.
>>>>
>>>> > I'm curious where the EBS physical devices actually reside - are they
>>>> in the same rack, the same data center, same availability zone? I mean,
>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>> able to avoid network latency?
>>>>
>>>> Not published,and probably not a straight forward answer (probably have
>>>> redundancy cross-az, if it matches some of their other published
>>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>>> Some instance types are optimized for dedicated, ebs-only network
>>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>>> sure is to test it yourself and see if observed latency is acceptable (or
>>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>>
>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>
>>>> We tested dozens of instance type/size combinations (literally). The
>>>> best performance was clearly with ebs-optimized instances that also have
>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>
>>>> > SSD or magnetic or does it make any difference?
>>>>
>>>> SSD, GP2 (slide 64)
>>>>
>>>> > What info is available on EBS performance at peak times, when
>>>> multiple AWS customers have spikes of demand?
>>>>
>>>> Not published, but experiments show that we can hit 10k iops all day
>>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>>> real cluster (slide 58)
>>>>
>>>> > Is RAID much of a factor or help at all using EBS?
>>>>
>>>> You can use RAID to get higher IOPS than you’d normally get by default
>>>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
>>>> than 10k, you can stripe volumes together up to the ebs network link max)
>>>> (hinted at in slide 64)
>>>>
>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>> three EBS volumes aren't all in the same physical rack?
>>>>
>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>> The volume-specific issues seem to be less common than the instance-store
>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>>> AWS region or cloud vendor.
>>>>
>>>> > For multi-data center operation, what configuration options assure
>>>> that the EBS volumes for each DC are truly physically separated?
>>>>
>>>> It used to be true that EBS control plane for a given region spanned
>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>
>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>> an EBS volume returns, is the commit log data absolutely 100% synced at the
>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>> which the EBS volumes reside will still guarantee availability of the
>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>> sstable durability when Cassandra is about to delete the commit log,
>>>> including when the two are on different volumes? In practice, we would like
>>>> some significant degree of pipelining of data, such as during the full
>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>> guarantee is needed.
>>>>
>>>> Most of the answers in this block are “probably not 100%, you should be
>>>> writing to more than one host/AZ/DC/vendor to protect your organization
>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>> those goals (at least based with the petabytes of data we have on gp2
>>>> volumes).
>>>>
>>>>
>>>>
>>>> From: Jack Krupansky
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>>
>>>> I'm not a fan of guy - this appears to be the slideshare corresponding
>>>> to the video:
>>>>
>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>
>>>> My apologies if my questions are actually answered on the video or
>>>> slides, I just did a quick scan of the slide text.
>>>>
>>>> I'm curious where the EBS physical devices actually reside - are they
>>>> in the same rack, the same data center, same availability zone? I mean,
>>>> people try to minimize network latency between nodes, so how exactly is EBS
>>>> able to avoid network latency?
>>>>
>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>
>>>> SSD or magnetic or does it make any difference?
>>>>
>>>> What info is available on EBS performance at peak times, when multiple
>>>> AWS customers have spikes of demand?
>>>>
>>>> Is RAID much of a factor or help at all using EBS?
>>>>
>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>>> properly configured Cassandra cluster RF provides HA, so what is the
>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>> three EBS volumes aren't all in the same physical rack?
>>>>
>>>> For multi-data center operation, what configuration options assure that
>>>> the EBS volumes for each DC are truly physically separated?
>>>>
>>>> In terms of syncing data for the commit log, if the OS call to sync an
>>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>>> hardware level on the EBS end, such that a power failure of the systems on
>>>> which the EBS volumes reside will still guarantee availability of the
>>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>>> sstable durability when Cassandra is about to delete the commit log,
>>>> including when the two are on different volumes? In practice, we would like
>>>> some significant degree of pipelining of data, such as during the full
>>>> processing of flushing memtables, but for the fsync at the end a solid
>>>> guarantee is needed.
>>>>
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>>> wrote:
>>>>
>>>>> Jeff,
>>>>>
>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>> discounting EBS, but prior outages are worrisome.
>>>>>
>>>>>
>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>> wrote:
>>>>>
>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>> the same as 2011 EBS.
>>>>>>
>>>>>> --
>>>>>> Jeff Jirsa
>>>>>>
>>>>>>
>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>>>>
>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral.
>>>>>> GP2 after testing is a viable contender for our workload. The only worry I
>>>>>> have is EBS outages, which have happened.
>>>>>>
>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Also in that video - it's long but worth watching
>>>>>>>
>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Jirsa
>>>>>>>
>>>>>>>
>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>
>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>> write-intensive workloads?
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>>> necessary.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> From: John Wong
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>
>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>> storage) if you have running a lot of transaction.
>>>>>>>> However, for regular small testing/qa cluster, or something you
>>>>>>>> know you want to reload often, EBS is definitely good enough and we haven't
>>>>>>>> had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>>
>>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through
>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>> production cluster?
>>>>>>>>
>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <bryan@blockcypher.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>> of disk performance you need?". If you need the performance, its hard to
>>>>>>>>> beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>
>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>> our choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>>
>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>>
>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: Eric Plowe
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Eric
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>
>>
>> --
>> Steve Robenalt
>> Software Architect
>> srobenalt@highwire.org <bz...@highwire.org>
>> (office/cell): 916-505-1785
>>
>> HighWire Press, Inc.
>> 425 Broadway St, Redwood City, CA 94063
>> www.highwire.org
>>
>> Technology for Scholarly Communication
>>
>
>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
A lot of people use the old gen instances (m1 in particular) because they came with a ton of effectively free ephemeral storage (up to 1.6TB). Whether or not they’re viable is a decision for each user to make. They’re very, very commonly used for C*, though. At a time when EBS was not sufficiently robust or reliable, a cluster of m1 instances was the de facto standard. 

The canonical “best practice” in 2015 was i2. We believe we’ve made a compelling argument to use m4 or c4 instead of i2. There exists a company we know currently testing d2 at scale, though I’m not sure they have much in terms of concrete results at this time. 

- Jeff

From:  Jack Krupansky
Reply-To:  "user@cassandra.apache.org"
Date:  Monday, February 1, 2016 at 1:55 PM
To:  "user@cassandra.apache.org"
Subject:  Re: EC2 storage options for C*

Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2 Dense Storage". 

The remaining question is whether any of the "Previous Generation Instances" should be publicly recommended going forward.

And whether non-SSD instances should be recommended going forward as well. sure, technically, someone could use the legacy instances, but the question is what we should be recommending as best practice going forward.

Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.

-- Jack Krupansky

On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org> wrote:
Hi Jack, 

At the bottom of the instance-types page, there is a link to the previous generations, which includes the older series (m1, m2, etc), many of which have HDD options. 

There are also the d2 (Dense Storage) instances in the current generation that include various combos of local HDDs.

The i2 series has good sized SSDs available, and has the advanced networking option, which is also useful for Cassandra. The enhanced networking is available with other instance types as well, as you'll see on the feature list under each type. 

Steve



On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com> wrote:
Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic question, it seems like magnetic (HDD) is no longer a recommended storage option for databases on AWS. In particular, only the C2 Dense Storage instances have local magnetic storage - all the other instance types are SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data Access." 

For the record, that AWS doc has Cassandra listed as a use case for i2 instance types.

Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2 only for the "small to medium databases" use case.

Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is the doc simply for any newly started instances?

See:
https://aws.amazon.com/ec2/instance-types/
http://aws.amazon.com/ebs/details/


-- Jack Krupansky

On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
> My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.

Virtually all of them are covered.

> I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?

Not published, and probably not a straightforward answer (they probably have redundancy cross-AZ, if it matches some of their other published behaviors). The promise they give you is ‘iops’, with a certain block size. Some instance types are optimized for dedicated, EBS-only network interfaces. Like most things in Cassandra / cloud, the only way to know for sure is to test it yourself and see if observed latency is acceptable (or trust our testing, if you assume we’re sufficiently smart and honest).
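
One cheap way to get a feel for observed latency yourself is to time small synchronous writes against the mounted volume. A minimal Python sketch, assuming the volume is mounted at /mnt/ebs (the path and sample count are arbitrary):

    import os, time, statistics

    PATH = "/mnt/ebs/latency_probe.bin"   # assumes the EBS volume is mounted here
    SAMPLES = 1000
    latencies_ms = []

    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        for _ in range(SAMPLES):
            start = time.perf_counter()
            os.write(fd, b"x" * 4096)     # one 4 KiB write, roughly one I/O
            os.fsync(fd)                  # push it all the way to the volume
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.unlink(PATH)

    latencies_ms.sort()
    print("p50 %.2f ms  p99 %.2f ms  mean %.2f ms" % (
        latencies_ms[len(latencies_ms) // 2],
        latencies_ms[int(len(latencies_ms) * 0.99) - 1],
        statistics.mean(latencies_ms)))

A proper fio run tells you much more, but even this will surface obvious throttling or noisy-neighbor problems.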

> Did your test use Amazon EBS–Optimized Instances?

We tested dozens of instance type/size combinations (literally). The best performance was clearly with ebs-optimized instances that also have enhanced networking (c4, m4, etc) - slide 43

> SSD or magnetic or does it make any difference?

SSD, GP2 (slide 64)

> What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

Not published, but experiments show that we can hit 10k iops all day every day with only trivial noisy neighbor problems, not enough to impact a real cluster (slide 58)

> Is RAID much of a factor or help at all using EBS?

You can use RAID to get higher IOPS than you’d normally get by default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more than 10k, you can stripe volumes together up to the ebs network link max) (hinted at in slide 64)
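
For reference, gp2 IOPS scale with volume size (roughly 3 IOPS per GiB, a floor of 100, and the 10k per-volume cap mentioned above), so sizing a stripe set is simple arithmetic. A small Python sketch that treats those numbers as assumptions from this era:

    import math

    IOPS_PER_GIB = 3        # gp2 baseline ratio at the time of this thread
    PER_VOLUME_CAP = 10000  # per-volume gp2 cap referenced above
    BASELINE_MIN = 100      # even tiny gp2 volumes get at least this much

    def gp2_iops(size_gib):
        """Approximate sustained IOPS for a single gp2 volume."""
        return min(max(BASELINE_MIN, IOPS_PER_GIB * size_gib), PER_VOLUME_CAP)

    def volumes_for(target_iops, size_gib):
        """Number of equal gp2 volumes to stripe (RAID 0) for a target IOPS."""
        return math.ceil(target_iops / gp2_iops(size_gib))

    print(gp2_iops(3334))            # 10000: a ~3.3T volume already hits the cap
    print(volumes_for(20000, 3334))  # 2 striped volumes for ~20k IOPS

An mdadm RAID 0 across the resulting volumes then presents them to Cassandra as a single device.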

> How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

There is HA, I’m not sure that AWS publishes specifics. Occasionally specific volumes will have issues (hypervisor’s dedicated ethernet link to EBS network fails, for example). Occasionally instances will have issues. The volume-specific issues seem to be less common than the instance-store “instance retired” or “instance is running on degraded hardware” events. Stop/Start and you’ve recovered (possible with EBS, not possible with instance store). The assurances are in AWS’ SLA – if the SLA is insufficient (and it probably is insufficient), use more than one AZ and/or AWS region or cloud vendor.
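
A minimal boto3 sketch of that stop/start recovery cycle, assuming an EBS-backed node (the instance ID is a placeholder):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    instance_id = "i-0123456789abcdef0"   # placeholder instance ID

    # Stop/Start (not Reboot) typically lands the instance on different
    # underlying hardware; attached EBS volumes survive the cycle.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

Expect the public IP to change across a stop/start unless you use an Elastic IP, which matters for the node's broadcast address.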

> For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

It used to be true that EBS control plane for a given region spanned AZs. That’s no longer true. AWS asserts that failure modes for each AZ are isolated (data may replicate between AZs, but a full outage in us-east-1a shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65

> In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.

Most of the answers in this block are “probably not 100%, you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures”. AWS targets something like 0.1% annual failure rate per volume and 99.999% availability (slide 66). We believe they’re exceeding those goals (at least based on the petabytes of data we have on gp2 volumes).



From: Jack Krupansky
Reply-To: "user@cassandra.apache.org"
Date: Monday, February 1, 2016 at 5:51 AM 

To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

I'm not a fan of video - this appears to be the slideshare corresponding to the video: 
http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second

My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.

I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency? 

Did your test use Amazon EBS–Optimized Instances?

SSD or magnetic or does it make any difference?

What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

Is RAID much of a factor or help at all using EBS?

How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.


-- Jack Krupansky

On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:
Jeff, 

If EBS goes down, then EBS Gp2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome. 


On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
Free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS. 

-- 
Jeff Jirsa


On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:

Thank you all for the suggestions. I'm torn between GP2 and ephemeral. After testing, GP2 is a viable contender for our workload. The only worry I have is EBS outages, which have happened. 

On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
Also in that video - it's long but worth watching

We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory



-- 
Jeff Jirsa


On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com> wrote:

How about reads? Any differences between read-intensive and write-intensive workloads?

-- Jack Krupansky

On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com> wrote:
Hi John,

We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. 
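
A boto3 sketch of provisioning one such volume (the AZ and instance ID are placeholders; ~3.3T is already enough to hit the 10k gp2 cap, so 4T just adds capacity headroom):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    vol = ec2.create_volume(
        AvailabilityZone="us-east-1a",   # must match the instance's AZ
        Size=4000,                       # GiB
        VolumeType="gp2",
    )
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

    ec2.attach_volume(
        VolumeId=vol["VolumeId"],
        InstanceId="i-0123456789abcdef0",  # placeholder instance ID
        Device="/dev/xvdf",
    )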



From: John Wong
Reply-To: "user@cassandra.apache.org"
Date: Saturday, January 30, 2016 at 3:07 PM
To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

For production I'd stick with ephemeral disks (aka instance storage) if you are running a lot of transactions. 
However, for a regular small testing/QA cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99% of the time. The 1% is the kind of anomaly where we have seen flushes blocked.

But Jeff, kudos that you are able to use EBS. I didn't go through the video, do you actually use PIOPS or just standard GP2 in your production cluster?

On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, it's hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid, battle-tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.

Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.

On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.

We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.



From: Eric Plowe
Reply-To: "user@cassandra.apache.org"
Date: Friday, January 29, 2016 at 4:33 PM
To: "user@cassandra.apache.org"
Subject: EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.

Thanks!

Eric








-- 
Steve Robenalt 
Software Architect
srobenalt@highwire.org 
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication



Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2 Dense
Storage".

The remaining question is whether any of the "Previous Generation
Instances" should be publicly recommended going forward.

And whether non-SSD instances should be recommended going forward as well.
Sure, technically, someone could use the legacy instances, but the question
is what we should be recommending as best practice going forward.

Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.

-- Jack Krupansky

On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sr...@highwire.org>
wrote:

> Hi Jack,
>
> At the bottom of the instance-types page, there is a link to the previous
> generations, which includes the older series (m1, m2, etc), many of which
> have HDD options.
>
> There are also the d2 (Dense Storage) instances in the current generation
> that include various combos of local HDDs.
>
> The i2 series has good sized SSDs available, and has the advanced
> networking option, which is also useful for Cassandra. The enhanced
> networking is available with other instance types as well, as you'll see on
> the feature list under each type.
>
> Steve
>
>
>
> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>> question, it seems like magnetic (HDD) is no longer a recommended storage
>> option for databases on AWS. In particular, only the C2 Dense Storage
>> instances have local magnetic storage - all the other instance types are
>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>> Access."
>>
>> For the record, that AWS doc has Cassandra listed as a use case for i2
>> instance types.
>>
>> Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2
>> only for the "small to medium databases" use case.
>>
>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
>> the doc simply for any newly started instances?
>>
>> See:
>> https://aws.amazon.com/ec2/instance-types/
>> http://aws.amazon.com/ebs/details/
>>
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> > My apologies if my questions are actually answered on the video or
>>> slides, I just did a quick scan of the slide text.
>>>
>>> Virtually all of them are covered.
>>>
>>> > I'm curious where the EBS physical devices actually reside - are they
>>> in the same rack, the same data center, same availability zone? I mean,
>>> people try to minimize network latency between nodes, so how exactly is EBS
>>> able to avoid network latency?
>>>
>>> Not published,and probably not a straight forward answer (probably have
>>> redundancy cross-az, if it matches some of their other published
>>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>>> Some instance types are optimized for dedicated, ebs-only network
>>> interfaces. Like most things in cassandra / cloud, the only way to know for
>>> sure is to test it yourself and see if observed latency is acceptable (or
>>> trust our testing, if you assume we’re sufficiently smart and honest).
>>>
>>> > Did your test use Amazon EBS–Optimized Instances?
>>>
>>> We tested dozens of instance type/size combinations (literally). The
>>> best performance was clearly with ebs-optimized instances that also have
>>> enhanced networking (c4, m4, etc) - slide 43
>>>
>>> > SSD or magnetic or does it make any difference?
>>>
>>> SSD, GP2 (slide 64)
>>>
>>> > What info is available on EBS performance at peak times, when multiple
>>> AWS customers have spikes of demand?
>>>
>>> Not published, but experiments show that we can hit 10k iops all day
>>> every day with only trivial noisy neighbor problems, not enough to impact a
>>> real cluster (slide 58)
>>>
>>> > Is RAID much of a factor or help at all using EBS?
>>>
>>> You can use RAID to get higher IOPS than you’d normally get by default
>>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
>>> than 10k, you can stripe volumes together up to the ebs network link max)
>>> (hinted at in slide 64)
>>>
>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>> properly configured Cassandra cluster RF provides HA, so what is the
>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>> three EBS volumes aren't all in the same physical rack?
>>>
>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>> EBS network fails, for example). Occasionally instances will have issues.
>>> The volume-specific issues seem to be less common than the instance-store
>>> “instance retired” or “instance is running on degraded hardware” events.
>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>> insufficient (and it probably is insufficient), use more than one AZ and/or
>>> AWS region or cloud vendor.
>>>
>>> > For multi-data center operation, what configuration options assure
>>> that the EBS volumes for each DC are truly physically separated?
>>>
>>> It used to be true that EBS control plane for a given region spanned
>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>
>>> > In terms of syncing data for the commit log, if the OS call to sync an
>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>> hardware level on the EBS end, such that a power failure of the systems on
>>> which the EBS volumes reside will still guarantee availability of the
>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>> sstable durability when Cassandra is about to delete the commit log,
>>> including when the two are on different volumes? In practice, we would like
>>> some significant degree of pipelining of data, such as during the full
>>> processing of flushing memtables, but for the fsync at the end a solid
>>> guarantee is needed.
>>>
>>> Most of the answers in this block are “probably not 100%, you should be
>>> writing to more than one host/AZ/DC/vendor to protect your organization
>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>> those goals (at least based with the petabytes of data we have on gp2
>>> volumes).
>>>
>>>
>>>
>>> From: Jack Krupansky
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>
>>> To: "user@cassandra.apache.org"
>>> Subject: Re: EC2 storage options for C*
>>>
>>> I'm not a fan of guy - this appears to be the slideshare corresponding
>>> to the video:
>>>
>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>
>>> My apologies if my questions are actually answered on the video or
>>> slides, I just did a quick scan of the slide text.
>>>
>>> I'm curious where the EBS physical devices actually reside - are they in
>>> the same rack, the same data center, same availability zone? I mean, people
>>> try to minimize network latency between nodes, so how exactly is EBS able
>>> to avoid network latency?
>>>
>>> Did your test use Amazon EBS–Optimized Instances?
>>>
>>> SSD or magnetic or does it make any difference?
>>>
>>> What info is available on EBS performance at peak times, when multiple
>>> AWS customers have spikes of demand?
>>>
>>> Is RAID much of a factor or help at all using EBS?
>>>
>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>> properly configured Cassandra cluster RF provides HA, so what is the
>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>> three EBS volumes aren't all in the same physical rack?
>>>
>>> For multi-data center operation, what configuration options assure that
>>> the EBS volumes for each DC are truly physically separated?
>>>
>>> In terms of syncing data for the commit log, if the OS call to sync an
>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>> hardware level on the EBS end, such that a power failure of the systems on
>>> which the EBS volumes reside will still guarantee availability of the
>>> fsynced data. As well, is return from fsync an absolute guarantee of
>>> sstable durability when Cassandra is about to delete the commit log,
>>> including when the two are on different volumes? In practice, we would like
>>> some significant degree of pipelining of data, such as during the full
>>> processing of flushing memtables, but for the fsync at the end a solid
>>> guarantee is needed.
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com>
>>> wrote:
>>>
>>>> Jeff,
>>>>
>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>> discounting EBS, but prior outages are worrisome.
>>>>
>>>>
>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>> wrote:
>>>>
>>>>> Free to choose what you'd like, but EBS outages were also addressed in
>>>>> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
>>>>> same as 2011 EBS.
>>>>>
>>>>> --
>>>>> Jeff Jirsa
>>>>>
>>>>>
>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>>>
>>>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral.
>>>>> GP2 after testing is a viable contender for our workload. The only worry I
>>>>> have is EBS outages, which have happened.
>>>>>
>>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>>> wrote:
>>>>>
>>>>>> Also in that video - it's long but worth watching
>>>>>>
>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>> ensure we weren't "just" reading from memory
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jeff Jirsa
>>>>>>
>>>>>>
>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> How about reads? Any differences between read-intensive and
>>>>>> write-intensive workloads?
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>
>>>>>>> Hi John,
>>>>>>>
>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>>> necessary.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: John Wong
>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>> To: "user@cassandra.apache.org"
>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>
>>>>>>> For production I'd stick with ephemeral disks (aka instance storage)
>>>>>>> if you have running a lot of transaction.
>>>>>>> However, for regular small testing/qa cluster, or something you know
>>>>>>> you want to reload often, EBS is definitely good enough and we haven't had
>>>>>>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>>
>>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>>>>>>> video, do you actually use PIOPS or just standard GP2 in your production
>>>>>>> cluster?
>>>>>>>
>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>>>>> disk performance you need?". If you need the performance, its hard to beat
>>>>>>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>
>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>>
>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>
>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>>
>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: Eric Plowe
>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>
>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>>> the performance we are seeing thus far.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>
>
> --
> Steve Robenalt
> Software Architect
> srobenalt@highwire.org <bz...@highwire.org>
> (office/cell): 916-505-1785
>
> HighWire Press, Inc.
> 425 Broadway St, Redwood City, CA 94063
> www.highwire.org
>
> Technology for Scholarly Communication
>

Re: EC2 storage options for C*

Posted by Steve Robenalt <sr...@highwire.org>.
Hi Jack,

At the bottom of the instance-types page, there is a link to the previous
generations, which includes the older series (m1, m2, etc), many of which
have HDD options.

There are also the d2 (Dense Storage) instances in the current generation
that include various combos of local HDDs.

The i2 series has good-sized SSDs available and offers the enhanced
networking option, which is also useful for Cassandra. Enhanced networking
is available with other instance types as well, as you'll see in the
feature list under each type.
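
If you do go the instance-store route, the usual pattern for Cassandra is to
stripe the ephemeral SSDs into RAID 0 for the data directory. A minimal
sketch of that setup follows (device names, filesystem and mount point are
assumptions, not anything specific to this thread; run as root):

import subprocess

DEVICES = ["/dev/xvdb", "/dev/xvdc"]   # the two ephemeral SSDs (names vary by instance)
MD_DEVICE = "/dev/md0"
MOUNT_POINT = "/var/lib/cassandra"

def run(cmd):
    # Echo then execute each step so failures are easy to spot.
    print("+", " ".join(cmd))
    subprocess.check_call(cmd)

run(["mdadm", "--create", MD_DEVICE, "--level=0",
     "--raid-devices={}".format(len(DEVICES))] + DEVICES)
run(["mkfs.ext4", MD_DEVICE])
run(["mkdir", "-p", MOUNT_POINT])
run(["mount", MD_DEVICE, MOUNT_POINT])

Keep in mind that instance-store data does not survive a stop/start, so the
array has to be rebuilt and the node replaced or repaired if the instance
goes away.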

Steve



On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <ja...@gmail.com>
wrote:

> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
> question, it seems like magnetic (HDD) is no longer a recommended storage
> option for databases on AWS. In particular, only the C2 Dense Storage
> instances have local magnetic storage - all the other instance types are
> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
> Access."
>
> For the record, that AWS doc has Cassandra listed as a use case for i2
> instance types.
>
> Also, the AWS doc lists EBS io2 for the NoSQL database use case and gp2
> only for the "small to medium databases" use case.
>
> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
> the doc simply for any newly started instances?
>
> See:
> https://aws.amazon.com/ec2/instance-types/
> http://aws.amazon.com/ebs/details/
>
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
> wrote:
>
>> > My apologies if my questions are actually answered on the video or
>> slides, I just did a quick scan of the slide text.
>>
>> Virtually all of them are covered.
>>
>> > I'm curious where the EBS physical devices actually reside - are they
>> in the same rack, the same data center, same availability zone? I mean,
>> people try to minimize network latency between nodes, so how exactly is EBS
>> able to avoid network latency?
>>
>> Not published,and probably not a straight forward answer (probably have
>> redundancy cross-az, if it matches some of their other published
>> behaviors). The promise they give you is ‘iops’, with a certain block size.
>> Some instance types are optimized for dedicated, ebs-only network
>> interfaces. Like most things in cassandra / cloud, the only way to know for
>> sure is to test it yourself and see if observed latency is acceptable (or
>> trust our testing, if you assume we’re sufficiently smart and honest).
>>
>> > Did your test use Amazon EBS–Optimized Instances?
>>
>> We tested dozens of instance type/size combinations (literally). The best
>> performance was clearly with ebs-optimized instances that also have
>> enhanced networking (c4, m4, etc) - slide 43
>>
>> > SSD or magnetic or does it make any difference?
>>
>> SSD, GP2 (slide 64)
>>
>> > What info is available on EBS performance at peak times, when multiple
>> AWS customers have spikes of demand?
>>
>> Not published, but experiments show that we can hit 10k iops all day
>> every day with only trivial noisy neighbor problems, not enough to impact a
>> real cluster (slide 58)
>>
>> > Is RAID much of a factor or help at all using EBS?
>>
>> You can use RAID to get higher IOPS than you’d normally get by default
>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
>> than 10k, you can stripe volumes together up to the ebs network link max)
>> (hinted at in slide 64)
>>
>> > How exactly is EBS provisioned in terms of its own HA - I mean, with a
>> properly configured Cassandra cluster RF provides HA, so what is the
>> equivalent for EBS? If I have RF=3, what assurance is there that those
>> three EBS volumes aren't all in the same physical rack?
>>
>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>> EBS network fails, for example). Occasionally instances will have issues.
>> The volume-specific issues seem to be less common than the instance-store
>> “instance retired” or “instance is running on degraded hardware” events.
>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>> instance store). The assurances are in AWS’ SLA – if the SLA is
>> insufficient (and it probably is insufficient), use more than one AZ and/or
>> AWS region or cloud vendor.
>>
>> > For multi-data center operation, what configuration options assure that
>> the EBS volumes for each DC are truly physically separated?
>>
>> It used to be true that EBS control plane for a given region spanned AZs.
>> That’s no longer true. AWS asserts that failure modes for each AZ are
>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>
>> > In terms of syncing data for the commit log, if the OS call to sync an
>> EBS volume returns, is the commit log data absolutely 100% synced at the
>> hardware level on the EBS end, such that a power failure of the systems on
>> which the EBS volumes reside will still guarantee availability of the
>> fsynced data. As well, is return from fsync an absolute guarantee of
>> sstable durability when Cassandra is about to delete the commit log,
>> including when the two are on different volumes? In practice, we would like
>> some significant degree of pipelining of data, such as during the full
>> processing of flushing memtables, but for the fsync at the end a solid
>> guarantee is needed.
>>
>> Most of the answers in this block are “probably not 100%, you should be
>> writing to more than one host/AZ/DC/vendor to protect your organization
>> from failures”. AWS targets something like 0.1% annual failure rate per
>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>> those goals (at least based with the petabytes of data we have on gp2
>> volumes).
>>
>>
>>
>> From: Jack Krupansky
>> Reply-To: "user@cassandra.apache.org"
>> Date: Monday, February 1, 2016 at 5:51 AM
>>
>> To: "user@cassandra.apache.org"
>> Subject: Re: EC2 storage options for C*
>>
>> I'm not a fan of guy - this appears to be the slideshare corresponding to
>> the video:
>>
>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>
>> My apologies if my questions are actually answered on the video or
>> slides, I just did a quick scan of the slide text.
>>
>> I'm curious where the EBS physical devices actually reside - are they in
>> the same rack, the same data center, same availability zone? I mean, people
>> try to minimize network latency between nodes, so how exactly is EBS able
>> to avoid network latency?
>>
>> Did your test use Amazon EBS–Optimized Instances?
>>
>> SSD or magnetic or does it make any difference?
>>
>> What info is available on EBS performance at peak times, when multiple
>> AWS customers have spikes of demand?
>>
>> Is RAID much of a factor or help at all using EBS?
>>
>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>> properly configured Cassandra cluster RF provides HA, so what is the
>> equivalent for EBS? If I have RF=3, what assurance is there that those
>> three EBS volumes aren't all in the same physical rack?
>>
>> For multi-data center operation, what configuration options assure that
>> the EBS volumes for each DC are truly physically separated?
>>
>> In terms of syncing data for the commit log, if the OS call to sync an
>> EBS volume returns, is the commit log data absolutely 100% synced at the
>> hardware level on the EBS end, such that a power failure of the systems on
>> which the EBS volumes reside will still guarantee availability of the
>> fsynced data. As well, is return from fsync an absolute guarantee of
>> sstable durability when Cassandra is about to delete the commit log,
>> including when the two are on different volumes? In practice, we would like
>> some significant degree of pipelining of data, such as during the full
>> processing of flushing memtables, but for the fsync at the end a solid
>> guarantee is needed.
>>
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:
>>
>>> Jeff,
>>>
>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>> discounting EBS, but prior outages are worrisome.
>>>
>>>
>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> Free to choose what you'd like, but EBS outages were also addressed in
>>>> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
>>>> same as 2011 EBS.
>>>>
>>>> --
>>>> Jeff Jirsa
>>>>
>>>>
>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>>
>>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral.
>>>> GP2 after testing is a viable contender for our workload. The only worry I
>>>> have is EBS outages, which have happened.
>>>>
>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>>> wrote:
>>>>
>>>>> Also in that video - it's long but worth watching
>>>>>
>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>> ensure we weren't "just" reading from memory
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jeff Jirsa
>>>>>
>>>>>
>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> How about reads? Any differences between read-intensive and
>>>>> write-intensive workloads?
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>>> necessary.
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: John Wong
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> For production I'd stick with ephemeral disks (aka instance storage)
>>>>>> if you have running a lot of transaction.
>>>>>> However, for regular small testing/qa cluster, or something you know
>>>>>> you want to reload often, EBS is definitely good enough and we haven't had
>>>>>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>>
>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>>>>>> video, do you actually use PIOPS or just standard GP2 in your production
>>>>>> cluster?
>>>>>>
>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>>>> disk performance you need?". If you need the performance, its hard to beat
>>>>>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>
>>>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>>
>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>
>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>>
>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> From: Eric Plowe
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>
>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>>> the performance we are seeing thus far.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>
>


-- 
Steve Robenalt
Software Architect
srobenalt@highwire.org <bz...@highwire.org>
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication

Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
question, it seems like magnetic (HDD) is no longer a recommended storage
option for databases on AWS. In particular, only the D2 Dense Storage
instances have local magnetic storage - all the other instance types are
SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
Access."

For the record, that AWS doc has Cassandra listed as a use case for i2
instance types.

Also, the AWS doc lists EBS io1 for the NoSQL database use case and gp2
only for the "small to medium databases" use case.

Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Or does
the doc only cover the instance types available for newly started instances?

See:
https://aws.amazon.com/ec2/instance-types/
http://aws.amazon.com/ebs/details/


-- Jack Krupansky

On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> > My apologies if my questions are actually answered on the video or
> slides, I just did a quick scan of the slide text.
>
> Virtually all of them are covered.
>
> > I'm curious where the EBS physical devices actually reside - are they in
> the same rack, the same data center, same availability zone? I mean, people
> try to minimize network latency between nodes, so how exactly is EBS able
> to avoid network latency?
>
> Not published,and probably not a straight forward answer (probably have
> redundancy cross-az, if it matches some of their other published
> behaviors). The promise they give you is ‘iops’, with a certain block size.
> Some instance types are optimized for dedicated, ebs-only network
> interfaces. Like most things in cassandra / cloud, the only way to know for
> sure is to test it yourself and see if observed latency is acceptable (or
> trust our testing, if you assume we’re sufficiently smart and honest).
>
> > Did your test use Amazon EBS–Optimized Instances?
>
> We tested dozens of instance type/size combinations (literally). The best
> performance was clearly with ebs-optimized instances that also have
> enhanced networking (c4, m4, etc) - slide 43
>
> > SSD or magnetic or does it make any difference?
>
> SSD, GP2 (slide 64)
>
> > What info is available on EBS performance at peak times, when multiple
> AWS customers have spikes of demand?
>
> Not published, but experiments show that we can hit 10k iops all day every
> day with only trivial noisy neighbor problems, not enough to impact a real
> cluster (slide 58)
>
> > Is RAID much of a factor or help at all using EBS?
>
> You can use RAID to get higher IOPS than you’d normally get by default
> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more
> than 10k, you can stripe volumes together up to the ebs network link max)
> (hinted at in slide 64)
>
> > How exactly is EBS provisioned in terms of its own HA - I mean, with a
> properly configured Cassandra cluster RF provides HA, so what is the
> equivalent for EBS? If I have RF=3, what assurance is there that those
> three EBS volumes aren't all in the same physical rack?
>
> There is HA, I’m not sure that AWS publishes specifics. Occasionally
> specific volumes will have issues (hypervisor’s dedicated ethernet link to
> EBS network fails, for example). Occasionally instances will have issues.
> The volume-specific issues seem to be less common than the instance-store
> “instance retired” or “instance is running on degraded hardware” events.
> Stop/Start and you’ve recovered (possible with EBS, not possible with
> instance store). The assurances are in AWS’ SLA – if the SLA is
> insufficient (and it probably is insufficient), use more than one AZ and/or
> AWS region or cloud vendor.
>
> > For multi-data center operation, what configuration options assure that
> the EBS volumes for each DC are truly physically separated?
>
> It used to be true that EBS control plane for a given region spanned AZs.
> That’s no longer true. AWS asserts that failure modes for each AZ are
> isolated (data may replicate between AZs, but a full outage in us-east-1a
> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>
> > In terms of syncing data for the commit log, if the OS call to sync an
> EBS volume returns, is the commit log data absolutely 100% synced at the
> hardware level on the EBS end, such that a power failure of the systems on
> which the EBS volumes reside will still guarantee availability of the
> fsynced data. As well, is return from fsync an absolute guarantee of
> sstable durability when Cassandra is about to delete the commit log,
> including when the two are on different volumes? In practice, we would like
> some significant degree of pipelining of data, such as during the full
> processing of flushing memtables, but for the fsync at the end a solid
> guarantee is needed.
>
> Most of the answers in this block are “probably not 100%, you should be
> writing to more than one host/AZ/DC/vendor to protect your organization
> from failures”. AWS targets something like 0.1% annual failure rate per
> volume and 99.999% availability (slide 66). We believe they’re exceeding
> those goals (at least based with the petabytes of data we have on gp2
> volumes).
>
>
>
> From: Jack Krupansky
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, February 1, 2016 at 5:51 AM
>
> To: "user@cassandra.apache.org"
> Subject: Re: EC2 storage options for C*
>
> I'm not a fan of guy - this appears to be the slideshare corresponding to
> the video:
>
> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>
> My apologies if my questions are actually answered on the video or slides,
> I just did a quick scan of the slide text.
>
> I'm curious where the EBS physical devices actually reside - are they in
> the same rack, the same data center, same availability zone? I mean, people
> try to minimize network latency between nodes, so how exactly is EBS able
> to avoid network latency?
>
> Did your test use Amazon EBS–Optimized Instances?
>
> SSD or magnetic or does it make any difference?
>
> What info is available on EBS performance at peak times, when multiple AWS
> customers have spikes of demand?
>
> Is RAID much of a factor or help at all using EBS?
>
> How exactly is EBS provisioned in terms of its own HA - I mean, with a
> properly configured Cassandra cluster RF provides HA, so what is the
> equivalent for EBS? If I have RF=3, what assurance is there that those
> three EBS volumes aren't all in the same physical rack?
>
> For multi-data center operation, what configuration options assure that
> the EBS volumes for each DC are truly physically separated?
>
> In terms of syncing data for the commit log, if the OS call to sync an EBS
> volume returns, is the commit log data absolutely 100% synced at the
> hardware level on the EBS end, such that a power failure of the systems on
> which the EBS volumes reside will still guarantee availability of the
> fsynced data. As well, is return from fsync an absolute guarantee of
> sstable durability when Cassandra is about to delete the commit log,
> including when the two are on different volumes? In practice, we would like
> some significant degree of pipelining of data, such as during the full
> processing of flushing memtables, but for the fsync at the end a solid
> guarantee is needed.
>
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:
>
>> Jeff,
>>
>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>> discounting EBS, but prior outages are worrisome.
>>
>>
>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> Free to choose what you'd like, but EBS outages were also addressed in
>>> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
>>> same as 2011 EBS.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>>
>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral.
>>> GP2 after testing is a viable contender for our workload. The only worry I
>>> have is EBS outages, which have happened.
>>>
>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> Also in that video - it's long but worth watching
>>>>
>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>> ensure we weren't "just" reading from memory
>>>>
>>>>
>>>>
>>>> --
>>>> Jeff Jirsa
>>>>
>>>>
>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>>>> wrote:
>>>>
>>>> How about reads? Any differences between read-intensive and
>>>> write-intensive workloads?
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>> > wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>>> necessary.
>>>>>
>>>>>
>>>>>
>>>>> From: John Wong
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: Re: EC2 storage options for C*
>>>>>
>>>>> For production I'd stick with ephemeral disks (aka instance storage)
>>>>> if you have running a lot of transaction.
>>>>> However, for regular small testing/qa cluster, or something you know
>>>>> you want to reload often, EBS is definitely good enough and we haven't had
>>>>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>
>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>>>>> video, do you actually use PIOPS or just standard GP2 in your production
>>>>> cluster?
>>>>>
>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>>> wrote:
>>>>>
>>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>>> disk performance you need?". If you need the performance, its hard to beat
>>>>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>
>>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>>> rarely the bottleneck. YMMV, of course.
>>>>>>
>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>
>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>>
>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Eric Plowe
>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>> To: "user@cassandra.apache.org"
>>>>>>> Subject: EC2 storage options for C*
>>>>>>>
>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>>> the performance we are seeing thus far.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
> My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.

Virtually all of them are covered.

> I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?

Not published, and probably not a straightforward answer (probably cross-AZ redundancy, if it matches some of their other published behaviors). The promise they give you is ‘iops’, with a certain block size. Some instance types are optimized for dedicated, ebs-only network interfaces. Like most things in cassandra / cloud, the only way to know for sure is to test it yourself and see if observed latency is acceptable (or trust our testing, if you assume we’re sufficiently smart and honest).

> Did your test use Amazon EBS–Optimized Instances?

We tested dozens of instance type/size combinations (literally). The best performance was clearly with ebs-optimized instances that also have enhanced networking (c4, m4, etc) - slide 43
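
For concreteness, a minimal boto3 sketch of launching that kind of node - an
EBS-optimized m4 with a gp2 data volume - might look like the following (the
AMI ID, device name and sizes are placeholders, not values from our tests):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxx",        # placeholder - use your own HVM AMI
    InstanceType="m4.2xlarge",     # m4/c4: EBS-optimized with enhanced networking
    MinCount=1,
    MaxCount=1,
    EbsOptimized=True,
    BlockDeviceMappings=[
        {
            "DeviceName": "/dev/xvdb",  # data volume for the Cassandra data dir
            "Ebs": {
                "VolumeSize": 3334,     # GB; ~3.3T of gp2 hits the 10k IOPS cap
                "VolumeType": "gp2",
                "DeleteOnTermination": False,
            },
        }
    ],
)
print(response["Instances"][0]["InstanceId"])

Enhanced networking on c4/m4 comes with the instance family as long as you
launch an HVM AMI with the right drivers, so there is no extra flag to set
here.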

> SSD or magnetic or does it make any difference?

SSD, GP2 (slide 64)

> What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

Not published, but experiments show that we can hit 10k iops all day every day with only trivial noisy neighbor problems, not enough to impact a real cluster (slide 58)

> Is RAID much of a factor or help at all using EBS?

You can use RAID to get higher IOPS than you’d normally get by default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more than 10k, you can stripe volumes together up to the ebs network link max) (hinted at in slide 64)
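
To put rough numbers on that: gp2 gives a baseline of 3 IOPS per GB up to the
per-volume cap, so the sizing / striping arithmetic is roughly the following
back-of-the-envelope sketch (using the 10k cap quoted above; aggregate
throughput is still bounded by the instance's EBS link):

GP2_IOPS_PER_GB = 3          # gp2 baseline: 3 IOPS per GB
GP2_VOLUME_IOPS_CAP = 10000  # per-volume cap at the time of this thread

def gp2_baseline_iops(size_gb):
    """Baseline IOPS a single gp2 volume of this size sustains."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_VOLUME_IOPS_CAP)

def volumes_for_target_iops(target_iops, size_gb=3334):
    """How many equal gp2 volumes to stripe (RAID 0) to reach target_iops."""
    per_volume = gp2_baseline_iops(size_gb)
    return -(-target_iops // per_volume)  # ceiling division

print(gp2_baseline_iops(4000))         # 10000 - a 4T volume is already at the cap
print(volumes_for_target_iops(30000))  # 3 - three ~3.3T volumes striped together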

> How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

There is HA, though I’m not sure that AWS publishes specifics. Occasionally specific volumes will have issues (hypervisor’s dedicated ethernet link to the EBS network fails, for example). Occasionally instances will have issues. The volume-specific issues seem to be less common than the instance-store “instance retired” or “instance is running on degraded hardware” events. Stop/Start and you’ve recovered (possible with EBS, not possible with instance store). The assurances are in AWS’ SLA – if the SLA is insufficient (and it probably is insufficient), use more than one AZ and/or AWS region or cloud vendor.

> For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

It used to be true that EBS control plane for a given region spanned AZs. That’s no longer true. AWS asserts that failure modes for each AZ are isolated (data may replicate between AZs, but a full outage in us-east-1a shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65

> In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.

Most of the answers in this block are “probably not 100%, you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures”. AWS targets something like 0.1% annual failure rate per volume and 99.999% availability (slide 66). We believe they’re exceeding those goals (at least based on the petabytes of data we have on gp2 volumes).
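
As a worked example of what that 0.1% annual failure rate means at cluster
scale (simple independent-failure arithmetic, not an AWS guarantee, and it
ignores correlated outages):

AFR = 0.001  # 0.1% annual failure rate per gp2 volume, per the figure above

def p_any_volume_fails(num_volumes, afr=AFR):
    """Chance that at least one of num_volumes fails within a year."""
    return 1 - (1 - afr) ** num_volumes

# A 60-node cluster with one data volume per node:
print(round(p_any_volume_fails(60), 4))  # ~0.0583 - plan on replacing volumes now and then

# Naive chance that all three replicas of a row lose their volumes in the same
# year (ignores repair and replacement, and assumes independence):
print(AFR ** 3)                          # ~1e-09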



From:  Jack Krupansky
Reply-To:  "user@cassandra.apache.org"
Date:  Monday, February 1, 2016 at 5:51 AM
To:  "user@cassandra.apache.org"
Subject:  Re: EC2 storage options for C*

I'm not a fan of guy - this appears to be the slideshare corresponding to the video: 
http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second

My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.

I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency? 

Did your test use Amazon EBS–Optimized Instances?

SSD or magnetic or does it make any difference?

What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

Is RAID much of a factor or help at all using EBS?

How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.


-- Jack Krupansky

On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:
Jeff, 

If EBS goes down, then EBS Gp2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome. 


On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
Free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS. 

-- 
Jeff Jirsa


On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:

Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2 after testing is a viable contender for our workload. The only worry I have is EBS outages, which have happened. 

On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
Also in that video - it's long but worth watching

We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory



-- 
Jeff Jirsa


On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com> wrote:

How about reads? Any differences between read-intensive and write-intensive workloads?

-- Jack Krupansky

On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com> wrote:
Hi John,

We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. 



From: John Wong
Reply-To: "user@cassandra.apache.org"
Date: Saturday, January 30, 2016 at 3:07 PM
To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

For production I'd stick with ephemeral disks (aka instance storage) if you have running a lot of transaction. 
However, for regular small testing/qa cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99%. The 1% is kind of anomaly where we have flush blocked.

But Jeff, kudo that you are able to use EBS. I didn't go through the video, do you actually use PIOPS or just standard GP2 in your production cluster?

On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, its hard to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.

Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.

On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.

We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.



From: Eric Plowe
Reply-To: "user@cassandra.apache.org"
Date: Friday, January 29, 2016 at 4:33 PM
To: "user@cassandra.apache.org"
Subject: EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.

Thanks!

Eric






Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
I'm not a fan of video - this appears to be the slideshare corresponding to
the video:
http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second

My apologies if my questions are actually answered on the video or slides,
I just did a quick scan of the slide text.

I'm curious where the EBS physical devices actually reside - are they in
the same rack, the same data center, same availability zone? I mean, people
try to minimize network latency between nodes, so how exactly is EBS able
to avoid network latency?

Did your test use Amazon EBS–Optimized Instances?

SSD or magnetic or does it make any difference?

What info is available on EBS performance at peak times, when multiple AWS
customers have spikes of demand?

Is RAID much of a factor or help at all using EBS?

How exactly is EBS provisioned in terms of its own HA - I mean, with a
properly configured Cassandra cluster RF provides HA, so what is the
equivalent for EBS? If I have RF=3, what assurance is there that those
three EBS volumes aren't all in the same physical rack?

For multi-data center operation, what configuration options assure that the
EBS volumes for each DC are truly physically separated?

In terms of syncing data for the commit log, if the OS call to sync an EBS
volume returns, is the commit log data absolutely 100% synced at the
hardware level on the EBS end, such that a power failure of the systems on
which the EBS volumes reside will still guarantee availability of the
fsynced data. As well, is return from fsync an absolute guarantee of
sstable durability when Cassandra is about to delete the commit log,
including when the two are on different volumes? In practice, we would like
some significant degree of pipelining of data, such as during the full
processing of flushing memtables, but for the fsync at the end a solid
guarantee is needed.
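
For reference, at the OS level the guarantee being asked about comes down to
fsync returning on both the file and, for a newly created file, its
directory. A generic sketch of that pattern (plain POSIX behaviour, not
Cassandra's actual commit log or flush code):

import os

def durable_write(path, data):
    # fsync the file so the bytes are on the device before we rely on them...
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # ...and fsync the containing directory so the directory entry for a newly
    # created (or renamed) file also survives a power failure.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

durable_write("/var/tmp/commitlog-example.bin", b"payload")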


-- Jack Krupansky

On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <er...@gmail.com> wrote:

> Jeff,
>
> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
> discounting EBS, but prior outages are worrisome.
>
>
> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
> wrote:
>
>> Free to choose what you'd like, but EBS outages were also addressed in
>> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
>> same as 2011 EBS.
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>
>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2
>> after testing is a viable contender for our workload. The only worry I have
>> is EBS outages, which have happened.
>>
>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> Also in that video - it's long but worth watching
>>>
>>> We tested up to 1M reads/second as well, blowing out page cache to
>>> ensure we weren't "just" reading from memory
>>>
>>>
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>>> wrote:
>>>
>>> How about reads? Any differences between read-intensive and
>>> write-intensive workloads?
>>>
>>> -- Jack Krupansky
>>>
>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>> necessary.
>>>>
>>>>
>>>>
>>>> From: John Wong
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>>
>>>> For production I'd stick with ephemeral disks (aka instance storage) if
>>>> you have running a lot of transaction.
>>>> However, for regular small testing/qa cluster, or something you know
>>>> you want to reload often, EBS is definitely good enough and we haven't had
>>>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>
>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>>>> video, do you actually use PIOPS or just standard GP2 in your production
>>>> cluster?
>>>>
>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>> wrote:
>>>>
>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>> disk performance you need?". If you need the performance, its hard to beat
>>>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>
>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>> rarely the bottleneck. YMMV, of course.
>>>>>
>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>
>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>
>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Eric Plowe
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: EC2 storage options for C*
>>>>>>
>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>> the performance we are seeing thus far.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>
>>>>>
>>>>
>>>

Re: EC2 storage options for C*

Posted by Eric Plowe <er...@gmail.com>.
http://m.theregister.co.uk/2013/08/26/amazon_ebs_cloud_problems/

That's what I'm worried about. Granted, that's an article from 2013, and
while the general purpose EBS volumes are performant for a production C*
workload, I'm worried about EBS outages. If EBS is down, my cluster is
down.

On Monday, February 1, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:

> Yes, but getting at why you think EBS is going down is the real point. New
> GM in 2011. Very different product. 35:40 in the video
>
>
> --
> Jeff Jirsa
>
>
> On Jan 31, 2016, at 9:57 PM, Eric Plowe <eric.plowe@gmail.com> wrote:
>
> Jeff,
>
> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
> discounting EBS, but prior outages are worrisome.
>
> On Sunday, January 31, 2016, Jeff Jirsa <jeff.jirsa@crowdstrike.com> wrote:
>
>> Free to choose what you'd like, but EBS outages were also addressed in
>> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
>> same as 2011 EBS.
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>
>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2
>> after testing is a viable contender for our workload. The only worry I have
>> is EBS outages, which have happened.
>>
>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> Also in that video - it's long but worth watching
>>>
>>> We tested up to 1M reads/second as well, blowing out page cache to
>>> ensure we weren't "just" reading from memory
>>>
>>>
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>>> wrote:
>>>
>>> How about reads? Any differences between read-intensive and
>>> write-intensive workloads?
>>>
>>> -- Jack Krupansky
>>>
>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com>
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>> writes per second on 60 nodes, we didn’t come close to hitting even 50%
>>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>>> necessary.
>>>>
>>>>
>>>>
>>>> From: John Wong
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>>
>>>> For production I'd stick with ephemeral disks (aka instance storage) if
>>>> you have running a lot of transaction.
>>>> However, for regular small testing/qa cluster, or something you know
>>>> you want to reload often, EBS is definitely good enough and we haven't had
>>>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>
>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>>>> video, do you actually use PIOPS or just standard GP2 in your production
>>>> cluster?
>>>>
>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>> wrote:
>>>>
>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>> disk performance you need?". If you need the performance, its hard to beat
>>>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>
>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>>> rarely the bottleneck. YMMV, of course.
>>>>>
>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>
>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>>> capable of amazing things, and greatly simplifies life.
>>>>>>
>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very
>>>>>> much a viable option, despite any old documents online that say otherwise.
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Eric Plowe
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: EC2 storage options for C*
>>>>>>
>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>>> the performance we are seeing thus far.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>
>>>>>
>>>>
>>>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
Yes, but getting at why you think EBS is going down is the real point. New GM in 2011. Very different product. 35:40 in the video


-- 
Jeff Jirsa


> On Jan 31, 2016, at 9:57 PM, Eric Plowe <er...@gmail.com> wrote:
> 
> Jeff,
> 
> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome.
> 
>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
>> Free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS. 
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
>>> 
>>> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2 after testing is a viable contender for our workload. The only worry I have is EBS outages, which have happened. 
>>> 
>>>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
>>>> Also in that video - it's long but worth watching
>>>> 
>>>> We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Jeff Jirsa
>>>> 
>>>> 
>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com> wrote:
>>>>> 
>>>>> How about reads? Any differences between read-intensive and write-intensive workloads?
>>>>> 
>>>>> -- Jack Krupansky
>>>>> 
>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com> wrote:
>>>>>> Hi John,
>>>>>> 
>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> From: John Wong
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>> 
>>>>>> For production I'd stick with ephemeral disks (aka instance storage) if you have running a lot of transaction.
>>>>>> However, for regular small testing/qa cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>>>> 
>>>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the video, do you actually use PIOPS or just standard GP2 in your production cluster?
>>>>>> 
>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
>>>>>>> Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, its hard to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>> 
>>>>>>> Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.
>>>>>>> 
>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.
>>>>>>>> 
>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> From: Eric Plowe
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>> 
>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.
>>>>>>>> 
>>>>>>>> Thanks!
>>>>>>>> 
>>>>>>>> Eric

Re: EC2 storage options for C*

Posted by Eric Plowe <er...@gmail.com>.
Jeff,

If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
discounting EBS, but prior outages are worrisome.

On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:

> Free to choose what you'd like, but EBS outages were also addressed in
> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
> same as 2011 EBS.
>
> --
> Jeff Jirsa
>
>
> On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.plowe@gmail.com> wrote:
>
> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2
> after testing is a viable contender for our workload. The only worry I have
> is EBS outages, which have happened.
>
> On Sunday, January 31, 2016, Jeff Jirsa <jeff.jirsa@crowdstrike.com> wrote:
>
>> Also in that video - it's long but worth watching
>>
>> We tested up to 1M reads/second as well, blowing out page cache to ensure
>> we weren't "just" reading from memory
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com>
>> wrote:
>>
>> How about reads? Any differences between read-intensive and
>> write-intensive workloads?
>>
>> -- Jack Krupansky
>>
>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> Hi John,
>>>
>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes
>>> per second on 60 nodes, we didn’t come close to hitting even 50%
>>> utilization (10k is more than enough for most workloads). PIOPS is not
>>> necessary.
>>>
>>>
>>>
>>> From: John Wong
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>> To: "user@cassandra.apache.org"
>>> Subject: Re: EC2 storage options for C*
>>>
>>> For production I'd stick with ephemeral disks (aka instance storage) if
>>> you have running a lot of transaction.
>>> However, for regular small testing/qa cluster, or something you know you
>>> want to reload often, EBS is definitely good enough and we haven't had
>>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>
>>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>>> video, do you actually use PIOPS or just standard GP2 in your production
>>> cluster?
>>>
>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>> wrote:
>>>
>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>> disk performance you need?". If you need the performance, its hard to beat
>>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>
>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>> choice of instance dictated much more by the balance of price, CPU, and
>>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>>> rarely the bottleneck. YMMV, of course.
>>>>
>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>> > wrote:
>>>>
>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>>> capable of amazing things, and greatly simplifies life.
>>>>>
>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much
>>>>> a viable option, despite any old documents online that say otherwise.
>>>>>
>>>>>
>>>>>
>>>>> From: Eric Plowe
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: EC2 storage options for C*
>>>>>
>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>>> the performance we are seeing thus far.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Eric
>>>>>
>>>>
>>>>
>>>
>>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
Feel free to choose what you'd like, but EBS outages were also addressed in that video (second half, the discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS.

-- 
Jeff Jirsa


> On Jan 31, 2016, at 8:27 PM, Eric Plowe <er...@gmail.com> wrote:
> 
> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2 after testing is a viable contender for our workload. The only worry I have is EBS outages, which have happened. 
> 
>> On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:
>> Also in that video - it's long but worth watching
>> 
>> We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory
>> 
>> 
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com> wrote:
>>> 
>>> How about reads? Any differences between read-intensive and write-intensive workloads?
>>> 
>>> -- Jack Krupansky
>>> 
>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com> wrote:
>>>> Hi John,
>>>> 
>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. 
>>>> 
>>>> 
>>>> 
>>>> From: John Wong
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*
>>>> 
>>>> For production I'd stick with ephemeral disks (aka instance storage) if you have running a lot of transaction.
>>>> However, for regular small testing/qa cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>>> 
>>>> But Jeff, kudo that you are able to use EBS. I didn't go through the video, do you actually use PIOPS or just standard GP2 in your production cluster?
>>>> 
>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
>>>>> Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, its hard to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>> 
>>>>> Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.
>>>>> 
>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
>>>>>> If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.
>>>>>> 
>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> From: Eric Plowe
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: EC2 storage options for C*
>>>>>> 
>>>>>> My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> Eric

Re: EC2 storage options for C*

Posted by Eric Plowe <er...@gmail.com>.
Thank you all for the suggestions. I'm torn between GP2 and ephemeral. After
testing, GP2 is a viable contender for our workload. The only worry I have
is EBS outages, which have happened.

On Sunday, January 31, 2016, Jeff Jirsa <je...@crowdstrike.com> wrote:

> Also in that video - it's long but worth watching
>
> We tested up to 1M reads/second as well, blowing out page cache to ensure
> we weren't "just" reading from memory
>
>
>
> --
> Jeff Jirsa
>
>
> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <jack.krupansky@gmail.com> wrote:
>
> How about reads? Any differences between read-intensive and
> write-intensive workloads?
>
> -- Jack Krupansky
>
> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <jeff.jirsa@crowdstrike.com> wrote:
>
>> Hi John,
>>
>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes
>> per second on 60 nodes, we didn’t come close to hitting even 50%
>> utilization (10k is more than enough for most workloads). PIOPS is not
>> necessary.
>>
>>
>>
>> From: John Wong
>> Reply-To: "user@cassandra.apache.org
>> <javascript:_e(%7B%7D,'cvml','user@cassandra.apache.org');>"
>> Date: Saturday, January 30, 2016 at 3:07 PM
>> To: "user@cassandra.apache.org
>> <javascript:_e(%7B%7D,'cvml','user@cassandra.apache.org');>"
>> Subject: Re: EC2 storage options for C*
>>
>> For production I'd stick with ephemeral disks (aka instance storage) if
>> you have running a lot of transaction.
>> However, for regular small testing/qa cluster, or something you know you
>> want to reload often, EBS is definitely good enough and we haven't had
>> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>>
>> But Jeff, kudo that you are able to use EBS. I didn't go through the
>> video, do you actually use PIOPS or just standard GP2 in your production
>> cluster?
>>
>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <bryan@blockcypher.com> wrote:
>>
>>> Yep, that motivated my question "Do you have any idea what kind of disk
>>> performance you need?". If you need the performance, its hard to beat
>>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>
>>> Personally, on small clusters like ours (12 nodes), we've found our
>>> choice of instance dictated much more by the balance of price, CPU, and
>>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>>> rarely the bottleneck. YMMV, of course.
>>>
>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com> wrote:
>>>
>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>>> capable of amazing things, and greatly simplifies life.
>>>>
>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much
>>>> a viable option, despite any old documents online that say otherwise.
>>>>
>>>>
>>>>
>>>> From: Eric Plowe
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: EC2 storage options for C*
>>>>
>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>>> the performance we are seeing thus far.
>>>>
>>>> Thanks!
>>>>
>>>> Eric
>>>>
>>>
>>>
>>
>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
Also in that video - it's long but worth watching

We tested up to 1M reads/second as well, blowing out the page cache to ensure we weren't "just" reading from memory.
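
(If you want to reproduce that, the usual trick on Linux is a sync followed by writing 3 to /proc/sys/vm/drop_caches between runs; a minimal Python sketch, assuming root and the standard /proc interface:)

    # Drop the Linux page cache between benchmark runs (requires root).
    import subprocess

    subprocess.run(["sync"], check=True)              # flush dirty pages to disk first
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")                                # 3 = free pagecache + dentries/inodes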



-- 
Jeff Jirsa


> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <ja...@gmail.com> wrote:
> 
> How about reads? Any differences between read-intensive and write-intensive workloads?
> 
> -- Jack Krupansky
> 
>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com> wrote:
>> Hi John,
>> 
>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. 
>> 
>> 
>> 
>> From: John Wong
>> Reply-To: "user@cassandra.apache.org"
>> Date: Saturday, January 30, 2016 at 3:07 PM
>> To: "user@cassandra.apache.org"
>> Subject: Re: EC2 storage options for C*
>> 
>> For production I'd stick with ephemeral disks (aka instance storage) if you have running a lot of transaction.
>> However, for regular small testing/qa cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99%. The 1% is kind of anomaly where we have flush blocked.
>> 
>> But Jeff, kudo that you are able to use EBS. I didn't go through the video, do you actually use PIOPS or just standard GP2 in your production cluster?
>> 
>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
>>> Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, its hard to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>> 
>>> Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.
>>> 
>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
>>>> If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.
>>>> 
>>>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.
>>>> 
>>>> 
>>>> 
>>>> From: Eric Plowe
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: EC2 storage options for C*
>>>> 
>>>> My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.
>>>> 
>>>> Thanks!
>>>> 
>>>> Eric
> 

Re: EC2 storage options for C*

Posted by Jack Krupansky <ja...@gmail.com>.
How about reads? Any differences between read-intensive and write-intensive
workloads?

-- Jack Krupansky

On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> Hi John,
>
> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes
> per second on 60 nodes, we didn’t come close to hitting even 50%
> utilization (10k is more than enough for most workloads). PIOPS is not
> necessary.
>
>
>
> From: John Wong
> Reply-To: "user@cassandra.apache.org"
> Date: Saturday, January 30, 2016 at 3:07 PM
> To: "user@cassandra.apache.org"
> Subject: Re: EC2 storage options for C*
>
> For production I'd stick with ephemeral disks (aka instance storage) if
> you have running a lot of transaction.
> However, for regular small testing/qa cluster, or something you know you
> want to reload often, EBS is definitely good enough and we haven't had
> issues 99%. The 1% is kind of anomaly where we have flush blocked.
>
> But Jeff, kudo that you are able to use EBS. I didn't go through the
> video, do you actually use PIOPS or just standard GP2 in your production
> cluster?
>
> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
> wrote:
>
>> Yep, that motivated my question "Do you have any idea what kind of disk
>> performance you need?". If you need the performance, its hard to beat
>> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
>> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>
>> Personally, on small clusters like ours (12 nodes), we've found our
>> choice of instance dictated much more by the balance of price, CPU, and
>> memory. We're using GP2 SSD and we find that for our patterns the disk is
>> rarely the bottleneck. YMMV, of course.
>>
>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> If you have to ask that question, I strongly recommend m4 or c4
>>> instances with GP2 EBS.  When you don’t care about replacing a node because
>>> of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is
>>> capable of amazing things, and greatly simplifies life.
>>>
>>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent:
>>> https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable
>>> option, despite any old documents online that say otherwise.
>>>
>>>
>>>
>>> From: Eric Plowe
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Friday, January 29, 2016 at 4:33 PM
>>> To: "user@cassandra.apache.org"
>>> Subject: EC2 storage options for C*
>>>
>>> My company is planning on rolling out a C* cluster in EC2. We are
>>> thinking about going with ephemeral SSDs. The question is this: Should we
>>> put two in RAID 0 or just go with one? We currently run a cluster in our
>>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>>> the performance we are seeing thus far.
>>>
>>> Thanks!
>>>
>>> Eric
>>>
>>
>>
>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
Hi John,

We run on 4 TB GP2 volumes, which guarantee 10k IOPS. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k IOPS is more than enough for most workloads). PIOPS is not necessary.
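
(For reference, GP2 baseline IOPS scale at about 3 IOPS per GiB, floored at 100 and capped at 10,000 per volume as of early 2016; a quick Python sketch of that arithmetic, with the figures taken as assumptions rather than quoted from AWS docs:)

    # Rough GP2 baseline IOPS arithmetic (early-2016 figures, assumed):
    # baseline = 3 IOPS per GiB, with a 100 IOPS floor and a 10,000 IOPS cap.
    def gp2_baseline_iops(size_gib):
        return max(100, min(3 * size_gib, 10000))

    print(gp2_baseline_iops(4096))  # 4 TB volume: 3 * 4096 = 12288, capped to 10000
    print(gp2_baseline_iops(1000))  # 1 TB volume: 3000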



From:  John Wong
Reply-To:  "user@cassandra.apache.org"
Date:  Saturday, January 30, 2016 at 3:07 PM
To:  "user@cassandra.apache.org"
Subject:  Re: EC2 storage options for C*

For production I'd stick with ephemeral disks (aka instance storage) if you have running a lot of transaction. 
However, for regular small testing/qa cluster, or something you know you want to reload often, EBS is definitely good enough and we haven't had issues 99%. The 1% is kind of anomaly where we have flush blocked.

But Jeff, kudo that you are able to use EBS. I didn't go through the video, do you actually use PIOPS or just standard GP2 in your production cluster?

On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, its hard to beat ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.

Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.

On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com> wrote:
If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS.  When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.

We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.



From: Eric Plowe
Reply-To: "user@cassandra.apache.org"
Date: Friday, January 29, 2016 at 4:33 PM
To: "user@cassandra.apache.org"
Subject: EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.

Thanks!

Eric




Re: EC2 storage options for C*

Posted by John Wong <go...@gmail.com>.
For production I'd stick with ephemeral disks (aka instance storage) if you
are running a lot of transactions.
However, for a regular small testing/QA cluster, or something you know you
want to reload often, EBS is definitely good enough; we haven't had issues
99% of the time. The 1% is an anomaly where we had flushes blocked.

But Jeff, kudos that you are able to use EBS. I didn't go through the video;
do you actually use PIOPS or just standard GP2 in your production cluster?

On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:

> Yep, that motivated my question "Do you have any idea what kind of disk
> performance you need?". If you need the performance, its hard to beat
> ephemeral SSD in RAID 0 on EC2, and its a solid, battle tested
> configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>
> Personally, on small clusters like ours (12 nodes), we've found our choice
> of instance dictated much more by the balance of price, CPU, and memory.
> We're using GP2 SSD and we find that for our patterns the disk is rarely
> the bottleneck. YMMV, of course.
>
> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com>
> wrote:
>
>> If you have to ask that question, I strongly recommend m4 or c4 instances
>> with GP2 EBS.  When you don’t care about replacing a node because of an
>> instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of
>> amazing things, and greatly simplifies life.
>>
>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent:
>> https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable
>> option, despite any old documents online that say otherwise.
>>
>>
>>
>> From: Eric Plowe
>> Reply-To: "user@cassandra.apache.org"
>> Date: Friday, January 29, 2016 at 4:33 PM
>> To: "user@cassandra.apache.org"
>> Subject: EC2 storage options for C*
>>
>> My company is planning on rolling out a C* cluster in EC2. We are
>> thinking about going with ephemeral SSDs. The question is this: Should we
>> put two in RAID 0 or just go with one? We currently run a cluster in our
>> data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with
>> the performance we are seeing thus far.
>>
>> Thanks!
>>
>> Eric
>>
>
>

Re: EC2 storage options for C*

Posted by Bryan Cheng <br...@blockcypher.com>.
Yep, that motivated my question "Do you have any idea what kind of disk
performance you need?". If you need the performance, it's hard to beat
ephemeral SSD in RAID 0 on EC2, and it's a solid, battle-tested
configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.

Personally, on small clusters like ours (12 nodes), we've found our choice
of instance dictated much more by the balance of price, CPU, and memory.
We're using GP2 SSD and we find that for our patterns the disk is rarely
the bottleneck. YMMV, of course.
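
If you do go the ephemeral RAID 0 route, the assembly itself is small. A minimal sketch (device names and mount point are assumptions; check lsblk on your instance type first):

    # Stripe two instance-store SSDs into RAID 0, format, and mount for Cassandra.
    # Device names below are hypothetical; verify with lsblk before running.
    import subprocess

    DEVICES = ["/dev/xvdb", "/dev/xvdc"]      # the two ephemeral SSDs (assumed names)
    ARRAY = "/dev/md0"
    MOUNT_POINT = "/var/lib/cassandra"

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    run(["mdadm", "--create", "--verbose", ARRAY, "--level=0",
         "--raid-devices=" + str(len(DEVICES))] + DEVICES)
    run(["mkfs.ext4", ARRAY])
    run(["mkdir", "-p", MOUNT_POINT])
    run(["mount", ARRAY, MOUNT_POINT])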

On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> If you have to ask that question, I strongly recommend m4 or c4 instances
> with GP2 EBS.  When you don’t care about replacing a node because of an
> instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of
> amazing things, and greatly simplifies life.
>
> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent:
> https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable
> option, despite any old documents online that say otherwise.
>
>
>
> From: Eric Plowe
> Reply-To: "user@cassandra.apache.org"
> Date: Friday, January 29, 2016 at 4:33 PM
> To: "user@cassandra.apache.org"
> Subject: EC2 storage options for C*
>
> My company is planning on rolling out a C* cluster in EC2. We are thinking
> about going with ephemeral SSDs. The question is this: Should we put two in
> RAID 0 or just go with one? We currently run a cluster in our data center
> with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the
> performance we are seeing thus far.
>
> Thanks!
>
> Eric
>

Re: EC2 storage options for C*

Posted by Jeff Jirsa <je...@crowdstrike.com>.
If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS. Once you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and it greatly simplifies life.

We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s very much a viable option, despite any old documents online that say otherwise.



From:  Eric Plowe
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, January 29, 2016 at 4:33 PM
To:  "user@cassandra.apache.org"
Subject:  EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance we are seeing thus far.

Thanks!

Eric