Posted to user@cassandra.apache.org by Dikang Gu <di...@gmail.com> on 2016/03/10 07:18:27 UTC

How to measure the write amplification of C*?

Hello there,

I'm wondering whether there is a good way to measure the write amplification
of Cassandra?

I'm thinking it could be calculated as (number of bytes written to the
disk)/(size of mutations written to the node).

Do we already have a metric for "size of mutations written to the node"?
I did not find it in the JMX metrics.

Thanks

-- 
Dikang

Re: How to measure the write amplification of C*?

Posted by Jeff Jirsa <je...@crowdstrike.com>.
A bit of Splunk-fu probably works for this – you’ll have different line entries for memtable flushes vs compaction output. Comparing the two will give you a general idea of compaction amplification. 
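
For reference, here is a minimal sketch of that log-based comparison in
Python. The two regexes are assumptions modeled on 2.x-era system.log lines
("Completed flushing ... (N bytes ...)" and "Compacted N sstables ... X
bytes to Y ..."); the exact wording varies across Cassandra versions, so
adjust them to your own logs:

    import re
    import sys

    # Version-dependent patterns; check them against your own system.log.
    FLUSH_RE = re.compile(r'Completed flushing .*\((\d[\d,]*) bytes')
    COMPACT_RE = re.compile(
        r'Compacted \d+ sstables .*?(\d[\d,]*) bytes to (\d[\d,]*)')

    def to_int(s):
        return int(s.replace(',', ''))

    flushed = 0    # bytes written by memtable flushes (the base write load)
    compacted = 0  # bytes written back out by compactions (the rewrites)

    for line in sys.stdin:
        m = FLUSH_RE.search(line)
        if m:
            flushed += to_int(m.group(1))
            continue
        m = COMPACT_RE.search(line)
        if m:
            compacted += to_int(m.group(2))  # output size of the compaction

    if flushed:
        print('flushed: %d, compacted: %d, compaction amplification: %.2fx'
              % (flushed, compacted, (flushed + compacted) / float(flushed)))

Run it as "python wa_from_logs.py < /var/log/cassandra/system.log".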


On Thu, Mar 10, 2016 at 8:48 AM, Matt Kennedy <mk...@datastax.com> wrote:
TL;DR - Cassandra actually causes a ton of write amplification but it doesn't freaking matter any more. Read on for details...

That slide deck does have a lot of very good information on it, but unfortunately I think it has led to a fundamental misunderstanding about Cassandra and write amplification. In particular, slide 51 vastly oversimplifies the situation.

The wikipedia definition of write amplification looks at this from the perspective of the SSD controller:
https://en.wikipedia.org/wiki/Write_amplification#Calculating_the_value

In short, write amplification = data written to flash/data written by the host

So, if I write 1MB in my application, but the SSD has to write my 1MB, plus rearrange another 1MB of data in order to make room for it, then I've written a total of 2MB and my write amplification is 2x.

In other words, it is measuring how much extra the SSD controller has to write in order to do its own housekeeping.

However, the wikipedia definition is a bit more constrained than how the term is used in the storage industry. The whole point of looking at write amplification is to understand the impact that a particular workload is going to have on the underlying NAND by virtue of the data written. So a definition of write amplification that is a little more relevant to the context of Cassandra is to consider this:

write amplification = data written to flash/data written to the database

So, while the fact that we only sequentially write large immutable SSTables does in fact mean that controller-level write amplification is near zero, compaction comes along and completely destroys that tidy little story. Think about it: every time a compaction re-writes data that has already been written, we are creating a lot of application-level write amplification. Different compaction strategies and the workload itself determine the real application-level write amp, but generally speaking, LCS is the worst, followed by STCS, and DTCS causes the least write-amp.

To measure this, you can usually use smartctl (there may be another mechanism depending on the SSD manufacturer) to get the physical bytes written to your SSDs and divide that by the data that you've actually logically written to Cassandra. I've measured (more than two years ago) LCS write amp as high as 50x on some workloads, which is significantly higher than the typical controller-level write amp on a b-tree style update-in-place data store. Also note that the new storage engine reduces a lot of inefficiency in Cassandra's storage layer, thereby reducing the impact of write amp due to compactions.
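
As a concrete sketch of the smartctl side of that division: many, but not
all, SSDs expose SMART attribute 241 (Total_LBAs_Written), and the unit of
its raw value is vendor-specific (512-byte sectors is common; some drives
use much larger units, so check the data sheet). The logical byte count in
the denominator has to come from your own accounting:

    import subprocess

    def physical_bytes_written(device, unit_bytes=512):
        # Best-effort read of SMART attribute 241; RAW_VALUE is the last
        # column of smartctl -A output. The unit is an assumption - verify
        # it for your drive model.
        out = subprocess.check_output(['smartctl', '-A', device]).decode()
        for line in out.splitlines():
            if 'Total_LBAs_Written' in line:
                return int(line.split()[-1]) * unit_bytes
        return None  # drive does not report this attribute

    flash_bytes = physical_bytes_written('/dev/sda')
    logical_bytes = 2 * 1024 ** 4  # hypothetical: 2 TiB written to Cassandra
    if flash_bytes:
        print('write amplification: %.1fx'
              % (flash_bytes / float(logical_bytes)))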

However, if you're a person who understands SSDs, at this point you're wondering why we aren't burning out SSDs right and left. The reality is that general SSD endurance has gotten so good that all this write amp isn't really a problem any more. If you're curious to read more about that, I recommend you start here:

http://hothardware.com/news/google-data-center-ssd-research-report-offers-surprising-results-slc-not-more-reliable-than-mlc-flash

and the paper that article mentions:
http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/23105-fast16-papers-schroeder.pdf


Hope this helps.


Matt Kennedy



On Thu, Mar 10, 2016 at 7:05 AM, Paulo Motta <pa...@gmail.com> wrote:
This is a good source on Cassandra + write amplification: http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives

2016-03-10 9:57 GMT-03:00 Benjamin Lerer <be...@datastax.com>:
Cassandra should not cause any write amplification. Write amplification
happens only when you update data in place on SSDs. Cassandra does not
update any data in place. Data can be rewritten during compaction, but it
is never updated.

Benjamin

On Thu, Mar 10, 2016 at 12:42 PM, Alain RODRIGUEZ <ar...@gmail.com>
wrote:

> Hi Dikang,
>
> I am not sure about what you call "amplification", but as sizes highly
> depend on the structure, I think I would give it a try using CCM (
> https://github.com/pcmanus/ccm) or some test cluster with
> 'production-like' settings and schema. You can write a row, flush it and
> see how big the data is cluster-wide / per node.
>
> Hope this will be of some help.
>
> C*heers,
> -----------------------
> Alain Rodriguez - alain@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>



Re: How to measure the write amplification of C*?

Posted by Sebastian Estevez <se...@datastax.com>.
https://issues.apache.org/jira/browse/CASSANDRA-10805

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com


Re: How to measure the write amplification of C*?

Posted by Jeff Ferland <jb...@tubularlabs.com>.
Compaction logs show the number of bytes written and the level written to.
Base write load = bytes flushed to L0 for the table.
Write amplification = (base write load + bytes written by all compactions
of the table) / base write load.
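
A back-of-the-envelope version of that accounting, with hypothetical
numbers standing in for the per-table totals you would extract from the
compaction logs:

    # Hypothetical per-table totals pulled from the compaction logs.
    flushed_to_l0 = 120 * 1024 ** 3       # 120 GiB flushed from memtables
    compaction_output = 1080 * 1024 ** 3  # 1080 GiB written by compactions

    total_disk_writes = flushed_to_l0 + compaction_output
    write_amp = total_disk_writes / float(flushed_to_l0)
    print('write amplification: %.1fx' % write_amp)  # -> 10.0x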


Re: How to measure the write amplification of C*?

Posted by Dikang Gu <di...@gmail.com>.
As a follow-up, I'm going to write a simple patch to expose the number of
bytes flushed from memtables to JMX, so that we can easily monitor it.

Here is the jira: https://issues.apache.org/jira/browse/CASSANDRA-11420
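
Once a metric like that is exposed, it can be read over JMX. As a sketch,
the snippet below (Python 3) assumes a Jolokia agent (an HTTP-to-JMX
bridge, https://jolokia.org) attached to the Cassandra JVM on port 8778;
the MBean/metric name is inferred from the JIRA discussion, so verify both
against your version with a JMX browser first, and the keyspace and table
names are placeholders:

    import json
    import urllib.request

    # Assumed MBean name for the per-table flushed-bytes counter.
    MBEAN = ('org.apache.cassandra.metrics:type=Table,'
             'keyspace=my_ks,scope=my_table,name=BytesFlushed')

    url = 'http://localhost:8778/jolokia/read/' + MBEAN
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)

    print('bytes flushed from memtables:', payload['value']['Count'])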



-- 
Dikang


Re: How to measure the write amplification of C*?

Posted by Jack Krupansky <ja...@gmail.com>.
The doc does say this:

"A log-structured engine that avoids overwrites and uses sequential IO to
update data is essential for writing to solid-state disks (SSD) and hard
disks (HDD) On HDD, writing randomly involves a higher number of seek
operations than sequential writing. The seek penalty incurred can be
substantial. Using sequential IO (thereby avoiding write amplification
<http://en.wikipedia.org/wiki/Write_amplification> and disk failure),
Cassandra accommodates inexpensive, consumer SSDs extremely well."

I presume that write amplification argues for placing the commit log on a
separate SSD device. That should probably be mentioned.

-- Jack Krupansky


Re: How to measure the write amplification of C*?

Posted by Matt Kennedy <ma...@datastax.com>.
It isn't really the data written by the host that you're concerned with,
it's the data written by your application. I'd start by instrumenting your
application tier to tally up the size of the values that it writes to C*.

However, it may not be extremely useful to have this value. You can't do
much with the information it provides. It is probably a better idea to
track the bytes written to flash for each drive so that you know the
physical endurance of that type of drive given your workload. Unfortunately
the TBW endurance rated for the drive may not be extremely useful given the
difference between the synthetic workload used to create those ratings and
the workload that Cassandra is producing for your particular case. You can
find out more about those here:
https://www.jedec.org/standards-documents/docs/jesd219a
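
If you do want to instrument the application tier as suggested above, a
rough sketch with the DataStax Python driver might look like this. Counting
the encoded size of the bound parameters is a crude approximation of the
logical bytes written (it ignores serialization details and per-cell
overhead), but it is consistent enough to track a trend; the keyspace,
table, and statement are placeholders:

    from cassandra.cluster import Cluster  # DataStax Python driver

    class WriteTally(object):
        """Wraps a session and tallies an estimate of bytes written."""

        def __init__(self, session):
            self.session = session
            self.bytes_written = 0

        def execute(self, stmt, params):
            self.bytes_written += sum(len(repr(p).encode()) for p in params)
            return self.session.execute(stmt, params)

    session = Cluster(['127.0.0.1']).connect('my_ks')
    tally = WriteTally(session)
    tally.execute('INSERT INTO t (id, val) VALUES (%s, %s)',
                  (42, 'payload'))
    print('logical bytes written so far:', tally.bytes_written)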


Matt Kennedy

Sr. Product Manager, DSE Core

matt.kennedy@datastax.com | Public Calendar <http://goo.gl/4Ui04Z>



Re: How to measure the write amplification of C*?

Posted by Dikang Gu <di...@gmail.com>.
Hi Matt,

Thanks for the detailed explanation! Yes, this is exactly what I'm looking
for, "write amplification = data written to flash/data written by the host".

We are heavily using LCS in production, so I'd like to figure out the
amplification caused by it and see what we can do to optimize it. I have
the metrics for "data written to flash", and I'm wondering whether there is
an easy way to get the "data written by the host" on each C* node?

Thanks



-- 
Dikang

Re: How to measure the write amplification of C*?

Posted by Matt Kennedy <mk...@datastax.com>.
After posting this, Jon Haddad pinged me on chat and said (I'm
paraphrasing):

Actually, this company I work with a lot burns through SSDs so fast it's
absurd; their write amp is gigantic.

This is a very good point; however, it isn't what I would call typical, and
a lot is going to depend on the drive manufacturer and workload. But in
general, this isn't an epidemic, which is what I was trying to emphasize.

Keep spares around; all drives fail, whether due to wear-out or some other
factor. If your cost of NAND/GB/time is too high, consider moving to a
higher-endurance drive to replace your next round of failed units.

Matt Kennedy

Partner Architect | +1.703.582.5017 | matt.kennedy@datastax.com


Re: How to measure the write amplification of C*?

Posted by Matt Kennedy <mk...@datastax.com>.
TL;DR - Cassandra actually causes a ton of write amplification but it
doesn't freaking matter any more. Read on for details...

That slide deck does have a lot of very good information on it, but
unfortunately I think it has led to a fundamental misunderstanding about
Cassandra and write amplification. In particular, slide 51 vastly
oversimplifies the situation.

The Wikipedia definition of write amplification looks at this from the
perspective of the SSD controller:
https://en.wikipedia.org/wiki/Write_amplification#Calculating_the_value

In short, write amplification = data written to flash / data written by
the host

So, if I write 1MB in my application, but the SSD has to write my 1MB, plus
rearrange another 1MB of data in order to make room for it, then I've
written a total of 2MB and my write amplification is 2x.

In other words, it is measuring how much extra the SSD controller has to
write in order to do its own housekeeping.

However, the Wikipedia definition is a bit more constrained than how the
term is used in the storage industry. The whole point of looking at write
amplification is to understand the impact that a particular workload is
going to have on the underlying NAND by virtue of the data written. So a
definition of write amplification that is a little more relevant to the
context of Cassandra is to consider this:

write amplification = data written to flash / data written to the database

So, while the fact that we only sequentially write large immutable SSTables
does in fact mean that controller-level write amplification is near zero,
compaction comes along and completely destroys that tidy little story.
Think about it: every time a compaction re-writes data that has already
been written, we create a lot of application-level write amplification.
Different compaction strategies and the workload itself determine what the
real application-level write amp is, but generally speaking, LCS is the
worst, STCS sits in the middle, and DTCS causes the least write-amp. To
measure this, you can usually use smartctl (the exact mechanism depends on
the SSD manufacturer) to get the physical bytes written to your SSDs and
divide that by the data that you've actually logically written to
Cassandra; see the sketch below. I've measured (more than two years ago)
LCS write amp as high as 50x on some workloads, which is significantly
higher than the typical controller-level write amp on a b-tree style
update-in-place data store. Also note that the new storage engine reduces
a lot of inefficiency in Cassandra's write path, thereby reducing the
impact of write amp due to compactions.
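
For example, here is a minimal sketch of that division in Python. It
assumes a drive that exposes SMART attribute 241 (Total_LBAs_Written) in
512-byte units; both the attribute and its unit vary by manufacturer, and
the logical-bytes figure is a placeholder you would supply from your own
client-side accounting:

    import re
    import subprocess

    DEVICE = "/dev/sda"   # illustrative; point this at your data drive
    SECTOR_BYTES = 512    # unit of attribute 241 on many, not all, drives

    # Cumulative bytes the host has written to the drive (this includes
    # flushes, compaction re-writes, and the commitlog). Attribute 241 is
    # common but vendor-specific; check 'smartctl -A' for your drive.
    output = subprocess.check_output(["smartctl", "-A", DEVICE], text=True)
    match = re.search(r"Total_LBAs_Written.*?(\d+)\s*$", output, re.MULTILINE)
    bytes_to_disk = int(match.group(1)) * SECTOR_BYTES

    # Bytes logically written to Cassandra: track this on the application
    # side; hard-coded here as a placeholder.
    bytes_to_db = 2 * 1024 ** 4   # e.g. 2 TiB written by the application

    print("write amplification: %.1fx" % (bytes_to_disk / bytes_to_db))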

However, if you're a person who understands SSDs, at this point you're
wondering why we aren't burning out SSDs left and right. The reality is
that general SSD endurance has gotten so good that all this write amp
isn't really a problem any more. If you're curious to read more about
that, I recommend you start here:

http://hothardware.com/news/google-data-center-ssd-research-report-offers-surprising-results-slc-not-more-reliable-than-mlc-flash

and the paper that article mentions:
http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/23105-fast16-papers-schroeder.pdf


Hope this helps.


Matt Kennedy




Re: How to measure the write amplification of C*?

Posted by Paulo Motta <pa...@gmail.com>.
This is a good source on Cassandra + write amplification:
http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives


Re: How to measure the write amplification of C*?

Posted by Benjamin Lerer <be...@datastax.com>.
Cassandra should not cause any write amplification. Write amplification
happens only when you update data on SSDs. Cassandra does not update any
data in place. Data can be rewritten during compaction, but it is never
updated.

Benjamin


Re: How to measure the write amplification of C*?

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Dikang,

I am not sure about what you call "amplification", but as sizes highly
depend on the structure, I think I would give it a try using CCM
(https://github.com/pcmanus/ccm) or some test cluster with
'production-like' settings and schema. You can write a row, flush it, and
see how big the data is cluster-wide / per node; see the sketch below.
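
For instance, a minimal sketch of the flush-and-measure step on one node,
in Python; the keyspace, table, and data directory are illustrative and
assume a default single-data-directory layout:

    import subprocess
    from pathlib import Path

    KEYSPACE, TABLE = "myks", "mytable"         # illustrative names
    DATA_DIR = Path("/var/lib/cassandra/data")  # default; adjust to your install

    # Flush the memtable so the on-disk SSTables reflect what was written.
    subprocess.run(["nodetool", "flush", KEYSPACE, TABLE], check=True)

    # Table directories are named <table>-<id>; sum every file under them.
    on_disk = sum(
        f.stat().st_size
        for d in (DATA_DIR / KEYSPACE).glob(TABLE + "-*")
        for f in d.rglob("*")
        if f.is_file()
    )
    print("bytes on disk for %s.%s: %d" % (KEYSPACE, TABLE, on_disk))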

Hope this will be of some help.

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

