
Posted to user@cassandra.apache.org by Keith Freeman <8f...@gmail.com> on 2013/09/10 16:55:56 UTC

heavy insert load overloads CPUs, with MutationStage pending

On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog 
and data, I get high CPU loads during a heavy-ish wide-row insert load 
into a single CF (5000 1k inserts/sec), e.g. uptime load avg for last 
minute 18/11/10.  Checking tpstats, I see MutationStage pending on all 
the nodes, e.g.:

> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         0         0            144         0                 0
> RequestResponseStage              0         0         243529         0                 0
> MutationStage                     1         9         290394         0                 0
> ReadRepairStage                   0         0              0         0                 0
> ReplicateOnWriteStage             0         0              0         0                 0
> GossipStage                       0         0           1014         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> MigrationStage                    0         0             13         0                 0
> MemtablePostFlusher               1         2             35         0                 0
> FlushWriter                       1         2             20         0                 0
> MiscStage                         0         0              1         0                 0
> commitlog_archiver                0         0              0         0                 0

I can't seem to find information about the real meaning of MutationStage;
is this just normal for lots of inserts?

Also, switching from spinning disks to SSDs didn't seem to significantly
improve insert performance, so it seems clear my use case is totally
CPU-bound.  Cassandra docs say "Insert-heavy workloads are CPU-bound in
Cassandra before becoming memory-bound.", so I guess that's what I'm
seeing, but there's no explanation. So I'm wondering what's overloading my
CPUs, and is there anything I can do about it short of adding more nodes?
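
A rough sketch of the arithmetic for this workload, assuming writes spread evenly over the 3 nodes and the RF=2 mentioned later in the thread. The class and method names are illustrative only, and the estimate ignores per-column, commit log, and compaction overhead.

```java
// Back-of-the-envelope write load per node. All names here are hypothetical.
final class WriteLoadEstimate {

    /** Replica writes each node applies per second, assuming an even token distribution. */
    static double replicaWritesPerNode(int clientWritesPerSec, int replicationFactor, int nodes) {
        return (double) clientWritesPerSec * replicationFactor / nodes;
    }

    /** Payload bytes per second landing on each node (storage overhead not included). */
    static double payloadBytesPerNode(int clientWritesPerSec, int bytesPerWrite,
                                      int replicationFactor, int nodes) {
        return replicaWritesPerNode(clientWritesPerSec, replicationFactor, nodes) * bytesPerWrite;
    }

    public static void main(String[] args) {
        // Figures from the thread: 5000 inserts/sec of ~1 KB each, RF=2, 3 nodes.
        System.out.printf("replica writes/node/sec: %.0f%n",
                replicaWritesPerNode(5000, 2, 3));
        System.out.printf("payload MB/node/sec: %.2f%n",
                payloadBytesPerNode(5000, 1024, 2, 3) / (1024.0 * 1024.0));
    }
}
```

Under those assumptions each node applies roughly 3,300 replica writes and a little over 3 MB of payload per second, which is small next to SSD bandwidth and consistent with the bottleneck not being disk.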


Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

I have the defaults as shown in your response.

On 09/10/2013 01:59 PM, sankalp kohli wrote:
> What have you set these to?
> # commitlog_sync may be either "periodic" or "batch."
> # When in batch mode, Cassandra won't ack writes until the commit log
> # has been fsynced to disk.  It will wait up to
> # commitlog_sync_batch_window_in_ms milliseconds for other writes, before
> # performing the sync.
> #
> # commitlog_sync: batch
> # commitlog_sync_batch_window_in_ms: 50
> #
> # the other option is "periodic" where writes may be acked immediately
> # and the CommitLog is simply synced every commitlog_sync_period_in_ms
> # milliseconds.
> commitlog_sync: periodic
> commitlog_sync_period_in_ms: 1000
>
>
> On Tue, Sep 10, 2013 at 10:42 AM, Nate McCall <nate@thelastpickle.com> wrote:
>
>     With SSDs, you can turn up memtable_flush_writers - try 3
>     initially (1 by default) and see what happens. However, given that
>     there are no entries in 'All time blocked' for such, they may be
>     something else.
>
>     How are you inserting the data?
>
>
>     On Tue, Sep 10, 2013 at 12:40 PM, Keith Freeman <8forty@gmail.com> wrote:
>
>
>         On 09/10/2013 11:17 AM, Robert Coli wrote:
>>         On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman
>>         <8forty@gmail.com> wrote:
>>
>>             On my 3-node cluster (v1.2.8) with 4-cores each and SSDs
>>             for commitlog and data
>>
>>
>>         On SSD, you don't need to separate commitlog and data. You
>>         only win from this separation if you have a head to not-move
>>         between appends to the commit log. You will get better IO
>>         from a strip with an additional SSD.
>         Right, actually both partitions are on the same SSD.
>         Assuming you meant "stripe", would that really make a difference?
>
>>                 Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>                 MutationStage                     1         9         290394         0                 0
>>                 FlushWriter                       1         2             20         0                 0
>>
>>             I can't seem find information about the real meaning of
>>             MutationStage, is this just normal for lots of inserts?
>>
>>
>>         The mutation stage is the stage in which mutations to rows in
>>         memtables ("writes") occur.
>>
>>         The FlushWriter stage is the stage that turns memtables into
>>         SSTables by flushing them.
>>
>>         However, 9 pending mutations is a very small number. For
>>         reference on an overloaded cluster which was being written to
>>         death I recently saw.... 1216434 pending MutationStage. What
>>         problem other than "high CPU load" are you experiencing? 2
>>         Pending FlushWriters is slightly suggestive of some sort of
>>         bound related to flushing..
>         So the basic problem is that write performance is lower than I
>         expected.  I can't get sustained writing of 5000 ~1024-byte
>         records / sec at RF=2 on a good 3-node cluster, and my only
>         guess is that's because of the heavy CPU loads on the server
>         (loads over 10 on 4-CPU systems).  I've tried both a single
>         client writing 5000 rows/second and 2 clients (on separate
>         boxes) writing 2500 rows/second, and in both cases the
>         server(s) doesn't respond quickly enough to maintain that
>         rate.  It keeps up ok with 2000 or 3000 rows per second (and
>         has lower server loads).
>
>
>
>

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by sankalp kohli <ko...@gmail.com>.

What have you set these to?
# commitlog_sync may be either "periodic" or "batch."
# When in batch mode, Cassandra won't ack writes until the commit log
# has been fsynced to disk.  It will wait up to
# commitlog_sync_batch_window_in_ms milliseconds for other writes, before
# performing the sync.
#
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 50
#
# the other option is "periodic" where writes may be acked immediately
# and the CommitLog is simply synced every commitlog_sync_period_in_ms
# milliseconds.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 1000


On Tue, Sep 10, 2013 at 10:42 AM, Nate McCall <na...@thelastpickle.com> wrote:

> With SSDs, you can turn up memtable_flush_writers - try 3 initially (1 by
> default) and see what happens. However, given that there are no entries in
> 'All time blocked' for such, they may be something else.
>
> How are you inserting the data?
>
>
> On Tue, Sep 10, 2013 at 12:40 PM, Keith Freeman <8f...@gmail.com> wrote:
>
>>
>> On 09/10/2013 11:17 AM, Robert Coli wrote:
>>
>> On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8f...@gmail.com> wrote:
>>
>>> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog
>>> and data
>>
>>
>>  On SSD, you don't need to separate commitlog and data. You only win
>> from this separation if you have a head to not-move between appends to the
>> commit log. You will get better IO from a strip with an additional SSD.
>>
>> Right, actually both partitions are on the same SSD.   Assuming you meant
>> "stripe", would that really make a difference?
>>
>>
>>
>>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>> MutationStage                     1         9         290394         0                 0
>>> FlushWriter                       1         2             20         0                 0
>>>>
>>>
>>
>>>  I can't seem find information about the real meaning of MutationStage,
>>> is this just normal for lots of inserts?
>>>
>>
>>  The mutation stage is the stage in which mutations to rows in memtables
>> ("writes") occur.
>>
>>  The FlushWriter stage is the stage that turns memtables into SSTables
>> by flushing them.
>>
>>  However, 9 pending mutations is a very small number. For reference on
>> an overloaded cluster which was being written to death I recently saw....
>> 1216434 pending MutationStage. What problem other than "high CPU load" are
>> you experiencing? 2 Pending FlushWriters is slightly suggestive of some
>> sort of bound related to flushing..
>>
>> So the basic problem is that write performance is lower than I expected.
>> I can't get sustained writing of 5000 ~1024-byte records / sec at RF=2 on a
>> good 3-node cluster, and my only guess is that's because of the heavy CPU
>> loads on the server (loads over 10 on 4-CPU systems).  I've tried both a
>> single client writing 5000 rows/second and 2 clients (on separate boxes)
>> writing 2500 rows/second, and in both cases the server(s) doesn't respond
>> quickly enough to maintain that rate.  It keeps up ok with 2000 or 3000
>> rows per second (and has lower server loads).
>>
>>
>>
>

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Nate McCall <na...@thelastpickle.com>.

tl;dr: It seems the datastax client, though otherwise well written and
performant, is, in its current form for 1.2.x and below, a non-starter for
folks requiring high performance inserts.

Corroborating others' findings on this thread, and several posts over the past
couple of weeks, I just ran a series of tests for a client and Astyanax
outperformed the DS driver by a factor of 3 to 1 in a single-threaded (for
simplicity's sake and to reduce potential variables) load of time series data.

Paul's example is pretty much the same approach I took to use the existing
CQL3 table definition from Thrift. Brian O'Neill has a pair of good blog
posts on this topic for more detail:
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html
http://brianoneill.blogspot.com/2012/10/cql-astyanax-and-compoundcomposite-keys.html

Per Keith's findings with compatibility, see:
https://github.com/Netflix/astyanax/issues/391


On Thu, Sep 12, 2013 at 3:26 PM, Paul Cichonski <pa...@lithium.com> wrote:

> I'm running Cassandra 1.2.6 without compact storage on my tables. The
> trick is making your Astyanax (I'm running 1.56.42) mutation work with the
> CQL table definition (this is definitely a bit of a hack since most of the
> advice says don't mix the CQL and Thrift APIs so it is your call on how far
> you want to go). If you want to still try and test it out you need to
> leverage the Astyanax CompositeColumn construct to make it work (
> https://github.com/Netflix/astyanax/wiki/Composite-columns)
>
> I've provided a slightly modified version of what I am doing below:
>
> CQL table def:
>
> CREATE TABLE standard_subscription_index
> (
>         subscription_type text,
>         subscription_target_id text,
>         entitytype text,
>         entityid int,
>         creationtimestamp timestamp,
>         indexed_tenant_id uuid,
>         deleted boolean,
>     PRIMARY KEY ((subscription_type, subscription_target_id), entitytype,
> entityid)
> )
>
> ColumnFamily definition:
>
> private static final ColumnFamily<SubscriptionIndexCompositeKey,
> SubscribingEntityCompositeColumn> COMPOSITE_ROW_COLUMN = new
> ColumnFamily<SubscriptionIndexCompositeKey,
> SubscribingEntityCompositeColumn>(
>         SUBSCRIPTION_CF_NAME, new
> AnnotatedCompositeSerializer<SubscriptionIndexCompositeKey>(SubscriptionIndexCompositeKey.class),
>         new
> AnnotatedCompositeSerializer<SubscribingEntityCompositeColumn>(SubscribingEntityCompositeColumn.class));
>
>
> SubscriptionIndexCompositeKey is a class that contains the fields from the
> row key (e.g., subscription_type, subscription_target_id), and
> SubscribingEntityCompositeColumn contains the fields from the composite
> column (as it would look if you view your data using Cassandra-cli), so:
> entityType, entityId, columnName. The columnName field is the tricky part
> as it defines what to interpret the column value as (i.e., if it is a value
> for the creationtimestamp the column might be
> "someEntityType:4:creationtimestamp"
>
> The actual mutation looks something like this:
>
> final MutationBatch mutation = getKeyspace().prepareMutationBatch();
> final ColumnListMutation<SubscribingEntityCompositeColumn> row =
> mutation.withRow(COMPOSITE_ROW_COLUMN,
>                 new
> SubscriptionIndexCompositeKey(targetEntityType.getName(), targetEntityId));
>
> for (Subscription sub : subs) {
>         row.putColumn(new
> SubscribingEntityCompositeColumn(sub.getEntityType().getName(),
> sub.getEntityId(),
>                                 "creationtimestamp"),
> sub.getCreationTimestamp());
>         row.putColumn(new
> SubscribingEntityCompositeColumn(sub.getEntityType().getName(),
> sub.getEntityId(),
>                                 "deleted"), sub.isDeleted());
>         row.putColumn(new
> SubscribingEntityCompositeColumn(sub.getEntityType().getName(),
> sub.getEntityId(),
>                                 "indexed_tenant_id"), tenantId);
> }
>
> Hope that helps,
> Paul
>
>
> From: Keith Freeman [mailto:8forty@gmail.com]
> Sent: Thursday, September 12, 2013 12:10 PM
> To: user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
>
> Ok, your results are pretty impressive, I'm giving it a try.  I've made
> some initial attempts to use Astyanax 1.56.37, but have some troubles:
>
>   - it's not compatible with 1.2.8 client-side (NoSuchMethodErrors on
> org.apache.cassandra.thrift.TBinaryProtocol, which changed its signature
> since 1.2.5)
>   - even switching to C* 1.2.5 servers, it's been difficult getting simple
> examples to work unless I use CF's that have "WITH COMPACT STORAGE"
>
> How did you handle these problems?  How much effort did it take you to
> switch from datastax to astyanax?
>
> I feel like I'm getting lost in a pretty deep rabbit-hole here.
>
> On 09/11/2013 03:03 PM, Paul Cichonski wrote:
> I was reluctant to use the thrift as well, and I spent about a week trying
> to get the CQL inserts to work by partitioning the INSERTS in different
> ways and tuning the cluster.
>
> However, nothing worked remotely as well as the batch_mutate when it came
> to writing a full wide-row at once. I think Cassandra 2.0 makes CQL work
> better for these cases (CASSANDRA-4693), but I haven't tested it yet.
>
> -Paul
>
> -----Original Message-----
> From: Keith Freeman [mailto:8forty@gmail.com]
> Sent: Wednesday, September 11, 2013 1:06 PM
> To: user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
>
> Thanks, I had seen your stackoverflow post.  I've got hundreds of
> (wide-) rows, and the writes are pretty well distributed across them.
> I'm very reluctant to drop back to the thrift interface.
>
> On 09/11/2013 10:46 AM, Paul Cichonski wrote:
> How much of the data you are writing is going against the same row key?
>
> I've experienced some issues using CQL to write a full wide-row at once
> (across multiple threads) that exhibited some of the symptoms you have
> described (i.e., high cpu, dropped mutations).
>
> This question goes into it a bit more:
> http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque
> I was able to solve my issue by switching to using the thrift batch_mutate
> to write a full wide-row at once instead of using many CQL INSERT statements.
>
> -Paul
>
> -----Original Message-----
> From: Keith Freeman [mailto:8forty@gmail.com]
> Sent: Wednesday, September 11, 2013 9:16 AM
> To:user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage
> pending
>
>
> On 09/10/2013 11:42 AM, Nate McCall wrote:
>> With SSDs, you can turn up memtable_flush_writers - try 3 initially
>> (1 by default) and see what happens. However, given that there are
>> no entries in 'All time blocked' for such, they may be something else.
> Tried that, it seems to have reduced the loads a little after
> everything warmed-up, but not much.
>> How are you inserting the data?
> A java client on a separate box using the datastax java driver, 48
> threads writing 100 records each iteration as prepared batch statements.
>
> At 5000 records/sec, the servers just can't keep up, so the client backs
> up. That's only 5 MB of data/sec, which doesn't seem like much.  As I
> mentioned, switching to SSDs didn't help much, so I'm assuming at
> this point that the server overloads are what's holding up the client.
>
>
>

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Nate McCall <na...@thelastpickle.com>.

Also, I was working on this a bit for a client so compiled my notes and
approach into a blog post for posterity (and so it's easier to find for
others):
http://thelastpickle.com/blog/2013/09/13/CQL3-to-Astyanax-Compatibility.html

Paul's method on this thread is cited at the bottom as well.


On Fri, Sep 13, 2013 at 11:16 AM, Nate McCall <na...@thelastpickle.com> wrote:

> https://github.com/Netflix/astyanax/issues/391
>
> I've gotten in touch with a couple of netflix folks and they are going to
> try to roll a release shortly.
>
> You should be able to build against 1.2.2 and 'talking' to 1.2.9 instance
> should work. Just a PITA development wise to maintain a different
> version(s).
>
>
> On Fri, Sep 13, 2013 at 10:52 AM, Keith Freeman <8f...@gmail.com> wrote:
>
>> Paul-  Sorry to go off-list but I'm diving pretty far into details here.
>>  Ignore if you wish.
>>
>> Thanks a lot for the example, definitely very helpful.  I'm surprised
>> that the Cassandra experts aren't more interested-in/alarmed-by our
>> results, it seems like we've proved that insert performance for wide rows
>> in CQL is enormously worse than it was before CQL.  And I have a feeling
>> 2.0 won't help much -- I'm already using entirely-prepared batches.
>>
>> To reproduce your example, I switched to cassandra 1.2.6  and astyanax
>> 1.56.42.  But anything I try to do with that version combination gives me
>> an exception on the client side (e.g. execute() on a query):
>>
>>> 13-09-13 15:42:42.511 [pool-6-thread-1] ERROR
>>> c.n.a.t.ThriftSyncConnectionFactoryImpl - Error creating connection
>>> java.lang.NoSuchMethodError: org.apache.cassandra.thrift.TBinaryProtocol:
>>> method <init>(Lorg/apache/thrift/transport/TTransport;)V not found
>>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.open(ThriftSyncConnectionFactoryImpl.java:195)
>>> ~[astyanax-thrift-1.56.37.jar:na]
>>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection$1.run(ThriftSyncConnectionFactoryImpl.java:232)
>>> [astyanax-thrift-1.56.37.jar:na]
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>> [na:1.7.0_07]
>>> [na:1.7.0_07]
>>>
>> From my googling this is due to a cassandra API change in
>> TBinaryProtocol, which is why I had to use cassandra 1.2.5 jars to get my
>> astyanax client to work at all in my earlier experiments. Did you encounter
>> this?  Also, you had 1.2.8 in the stackoverflow post, but 1.2.6 in this
>> email, did you have to rollback?
>>
>> Thanks for any help you can offer, hope I can return the favor at some
>> point.
>>
>>
>>
>> On 09/12/2013 02:26 PM, Paul Cichonski wrote:
>>
>>> I'm running Cassandra 1.2.6 without compact storage on my tables. The
>>> trick is making your Astyanax (I'm running 1.56.42) mutation work with the
>>> CQL table definition (this is definitely a bit of a hack since most of the
>>> advice says don't mix the CQL and Thrift APIs so it is your call on how far
>>> you want to go). If you want to still try and test it out you need to
>>> leverage the Astyanax CompositeColumn construct to make it work (
>>> https://github.com/Netflix/astyanax/wiki/Composite-columns)
>>>
>>> I've provided a slightly modified version of what I am doing below:
>>>
>>> CQL table def:
>>>
>>> CREATE TABLE standard_subscription_index
>>> (
>>>         subscription_type text,
>>>         subscription_target_id text,
>>>         entitytype text,
>>>         entityid int,
>>>         creationtimestamp timestamp,
>>>         indexed_tenant_id uuid,
>>>         deleted boolean,
>>>      PRIMARY KEY ((subscription_type, subscription_target_id),
>>> entitytype, entityid)
>>> )
>>>
>>> ColumnFamily definition:
>>>
>>> private static final ColumnFamily<SubscriptionIndexCompositeKey,
>>> SubscribingEntityCompositeColumn> COMPOSITE_ROW_COLUMN = new
>>> ColumnFamily<SubscriptionIndexCompositeKey,
>>> SubscribingEntityCompositeColumn>(
>>>         SUBSCRIPTION_CF_NAME, new
>>> AnnotatedCompositeSerializer<SubscriptionIndexCompositeKey>(SubscriptionIndexCompositeKey.class),
>>>         new
>>> AnnotatedCompositeSerializer<SubscribingEntityCompositeColumn>(SubscribingEntityCompositeColumn.class));
>>>
>>>
>>> SubscriptionIndexCompositeKey is a class that contains the fields from
>>> the row key (e.g., subscription_type, subscription_target_id), and
>>> SubscribingEntityCompositeColumn contains the fields from the
>>> composite column (as it would look if you view your data using
>>> Cassandra-cli), so: entityType, entityId, columnName. The columnName field
>>> is the tricky part as it defines what to interpret the column value as
>>> (i.e., if it is a value for the creationtimestamp the column might be
>>> "someEntityType:4:creationtimestamp"
>>>
>>> The actual mutation looks something like this:
>>>
>>> final MutationBatch mutation = getKeyspace().prepareMutationBatch();
>>> final ColumnListMutation<SubscribingEntityCompositeColumn> row =
>>> mutation.withRow(COMPOSITE_ROW_COLUMN,
>>>                 new SubscriptionIndexCompositeKey(targetEntityType.getName(),
>>> targetEntityId));
>>>
>>> for (Subscription sub : subs) {
>>>         row.putColumn(new
>>> SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>>                                 "creationtimestamp"),
>>> sub.getCreationTimestamp());
>>>         row.putColumn(new
>>> SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>>                                 "deleted"), sub.isDeleted());
>>>         row.putColumn(new
>>> SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>>                                 "indexed_tenant_id"), tenantId);
>>> }
>>>
>>> Hope that helps,
>>> Paul
>>>
>>>
>

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Nate McCall <na...@thelastpickle.com>.

https://github.com/Netflix/astyanax/issues/391

I've gotten in touch with a couple of netflix folks and they are going to
try to roll a release shortly.

You should be able to build against 1.2.2, and 'talking' to a 1.2.9 instance
should work. It's just a PITA development-wise to maintain different
version(s).


On Fri, Sep 13, 2013 at 10:52 AM, Keith Freeman <8f...@gmail.com> wrote:

> Paul-  Sorry to go off-list but I'm diving pretty far into details here.
>  Ignore if you wish.
>
> Thanks a lot for the example, definitely very helpful.  I'm surprised that
> the Cassandra experts aren't more interested-in/alarmed-by our results, it
> seems like we've proved that insert performance for wide rows in CQL is
> enormously worse than it was before CQL.  And I have a feeling 2.0 won't
> help much -- I'm already using entirely-prepared batches.
>
> To reproduce your example, I switched to cassandra 1.2.6  and astyanax
> 1.56.42.  But anything I try to do with that version combination gives me
> an exception on the client side (e.g. execute() on a query):
>
>> 13-09-13 15:42:42.511 [pool-6-thread-1] ERROR
>> c.n.a.t.ThriftSyncConnectionFactoryImpl - Error creating connection
>> java.lang.NoSuchMethodError: org.apache.cassandra.thrift.TBinaryProtocol:
>> method <init>(Lorg/apache/thrift/transport/TTransport;)V not found
>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.open(ThriftSyncConnectionFactoryImpl.java:195)
>> ~[astyanax-thrift-1.56.37.jar:na]
>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection$1.run(ThriftSyncConnectionFactoryImpl.java:232)
>> [astyanax-thrift-1.56.37.jar:na]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> [na:1.7.0_07]
>>
> From my googling this is due to a cassandra API change in TBinaryProtocol,
> which is why I had to use cassandra 1.2.5 jars to get my astyanax client to
> work at all in my earlier experiments. Did you encounter this?  Also, you
> had 1.2.8 in the stackoverflow post, but 1.2.6 in this email, did you have
> to rollback?
>
> Thanks for any help you can offer, hope I can return the favor at some
> point.
>
>
>
> On 09/12/2013 02:26 PM, Paul Cichonski wrote:
>
>> I'm running Cassandra 1.2.6 without compact storage on my tables. The
>> trick is making your Astyanax (I'm running 1.56.42) mutation work with the
>> CQL table definition (this is definitely a bit of a hack since most of the
>> advice says don't mix the CQL and Thrift APIs so it is your call on how far
>> you want to go). If you want to still try and test it out you need to
>> leverage the Astyanax CompositeColumn construct to make it work (
>> https://github.com/Netflix/astyanax/wiki/Composite-columns)
>>
>> I've provided a slightly modified version of what I am doing below:
>>
>> CQL table def:
>>
>> CREATE TABLE standard_subscription_index
>> (
>>         subscription_type text,
>>         subscription_target_id text,
>>         entitytype text,
>>         entityid int,
>>         creationtimestamp timestamp,
>>         indexed_tenant_id uuid,
>>         deleted boolean,
>>      PRIMARY KEY ((subscription_type, subscription_target_id),
>> entitytype, entityid)
>> )
>>
>> ColumnFamily definition:
>>
>> private static final ColumnFamily<SubscriptionIndexCompositeKey,
>> SubscribingEntityCompositeColumn> COMPOSITE_ROW_COLUMN = new
>> ColumnFamily<SubscriptionIndexCompositeKey,
>> SubscribingEntityCompositeColumn>(
>>         SUBSCRIPTION_CF_NAME, new
>> AnnotatedCompositeSerializer<SubscriptionIndexCompositeKey>(SubscriptionIndexCompositeKey.class),
>>         new
>> AnnotatedCompositeSerializer<SubscribingEntityCompositeColumn>(SubscribingEntityCompositeColumn.class));
>>
>>
>> SubscriptionIndexCompositeKey is a class that contains the fields from
>> the row key (e.g., subscription_type, subscription_target_id), and
>> SubscribingEntityCompositeColumn contains the fields from the
>> composite column (as it would look if you view your data using
>> Cassandra-cli), so: entityType, entityId, columnName. The columnName field
>> is the tricky part as it defines what to interpret the column value as
>> (i.e., if it is a value for the creationtimestamp the column might be
>> "someEntityType:4:creationtimestamp"
>>
>> The actual mutation looks something like this:
>>
>> final MutationBatch mutation = getKeyspace().prepareMutationBatch();
>> final ColumnListMutation<SubscribingEntityCompositeColumn> row =
>> mutation.withRow(COMPOSITE_ROW_COLUMN,
>>                 new SubscriptionIndexCompositeKey(targetEntityType.getName(),
>> targetEntityId));
>>
>> for (Subscription sub : subs) {
>>         row.putColumn(new
>> SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>                                 "creationtimestamp"),
>> sub.getCreationTimestamp());
>>         row.putColumn(new
>> SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>                                 "deleted"), sub.isDeleted());
>>         row.putColumn(new
>> SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>                                 "indexed_tenant_id"), tenantId);
>> }
>>
>> Hope that helps,
>> Paul
>>
>>

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

Paul-  Sorry to go off-list but I'm diving pretty far into details 
here.  Ignore if you wish.

Thanks a lot for the example, definitely very helpful.  I'm surprised 
that the Cassandra experts aren't more interested-in/alarmed-by our 
results, it seems like we've proved that insert performance for wide 
rows in CQL is enormously worse than it was before CQL.  And I have a 
feeling 2.0 won't help much -- I'm already using entirely-prepared batches.

To reproduce your example, I switched to cassandra 1.2.6  and astyanax 
1.56.42.  But anything I try to do with that version combination gives 
me an exception on the client side (e.g. execute() on a query):
> 13-09-13 15:42:42.511 [pool-6-thread-1] ERROR 
> c.n.a.t.ThriftSyncConnectionFactoryImpl - Error creating connection
> java.lang.NoSuchMethodError: 
> org.apache.cassandra.thrift.TBinaryProtocol: method 
> <init>(Lorg/apache/thrift/transport/TTransport;)V not found
>     at 
> com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.open(ThriftSyncConnectionFactoryImpl.java:195) 
> ~[astyanax-thrift-1.56.37.jar:na]
>     at 
> com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection$1.run(ThriftSyncConnectionFactoryImpl.java:232) 
> [astyanax-thrift-1.56.37.jar:na]
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_07]

From my googling, this is due to a cassandra API change in
TBinaryProtocol, which is why I had to use cassandra 1.2.5 jars to get
my astyanax client to work at all in my earlier experiments. Did you
encounter this?  Also, you had 1.2.8 in the stackoverflow post, but
1.2.6 in this email; did you have to roll back?
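
One way to check for this kind of mismatch before opening a connection is to probe the classpath with reflection. The sketch below is an assumption-laden illustration, not part of Astyanax or Cassandra: ConstructorProbe and hasConstructor are hypothetical names, and it only tests for the single-argument constructor named in the trace above (a constructor taking one TTransport).

```java
// Hypothetical helper: reports whether a class on the current classpath
// exposes a public constructor with exactly the given parameter types.
final class ConstructorProbe {

    static boolean hasConstructor(String className, String... paramClassNames) {
        try {
            Class<?> target = Class.forName(className);
            Class<?>[] params = new Class<?>[paramClassNames.length];
            for (int i = 0; i < paramClassNames.length; i++) {
                params[i] = Class.forName(paramClassNames[i]);
            }
            // Throws NoSuchMethodException if no matching public constructor exists.
            target.getConstructor(params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, constructor missing, or class failed to link.
            return false;
        }
    }

    public static void main(String[] args) {
        // The constructor the stack trace says could not be found.
        boolean ok = hasConstructor(
                "org.apache.cassandra.thrift.TBinaryProtocol",
                "org.apache.thrift.transport.TTransport");
        System.out.println("TBinaryProtocol(TTransport) available: " + ok);
    }
}
```

If this prints false with the cassandra-thrift jar on the classpath, the client library was most likely compiled against a different version of that jar than the one being loaded.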

Thanks for any help you can offer, hope I can return the favor at some 
point.


On 09/12/2013 02:26 PM, Paul Cichonski wrote:
> I'm running Cassandra 1.2.6 without compact storage on my tables. The trick is making your Astyanax (I'm running 1.56.42) mutation work with the CQL table definition (this is definitely a bit of a hack since most of the advice says don't mix the CQL and Thrift APIs so it is your call on how far you want to go). If you want to still try and test it out you need to leverage the Astyanax CompositeColumn construct to make it work (https://github.com/Netflix/astyanax/wiki/Composite-columns)
>
> I've provided a slightly modified version of what I am doing below:
>
> CQL table def:
>
> CREATE TABLE standard_subscription_index
> (
>   	subscription_type text,
> 	subscription_target_id text,
> 	entitytype text,
> 	entityid int,
> 	creationtimestamp timestamp,
> 	indexed_tenant_id uuid,
> 	deleted boolean,
>      PRIMARY KEY ((subscription_type, subscription_target_id), entitytype, entityid)
> )
>
> ColumnFamily definition:
>
> private static final ColumnFamily<SubscriptionIndexCompositeKey, SubscribingEntityCompositeColumn> COMPOSITE_ROW_COLUMN = new ColumnFamily<SubscriptionIndexCompositeKey, SubscribingEntityCompositeColumn>(
> 	SUBSCRIPTION_CF_NAME, new AnnotatedCompositeSerializer<SubscriptionIndexCompositeKey>(SubscriptionIndexCompositeKey.class),
> 	new AnnotatedCompositeSerializer<SubscribingEntityCompositeColumn>(SubscribingEntityCompositeColumn.class));
>
>
> SubscriptionIndexCompositeKey is a class that contains the fields from the row key (e.g., subscription_type, subscription_target_id), and SubscribingEntityCompositeColumn contains the fields from the composite column (as it would look if you view your data using Cassandra-cli), so: entityType, entityId, columnName. The columnName field is the tricky part as it defines what to interpret the column value as (i.e., if it is a value for the creationtimestamp the column might be "someEntityType:4:creationtimestamp"
>
> The actual mutation looks something like this:
>
> final MutationBatch mutation = getKeyspace().prepareMutationBatch();
> final ColumnListMutation<SubscribingEntityCompositeColumn> row = mutation.withRow(COMPOSITE_ROW_COLUMN,
> 		new SubscriptionIndexCompositeKey(targetEntityType.getName(), targetEntityId));
>
> for (Subscription sub : subs) {
> 	row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
> 				"creationtimestamp"), sub.getCreationTimestamp());
> 	row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
> 				"deleted"), sub.isDeleted());
> 	row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
> 				"indexed_tenant_id"), tenantId);
> }
>
> Hope that helps,
> Paul
>

RE: heavy insert load overloads CPUs, with MutationStage pending

Posted by Paul Cichonski <pa...@lithium.com>.

I'm running Cassandra 1.2.6 without compact storage on my tables. The trick is making your Astyanax mutation (I'm running 1.56.42) work with the CQL table definition. This is definitely a bit of a hack, since most of the advice says not to mix the CQL and Thrift APIs, so it is your call on how far you want to go. If you still want to test it out, you need to use the Astyanax CompositeColumn construct to make it work (https://github.com/Netflix/astyanax/wiki/Composite-columns).

I've provided a slightly modified version of what I am doing below:

CQL table def:

CREATE TABLE standard_subscription_index
(
	subscription_type text,
	subscription_target_id text,
	entitytype text,
	entityid int,
	creationtimestamp timestamp,
	indexed_tenant_id uuid,
	deleted boolean,
	PRIMARY KEY ((subscription_type, subscription_target_id), entitytype, entityid)
);

ColumnFamily definition:

private static final ColumnFamily<SubscriptionIndexCompositeKey, SubscribingEntityCompositeColumn> COMPOSITE_ROW_COLUMN = new ColumnFamily<SubscriptionIndexCompositeKey, SubscribingEntityCompositeColumn>(
	SUBSCRIPTION_CF_NAME, new AnnotatedCompositeSerializer<SubscriptionIndexCompositeKey>(SubscriptionIndexCompositeKey.class),
	new AnnotatedCompositeSerializer<SubscribingEntityCompositeColumn>(SubscribingEntityCompositeColumn.class));


SubscriptionIndexCompositeKey is a class that contains the fields from the row key (i.e., subscription_type, subscription_target_id), and SubscribingEntityCompositeColumn contains the fields from the composite column (as it would look if you viewed your data using cassandra-cli), so: entityType, entityId, columnName. The columnName field is the tricky part, as it defines how to interpret the column value (e.g., if it is a value for creationtimestamp, the column might be "someEntityType:4:creationtimestamp").
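To make the composite naming above concrete, here is a minimal sketch in plain Java (no Astyanax dependency). The class and method names are hypothetical and exist only for illustration; the sketch just renders the names the way cassandra-cli would display them, assuming the three non-key columns from the table definition, and ignores any bookkeeping cells Cassandra itself may write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: shows how one CQL row in
// standard_subscription_index fans out into several Thrift-level columns
// whose names are composites of (entitytype, entityid, CQL column name).
class CompositeNameSketch {

    // Renders a composite column name in the "a:b:c" form that
    // cassandra-cli displays, e.g. "someEntityType:4:creationtimestamp".
    static String columnName(String entityType, int entityId, String cqlColumn) {
        return entityType + ":" + entityId + ":" + cqlColumn;
    }

    // One CQL row (one entitytype/entityid pair) becomes one Thrift
    // column per non-key CQL column.
    static List<String> cellsForRow(String entityType, int entityId) {
        String[] nonKeyColumns = {"creationtimestamp", "deleted", "indexed_tenant_id"};
        List<String> names = new ArrayList<>();
        for (String column : nonKeyColumns) {
            names.add(columnName(entityType, entityId, column));
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : cellsForRow("someEntityType", 4)) {
            System.out.println(name);
        }
    }
}
```

This is why the mutation needs three putColumn calls per subscription: each call writes one of these composite-named columns.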

The actual mutation looks something like this:

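// Build one batch covering the whole wide row, so it goes out as a single
// Thrift batch_mutate call instead of many CQL INSERTs.
final MutationBatch mutation = getKeyspace().prepareMutationBatch();
final ColumnListMutation<SubscribingEntityCompositeColumn> row = mutation.withRow(COMPOSITE_ROW_COLUMN,
		new SubscriptionIndexCompositeKey(targetEntityType.getName(), targetEntityId));

// Each subscription (one CQL row) becomes one composite column per non-key CQL column.
for (Subscription sub : subs) {
	row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
				"creationtimestamp"), sub.getCreationTimestamp());
	row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
				"deleted"), sub.isDeleted());
	row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
				"indexed_tenant_id"), tenantId);
}

// Send the batch. This call was not part of the snippet as posted and is assumed here;
// MutationBatch.execute() throws ConnectionException, which the caller must handle.
mutation.execute();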

Hope that helps,
Paul


Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

Ok, your results are pretty impressive, I'm giving it a try.  I've made 
some initial attempts to use Astyanax 1.56.37, but have run into some 
trouble:

   - it's not compatible with 1.2.8 client-side (NoSuchMethodErrors on 
org.apache.cassandra.thrift.TBinaryProtocol, which changed its 
signature since 1.2.5)
   - even switching to C* 1.2.5 servers, it's been difficult getting 
simple examples to work unless I use CFs that have "WITH COMPACT STORAGE"

How did you handle these problems?  How much effort did it take you to 
switch from datastax to astyanax?

I feel like I'm getting lost in a pretty deep rabbit-hole here.

On 09/11/2013 03:03 PM, Paul Cichonski wrote:
> I was reluctant to use the thrift as well, and I spent about a week trying to get the CQL inserts to work by partitioning the INSERTS in different ways and tuning the cluster.
>
> However, nothing worked remotely as well as the batch_mutate when it came to writing a full wide-row at once. I think Cassandra 2.0 makes CQL work better for these cases (CASSANDRA-4693), but I haven't tested it yet.
>
> -Paul
>
>> -----Original Message-----
>> From: Keith Freeman [mailto:8forty@gmail.com]
>> Sent: Wednesday, September 11, 2013 1:06 PM
>> To: user@cassandra.apache.org
>> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
>>
>> Thanks, I had seen your stackoverflow post.  I've got hundreds of
>> (wide-) rows, and the writes are pretty well distributed across them.
>> I'm very reluctant to drop back to the thrift interface.
>>
>> On 09/11/2013 10:46 AM, Paul Cichonski wrote:
>>> How much of the data you are writing is going against the same row key?
>>>
>>> I've experienced some issues using CQL to write a full wide-row at once
>> (across multiple threads) that exhibited some of the symptoms you have
>> described (i.e., high cpu, dropped mutations).
>>> This question goes into it a bit
>> more:http://stackoverflow.com/questions/18522191/using-cassandra-and-
>> cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque  . I was able to
>> solve my issue by switching to using the thrift batch_mutate to write a full
>> wide-row at once instead of using many CQL INSERT statements.
>>> -Paul
>>>
>>>> -----Original Message-----
>>>> From: Keith Freeman [mailto:8forty@gmail.com]
>>>> Sent: Wednesday, September 11, 2013 9:16 AM
>>>> To:user@cassandra.apache.org
>>>> Subject: Re: heavy insert load overloads CPUs, with MutationStage
>>>> pending
>>>>
>>>>
>>>> On 09/10/2013 11:42 AM, Nate McCall wrote:
>>>>> With SSDs, you can turn up memtable_flush_writers - try 3 initially
>>>>> (1 by default) and see what happens. However, given that there are
>>>>> no entries in 'All time blocked' for such, they may be something else.
>>>> Tried that, it seems to have reduced the loads a little after
>>>> everything warmed-up, but not much.
>>>>> How are you inserting the data?
>>>> A java client on a separate box using the datastax java driver, 48
>>>> threads writing 100 records each iteration as prepared batch statements.
>>>>
>>>> At 5000 records/sec, the servers just can't keep up, so the client backs up.
>>>> That's only 5M of data/sec, which doesn't seem like much.  As I
>>>> mentioned, switching to SSDs didn't help much, so I'm assuming at
>>>> this point that the server overloads are what's holding up the client.

RE: heavy insert load overloads CPUs, with MutationStage pending

Posted by Paul Cichonski <pa...@lithium.com>.

I was reluctant to use Thrift as well, and I spent about a week trying to get the CQL inserts to work by partitioning the INSERTs in different ways and tuning the cluster.

However, nothing worked remotely as well as the batch_mutate when it came to writing a full wide-row at once. I think Cassandra 2.0 makes CQL work better for these cases (CASSANDRA-4693), but I haven't tested it yet.

-Paul

> -----Original Message-----
> From: Keith Freeman [mailto:8forty@gmail.com]
> Sent: Wednesday, September 11, 2013 1:06 PM
> To: user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
> 
> Thanks, I had seen your stackoverflow post.  I've got hundreds of
> (wide-) rows, and the writes are pretty well distributed across them.
> I'm very reluctant to drop back to the thrift interface.
> 
> On 09/11/2013 10:46 AM, Paul Cichonski wrote:
> > How much of the data you are writing is going against the same row key?
> >
> > I've experienced some issues using CQL to write a full wide-row at once
> (across multiple threads) that exhibited some of the symptoms you have
> described (i.e., high cpu, dropped mutations).
> >
> > This question goes into it a bit
> more:http://stackoverflow.com/questions/18522191/using-cassandra-and-
> cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque  . I was able to
> solve my issue by switching to using the thrift batch_mutate to write a full
> wide-row at once instead of using many CQL INSERT statements.
> >
> > -Paul
> >
> >> -----Original Message-----
> >> From: Keith Freeman [mailto:8forty@gmail.com]
> >> Sent: Wednesday, September 11, 2013 9:16 AM
> >> To:user@cassandra.apache.org
> >> Subject: Re: heavy insert load overloads CPUs, with MutationStage
> >> pending
> >>
> >>
> >> On 09/10/2013 11:42 AM, Nate McCall wrote:
> >>> With SSDs, you can turn up memtable_flush_writers - try 3 initially
> >>> (1 by default) and see what happens. However, given that there are
> >>> no entries in 'All time blocked' for such, they may be something else.
> >> Tried that, it seems to have reduced the loads a little after
> >> everything warmed-up, but not much.
> >>> How are you inserting the data?
> >> A java client on a separate box using the datastax java driver, 48
> >> threads writing 100 records each iteration as prepared batch statements.
> >>
> >> At 5000 records/sec, the servers just can't keep up, so the client backs up.
> >> That's only 5M of data/sec, which doesn't seem like much.  As I
> >> mentioned, switching to SSDs didn't help much, so I'm assuming at
> >> this point that the server overloads are what's holding up the client.

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

Thanks, I had seen your stackoverflow post.  I've got hundreds of 
(wide-) rows, and the writes are pretty well distributed across them.  
I'm very reluctant to drop back to the thrift interface.

On 09/11/2013 10:46 AM, Paul Cichonski wrote:
> How much of the data you are writing is going against the same row key?
>
> I've experienced some issues using CQL to write a full wide-row at once (across multiple threads) that exhibited some of the symptoms you have described (i.e., high cpu, dropped mutations).
>
> This question goes into it a bit more:http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque  . I was able to solve my issue by switching to using the thrift batch_mutate to write a full wide-row at once instead of using many CQL INSERT statements.
>
> -Paul
>
>> -----Original Message-----
>> From: Keith Freeman [mailto:8forty@gmail.com]
>> Sent: Wednesday, September 11, 2013 9:16 AM
>> To:user@cassandra.apache.org
>> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
>>
>>
>> On 09/10/2013 11:42 AM, Nate McCall wrote:
>>> With SSDs, you can turn up memtable_flush_writers - try 3 initially (1
>>> by default) and see what happens. However, given that there are no
>>> entries in 'All time blocked' for such, they may be something else.
>> Tried that, it seems to have reduced the loads a little after everything
>> warmed-up, but not much.
>>> How are you inserting the data?
>> A java client on a separate box using the datastax java driver, 48 threads
>> writing 100 records each iteration as prepared batch statements.
>>
>> At 5000 records/sec, the servers just can't keep up, so the client backs up.
>> That's only 5M of data/sec, which doesn't seem like much.  As I mentioned,
>> switching to SSDs didn't help much, so I'm assuming at this point that the
>> server overloads are what's holding up the client.

RE: heavy insert load overloads CPUs, with MutationStage pending

Posted by Paul Cichonski <pa...@lithium.com>.

How much of the data you are writing is going against the same row key? 

I've experienced some issues using CQL to write a full wide-row at once (across multiple threads) that exhibited some of the symptoms you have described (i.e., high CPU, dropped mutations).

This question goes into it a bit more: http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque . I was able to solve my issue by switching to using the thrift batch_mutate to write a full wide-row at once instead of using many CQL INSERT statements. 

-Paul

> -----Original Message-----
> From: Keith Freeman [mailto:8forty@gmail.com]
> Sent: Wednesday, September 11, 2013 9:16 AM
> To: user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
> 
> 
> On 09/10/2013 11:42 AM, Nate McCall wrote:
> > With SSDs, you can turn up memtable_flush_writers - try 3 initially (1
> > by default) and see what happens. However, given that there are no
> > entries in 'All time blocked' for such, they may be something else.
> Tried that, it seems to have reduced the loads a little after everything
> warmed-up, but not much.
> >
> > How are you inserting the data?
> 
> A java client on a separate box using the datastax java driver, 48 threads
> writing 100 records each iteration as prepared batch statements.
> 
> At 5000 records/sec, the servers just can't keep up, so the client backs up.
> That's only 5M of data/sec, which doesn't seem like much.  As I mentioned,
> switching to SSDs didn't help much, so I'm assuming at this point that the
> server overloads are what's holding up the client.

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

On 09/10/2013 11:42 AM, Nate McCall wrote:
> With SSDs, you can turn up memtable_flush_writers - try 3 initially (1 
> by default) and see what happens. However, given that there are no 
> entries in 'All time blocked' for such, they may be something else.
Tried that, it seems to have reduced the loads a little after everything 
warmed-up, but not much.
>
> How are you inserting the data?

A java client on a separate box using the datastax java driver, 48 
threads writing 100 records each iteration as prepared batch statements.

At 5000 records/sec, the servers just can't keep up, so the client backs 
up.  That's only about 5 MB of data/sec, which doesn't seem like much.  
As I mentioned, switching to SSDs didn't help much, so I'm assuming at 
this point that the server overloads are what's holding up the client.

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Nate McCall <na...@thelastpickle.com>.

With SSDs, you can turn up memtable_flush_writers - try 3 initially (1 by
default) and see what happens. However, given that there are no entries in
'All time blocked' for the flush writers, the cause may be something else.

How are you inserting the data?


On Tue, Sep 10, 2013 at 12:40 PM, Keith Freeman <8f...@gmail.com> wrote:

>
> On 09/10/2013 11:17 AM, Robert Coli wrote:
>
> On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8f...@gmail.com> wrote:
>
>> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog
>> and data
>
>
>  On SSD, you don't need to separate commitlog and data. You only win from
> this separation if you have a head to not-move between appends to the
> commit log. You will get better IO from a strip with an additional SSD.
>
> Right, actually both partitions are on the same SSD.   Assuming you meant
> "stripe", would that really make a difference
>
>
>
>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>> MutationStage                     1         9         290394         0                 0
>> FlushWriter                       1         2             20         0                 0
>>
>
>>  I can't seem find information about the real meaning of MutationStage,
>> is this just normal for lots of inserts?
>>
>
>  The mutation stage is the stage in which mutations to rows in memtables
> ("writes") occur.
>
>  The FlushWriter stage is the stage that turns memtables into SSTables by
> flushing them.
>
>  However, 9 pending mutations is a very small number. For reference on an
> overloaded cluster which was being written to death I recently saw....
> 1216434 pending MutationStage. What problem other than "high CPU load" are
> you experiencing? 2 Pending FlushWriters is slightly suggestive of some
> sort of bound related to flushing..
>
> So the basic problem is that write performance is lower than I expected.
> I can't get sustained writing of 5000 ~1024-byte records / sec at RF=2 on a
> good 3-node cluster, and my only guess is that's because of the heavy CPU
> loads on the server (loads over 10 on 4-CPU systems).  I've tried both a
> single client writing 5000 rows/second and 2 clients (on separate boxes)
> writing 2500 rows/second, and in both cases the server(s) doesn't respond
> quickly enough to maintain that rate.  It keeps up ok with 2000 or 3000
> rows per second (and has lower server loads).
>
>
>

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

On 09/10/2013 11:17 AM, Robert Coli wrote:
> On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8forty@gmail.com> wrote:
>
>     On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for
>     commitlog and data
>
>
> On SSD, you don't need to separate commitlog and data. You only win 
> from this separation if you have a head to not-move between appends to 
> the commit log. You will get better IO from a strip with an additional 
> SSD.
Right, actually both partitions are on the same SSD.  Assuming you 
meant "stripe", would that really make a difference?
>
>         Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>         MutationStage                     1         9         290394         0                 0
>         FlushWriter                       1         2             20         0                 0
>
>     I can't seem find information about the real meaning of
>     MutationStage, is this just normal for lots of inserts?
>
>
> The mutation stage is the stage in which mutations to rows in 
> memtables ("writes") occur.
>
> The FlushWriter stage is the stage that turns memtables into SSTables 
> by flushing them.
>
> However, 9 pending mutations is a very small number. For reference on 
> an overloaded cluster which was being written to death I recently 
> saw.... 1216434 pending MutationStage. What problem other than "high 
> CPU load" are you experiencing? 2 Pending FlushWriters is slightly 
> suggestive of some sort of bound related to flushing..
So the basic problem is that write performance is lower than I 
expected.  I can't sustain writes of 5000 ~1024-byte records/sec at 
RF=2 on a good 3-node cluster, and my only guess is that it's because 
of the heavy CPU load on the servers (load averages over 10 on 4-CPU 
systems).  I've tried both a single client writing 5000 rows/second and 
2 clients (on separate boxes) each writing 2500 rows/second, and in both 
cases the servers don't respond quickly enough to maintain that rate.  
The cluster keeps up OK with 2000 or 3000 rows per second (and has lower 
server loads).
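
For reference, the load described above can be worked out with some back-of-the-envelope arithmetic. This is only a sketch: the class and method names are hypothetical, and it assumes writes are spread evenly across the nodes and counts payload bytes only (no commitlog, compaction, or protocol overhead).

```java
// Rough write-load arithmetic for the figures quoted in this thread:
// 5000 records/sec, ~1024 bytes each, RF=2, 3 nodes.
class WriteLoadSketch {

    // Every client write is applied on RF replicas.
    static long replicaWritesPerSec(long clientWritesPerSec, int replicationFactor) {
        return clientWritesPerSec * replicationFactor;
    }

    // Average replica writes each node must apply, assuming even distribution.
    static double perNodeWritesPerSec(long clientWritesPerSec, int replicationFactor, int nodes) {
        return (double) replicaWritesPerSec(clientWritesPerSec, replicationFactor) / nodes;
    }

    // Raw payload bytes per second leaving the client.
    static long clientBytesPerSec(long clientWritesPerSec, int recordBytes) {
        return clientWritesPerSec * recordBytes;
    }

    public static void main(String[] args) {
        System.out.println("replica writes/sec (cluster): " + replicaWritesPerSec(5000, 2));
        System.out.println("replica writes/sec (per node): " + perNodeWritesPerSec(5000, 2, 3));
        System.out.println("client payload bytes/sec: " + clientBytesPerSec(5000, 1024));
    }
}
```

Under these assumptions each node applies roughly 3300 mutations per second and the client sends about 5 MB of payload per second, which is consistent with the observation that the disks are not the limiting factor.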

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Keith Freeman <8f...@gmail.com>.

I have RF=2

On 09/10/2013 11:18 AM, Robert Coli wrote:
> On Tue, Sep 10, 2013 at 10:17 AM, Robert Coli <rcoli@eventbrite.com> wrote:
>
>     On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8forty@gmail.com> wrote:
>
>         On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for
>         commitlog and data
>
>
> BTW, is RF=3? If so, you effectively have a 1 node cluster while writing.
>
> =Rob

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Sep 10, 2013 at 10:17 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8f...@gmail.com> wrote:
>
>> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog
>> and data
>
>
BTW, is RF=3? If so, you effectively have a 1 node cluster while writing.

=Rob

Re: heavy insert load overloads CPUs, with MutationStage pending

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8f...@gmail.com> wrote:

> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and
> data

On SSD, you don't need to separate commitlog and data. You only win from
this separation if you have a disk head that you want to keep from moving
between appends to the commit log. You will get better IO from a strip with
an additional SSD.

> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     1         9         290394         0                 0
> FlushWriter                       1         2             20         0                 0

> I can't seem find information about the real meaning of MutationStage, is
> this just normal for lots of inserts?
>

The mutation stage is the stage in which mutations to rows in memtables
("writes") occur.

The FlushWriter stage is the stage that turns memtables into SSTables by
flushing them.

However, 9 pending mutations is a very small number. For reference, on an
overloaded cluster which was being written to death, I recently saw
1216434 pending in MutationStage. What problem other than "high CPU load" are
you experiencing? 2 pending FlushWriter tasks is slightly suggestive of some
sort of bound related to flushing.
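
If you want to watch these numbers over time rather than eyeball them, a small parser is enough. The following is a rough sketch with hypothetical names; it assumes the six-column tpstats layout quoted in this thread (pool name, Active, Pending, Completed, Blocked, All time blocked), and other Cassandra versions may print a different layout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Extracts the Pending count per thread pool from tpstats-style text.
class TpstatsSketch {

    static Map<String, Long> pendingByPool(String tpstatsOutput) {
        Map<String, Long> pending = new LinkedHashMap<>();
        for (String line : tpstatsOutput.split("\n")) {
            String[] tokens = line.trim().split("\\s+");
            // Data rows have exactly six whitespace-separated fields;
            // the header and blank lines do not, so they are skipped.
            if (tokens.length != 6) {
                continue;
            }
            try {
                pending.put(tokens[0], Long.parseLong(tokens[2]));
            } catch (NumberFormatException e) {
                // Six tokens but not numeric: not a data row, ignore it.
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        String sample =
            "Pool Name                    Active   Pending      Completed   Blocked  All time blocked\n"
          + "MutationStage                     1         9         290394         0                 0\n"
          + "FlushWriter                       1         2             20         0                 0\n";
        pendingByPool(sample).forEach((pool, count) -> {
            if (count > 0) {
                System.out.println(pool + " pending=" + count);
            }
        });
    }
}
```

Sampling this every few seconds shows whether Pending keeps growing (a real backlog) or just hovers at small values like the ones above.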

> Also, switching from spinning disks to SSDs didn't seem to significantly
> improve insert performance, so it seems clear my use-case it totally
> CPU-bound.  Cassandra docs say "Insert-heavy workloads are CPU-bound in
> Cassandra before becoming memory-bound.", so I guess that's what I'm
> seeing, but there's no explanation. So I'm wonder what's overloading my
> CPUs, and is there anything I can do about it short of adding more nodes?
>

Insert performance is pretty optimized from an I/O perspective. There is
probably not too much you can do. You can disable durability guarantees if
you truly require insert performance at all costs.

That said, the percentage of people running Cassandra on SSDs is still
relatively low. It is likely that performance improvements wrt CPU usage
are possible.

=Rob