Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/04/29 00:41:15 UTC

[jira] [Resolved] (CASSANDRA-7103) Very poor performance with simple setup

     [ https://issues.apache.org/jira/browse/CASSANDRA-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict resolved CASSANDRA-7103.
---------------------------------

    Resolution: Invalid

[~Martin Bligh] it's helpful if you can take up these sorts of issues on the user list first, to see what can be done to address them before resorting to filing a ticket. It's likely that CASSANDRA-5417 (in the upcoming 2.1 release) would help dramatically here. In the meantime you can also batch updates (with unlogged batches) to a single cql row - that is, group all of the map inserts for the same (time_order, time_start) pair into one batch. Performing concurrent asynchronous writes to the same (large) cql row is a recipe for wasted cpu cycles: each row is copy-on-write updated, so with 16 writers updating the same row, 15 are simply burning cycles at any given moment. There may be other aspects to consider as well.
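
A minimal sketch of the batching suggestion, assuming the DataStax python driver; the keyspace name, the choice of map column and the bound values below are placeholders, not taken from the report:

{code}
from cassandra.cluster import Cluster
from cassandra.query import BatchStatement, BatchType

session = Cluster(['127.0.0.1']).connect('mykeyspace')   # keyspace name is a placeholder

# Append entries to one of the map columns of a single (time_order, time_start) row.
update = session.prepare(
    "UPDATE reut SET ack_us = ack_us + ? "
    "WHERE time_order = ? AND time_start = ?")

# Placeholder data: map<int, int> fragments all destined for the same cql row.
pending_updates = [{1: 250}, {2: 310}, {3: 275}]

# Group every update that targets the same (time_order, time_start) pair into one
# unlogged batch, so the row is rewritten once per batch rather than once per write.
batch = BatchStatement(batch_type=BatchType.UNLOGGED)
for ack_map in pending_updates:
    batch.add(update, (ack_map, 1398729600000, 1398729600123))   # placeholder key values
session.execute(batch)
{code}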



> Very poor performance with simple setup
> ---------------------------------------
>
>                 Key: CASSANDRA-7103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7103
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Fedora 19 (also happens on Ubuntu), Cassandra 2.0.7. dsc standard install
>            Reporter: Martin Bligh
>
> Single node (this is just development, 32GB 20 core server), single disk array.
> Create the following table:
> {code}
> CREATE TABLE reut (
>   time_order bigint,
>   time_start bigint,
>   ack_us map<int, int>,
>   gc_strategy map<text, int>,
>   gc_strategy_symbol map<text, int>,
>   gc_symbol map<text, int>,
>   ge_strategy map<text, int>,
>   ge_strategy_symbol map<text, int>,
>   ge_symbol map<text, int>,
>   go_strategy map<text, int>,
>   go_strategy_symbol map<text, int>,
>   go_symbol map<text, int>,
>   message_type map<text, int>,
>   PRIMARY KEY (time_order, time_start)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={};
> {code}
> Now I just insert data into it (using the python driver, async inserts, and a prepared insert statement). Each row only fills out one of the gc_*, go_*, or ge_* columns, and there's something like 20-100 entries per map column, occasionally 1000, but nothing huge.
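>
> (For reference, a minimal sketch of that kind of insert loop, assuming the DataStax python driver; the keyspace name and data below are placeholders, not the actual script from this report:)
> {code}
> from cassandra.cluster import Cluster
>
> session = Cluster(['127.0.0.1']).connect('mykeyspace')   # keyspace name is a placeholder
>
> insert = session.prepare(
>     "INSERT INTO reut (time_order, time_start, gc_strategy) VALUES (?, ?, ?)")
>
> # Placeholder rows: each carries one small text -> int map.
> rows = [(1, i, {'strategyA': i % 7}) for i in range(685)]
>
> # Fire the inserts asynchronously and only wait for the responses at the end.
> futures = [session.execute_async(insert, params) for params in rows]
> for f in futures:
>     f.result()
> {code}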
> First run 685 inserts in 1.004860 seconds (681.687053 Operations/s).
> OK, not great, but that's fine.
> Now throw 50,000 rows at it.
> Now run the first batch again, and it takes 53s to do the same insert of 685 rows - I'm getting about 10 rows per second.
> It's not IO bound - "iostat 1" shows the disk quiescent for 9 seconds, then a ~640KB write, then idle again - looks like the periodic commitlog fsync.
> Run "nodetool flush" and performance goes back to as before!!!!
> Not sure why this gets so slow - I think it just builds huge commit logs and memtables, but never flushes sstables out to the data/ directory because I only have one table? That doesn't seem like a good situation.
> Worse ... if you let the python driver just throw stuff at it async (I think this allows up to 128 requests in flight, if I understand the underlying protocol), it gets so slow that a single write takes over 10s and times out. Seems to be some sort of synchronization problem in Java ... if I limit the concurrent async requests to the number in the left column below, I get the elapsed seconds on the right:
> 1: 103 seconds
> 2: 63 seconds
> 8: 53 seconds
> 16: 53 seconds
> 32: 66 seconds
> 64: so slow it explodes in timeouts on write (over 10s each).
> I guess there's some thundering herd type locking issue in whatever Java primitive you are using to lock concurrent access to a single table. I know some of the Java concurrent.* stuff has this issue. So for the other tests above, I was limiting async writes to 16 pending.
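>
> (A sketch of one way to cap in-flight async writes at 16, assuming the DataStax python driver; everything except the limit of 16 is illustrative, with session/insert/rows as in the sketch above:)
> {code}
> import threading
>
> MAX_IN_FLIGHT = 16
> sem = threading.BoundedSemaphore(MAX_IN_FLIGHT)
>
> def release(_):
>     sem.release()
>
> for params in rows:
>     sem.acquire()                      # blocks once 16 requests are outstanding
>     future = session.execute_async(insert, params)
>     future.add_callbacks(callback=release, errback=release)
> {code}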



--
This message was sent by Atlassian JIRA
(v6.2#6252)