Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2015/08/24 17:57:46 UTC
[jira] [Resolved] (CASSANDRA-8035) 2.0.x repair causes large increase in client latency even for small datasets
[ https://issues.apache.org/jira/browse/CASSANDRA-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-8035.
---------------------------------------
Resolution: Cannot Reproduce
Fix Version/s: (was: 2.0.x)
Closing as Cannot Reproduce since 2.0 is EOL. Please reopen if you see this on 2.1+.
> 2.0.x repair causes large increase in client latency even for small datasets
> ----------------------------------------------------------------------------
>
> Key: CASSANDRA-8035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8035
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Cassandra 2.0.10, 3 nodes per DC. Load < 50 MB
> Reporter: Chris Burroughs
> Attachments: cl-latency.png, cpu-idle.png, keyspace-99p.png, row-cache-hit-rate.png
>
>
> Running repair causes a significant increase in client latency even when the total amount of data per node is very small.
> Each node serves 900 req/s, and during normal operations the 99th percentile Client Request Latency is less than 4 ms and usually less than 1 ms. During repair the latency increases to 4-10 ms on all nodes. I am unable to find any resource-based explanation for this. Several graphs are attached to summarize. Repair started at about 10:10 and finished around 10:25.
> * Client Request Latency goes up significantly.
> * Local keyspace read latency is flat. I interpret this to mean that it's purely coordinator overhead that's causing the slowdown.
> * Row cache hit rate is unaffected (and is very high). Between these two metrics I don't think there is any doubt that virtually all reads are being satisfied in memory.
> * There is plenty of available cpu. Aggregate cpu used (mostly nic) did go up during this.
> Having more/larger keyspaces seems to make it worse. Having two keyspaces on this cluster (still with total size << RAM) caused larger increases in latency, which would have made for better graphs, but it pushed the cluster well outside of SLAs and we needed to move the second keyspace.
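The inference in the report (client-facing latency rises while replica-local read latency stays flat, so the slowdown must be coordinator-side) can be sketched as a back-of-the-envelope check. This is a minimal illustration, not tooling from the ticket; the function name and the sample p99 numbers are assumptions drawn loosely from the latencies described above:

```python
# Illustrative sketch (not from the ticket): if replica-local read latency
# is flat while client-observed latency rises during repair, the difference
# is attributable to coordinator-side overhead.

def coordinator_overhead_ms(client_p99_ms, local_read_p99_ms):
    """Estimate coordinator overhead as client-observed p99 minus
    replica-local read p99, floored at zero."""
    return max(client_p99_ms - local_read_p99_ms, 0.0)

# Sample numbers loosely matching the report: ~4 ms client p99 and ~1 ms
# local read p99 in normal operation; ~10 ms client p99 during repair
# while local reads stay ~1 ms.
normal = coordinator_overhead_ms(4.0, 1.0)
during_repair = coordinator_overhead_ms(10.0, 1.0)

print(f"overhead normal: {normal:.1f} ms, during repair: {during_repair:.1f} ms")
```

With these sample inputs the overhead during repair is several times the normal value, which matches the report's conclusion that the replicas themselves are not the bottleneck.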
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)