Posted to commits@cassandra.apache.org by "Ariel Weisberg (JIRA)" <ji...@apache.org> on 2015/11/12 17:05:11 UTC

[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

    [ https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002303#comment-15002303 ] 

Ariel Weisberg commented on CASSANDRA-7217:
-------------------------------------------

I was able to reproduce this running the server on my OS X laptop and the client on my quad-core i5 Sandy Bridge Linux desktop.

With 500 threads I was getting 80k op/sec and with 2000 I was getting 30k op/sec.

I took flight recordings, but they are too big to look at and not that interesting. There is more contention detected with a 1 millisecond threshold at 500 threads than at 2000 threads, presumably because with 500 threads so much more work is getting done.

CPU utilization at the client is pretty high at 500 threads, above 300%, with 18k interrupts/second and 140k context switches/second.

With 2000 threads utilization is lower, closer to 250%, with closer to 10k interrupts/second but 250-300k context switches/second.
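
As a rough back-of-the-envelope check, dividing the client's context-switch rate by its throughput gives the number of context switches per operation. This is only a sketch using the figures quoted in this comment; the function name is made up for illustration and the inputs are the approximate readings above.

```python
def context_switches_per_op(switches_per_sec: float, ops_per_sec: float) -> float:
    """Approximate context switches incurred per completed operation."""
    return switches_per_sec / ops_per_sec


# 500 client threads: ~140k context switches/second at ~80k op/sec
at_500 = context_switches_per_op(140_000, 80_000)

# 2000 client threads: ~250k-300k context switches/second at ~30k op/sec
at_2000_low = context_switches_per_op(250_000, 30_000)
at_2000_high = context_switches_per_op(300_000, 30_000)

print(f"500 threads:  {at_500:.2f} switches/op")  # 1.75
print(f"2000 threads: {at_2000_low:.2f} to {at_2000_high:.2f} switches/op")  # 8.33 to 10.00
```

So each operation costs roughly five to six times as many context switches at 2000 threads as at 500, even though total CPU utilization is lower.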

My hypothesis is that having so many client threads is a problem for the Netty threads because there are more client threads than event threads by a large margin. With only one server there would really only be one event thread, since there is a single connection.

In cstar on bdplab I see a sharp drop between 1000 and 1250 threads. I would have expected a graceful slope as the overhead of context switching threads increases, so there is still more to be explained.

> Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7217
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Benedict
>            Assignee: Ariel Weisberg
>              Labels: performance, stress, triaged
>             Fix For: 3.1
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)