Posted to commits@cassandra.apache.org by "Robert Rudduck (JIRA)" <ji...@apache.org> on 2014/08/18 04:48:18 UTC

[jira] [Commented] (CASSANDRA-7220) Nodes hang with 100% CPU load

    [ https://issues.apache.org/jira/browse/CASSANDRA-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100232#comment-14100232 ] 

Robert Rudduck commented on CASSANDRA-7220:
-------------------------------------------

I am experiencing the same symptoms, but do not have any special cluster settings. 4 nodes (VMs), nearly all writes with very few reads. Randomly, after a day or two, a node will appear offline in OpsCenter; when I log into the VM, the Cassandra process is running at 100% CPU but is not responding. I can provide logs if needed.

Thanks.

- Robert

> Nodes hang with 100% CPU load
> -----------------------------
>
>                 Key: CASSANDRA-7220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7220
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: C* 2.0.7
> 4 nodes cluster on 12 core machines
>            Reporter: Robert Stupp
>            Assignee: Ryan McGuire
>         Attachments: c-12-read-100perc-cpu.zip
>
>
> I ran a test that both reads and writes rows.
> After some time, all writes succeeded and all reads stopped.
> Two of the four nodes have 16 of 16 threads of the "ReadStage" thread pool running. The number of pending tasks grows continuously on these nodes.
> I have attached outputs of the stack traces and some diagnostic output from "nodetool tpstats".
> "nodetool status" shows all nodes as UN.
> I had run that test previously without any issues with the same configuration.
> Some "specials" from cassandra.yaml:
> - key_cache_size_in_mb: 1024
> - row_cache_size_in_mb: 8192
> The nodes running at 100% CPU are "node2" and "node3". node1 and node4 are fine.
> I'm not sure if it is reproducible, but it's definitely not good behaviour.



--
This message was sent by Atlassian JIRA
(v6.2#6252)