Posted to user@cassandra.apache.org by Nathan Bijnens <na...@nathan.gs> on 2015/07/09 13:09:53 UTC

ReadStage high number of pending tasks

We are having an issue where one node out of 6 shows a steadily increasing
number of ReadStage pending tasks; on all the others it stays close to 0.

[image: graph.png]

When connecting VisualVM to Cassandra, we see that the following method
takes all the CPU time, and it is called from all the ReadStage threads:
org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:545)

[image: Capture.PNG]

Because the number of pending ReadStage tasks is so high, the node almost
stops responding.

The CPU usage on this node is much higher than on all the other ones:
[image: stacked.png]
There is also almost no disk IO going on.

When restarting Cassandra, the load drops for a second and then rises again.
The ReadStage pending tasks reset to 0, but rise fast again.

The following settings are configured:
read_request_timeout_in_ms: 60000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000

Anyone seen similar behaviour?

We are using Cassandra 2.0.14.

Thanks,
  Nathan

Re: ReadStage high number of pending tasks

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Nathan,

This "read_request_timeout_in_ms: 60000" looks pretty high to me. I imagine
that if you read heavily, any GC pause could make reads wait, and with such
a high timeout the node will keep trying to answer queries for a long
period of time while new reads stack up behind them. Depending on your
reasons for raising it, you might want to revert this to the default
(5000?). As your CPU doesn't seem to be the limiting factor and your disk
isn't the bottleneck either, you could also try increasing the
concurrent_reads parameter in the YAML file. You might also want to check
the logs for tombstone or GC warnings.
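
To make the log check above concrete, here is a minimal Python sketch that
counts log lines mentioning tombstones or GC activity. It is only an
illustration: the helper names (count_mentions, scan_log) are made up for
this example, and the keywords are assumptions, since the exact wording of
Cassandra's log messages varies between versions.

```python
import re
from collections import Counter

# Assumed keywords, not an authoritative list of Cassandra log messages.
# Adjust them to whatever your version actually writes to system.log.
PATTERNS = {
    "tombstone": re.compile(r"tombstone", re.IGNORECASE),
    "gc": re.compile(r"GCInspector"),
}


def count_mentions(lines):
    """Count how many lines match each keyword category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_log(path):
    """Run count_mentions over a log file, tolerating odd bytes."""
    with open(path, errors="replace") as handle:
        return count_mentions(handle)
```

You would call it as scan_log("/var/log/cassandra/system.log"), where the
path is an assumption and depends on how Cassandra was installed. A node
whose counts are far above those of its peers is worth a closer look.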

Distinct behaviours on distinct nodes come mainly from distinct hardware
(hardware failure?) or unbalanced load (are all your tables well
distributed across the cluster?).

C*heers,

Alain
