Posted to user@cassandra.apache.org by Jeremy Dunck <jd...@gmail.com> on 2010/05/12 04:01:18 UTC

Timed out reads still in queue

Reddit posted a blog entry about some recent downtime, partially due
to issues with Cassandra.
http://blog.reddit.com/2010/05/reddits-may-2010-state-of-servers.html

This part surprised me:
"
First, Cassandra has an internal queue of work to do. When it times
out a client (10s by default), it still leaves the operation in the
queue of work to complete (even though the person that asked for the
read is no longer even holding the socket), which given a constant
stream of requests makes the amount of pending work snowball
effectively infinitely (specifically, ROW-READ-STAGE's PENDING
operations grow unbounded).
"

I searched Jira for a related issue and didn't find one. It seems
like a bug to keep reads in the queue when the result is useless
(because the reader is gone).  Obviously a 10-second read is not a
normal operating condition, but dropping stale reads could remove one
cause of cascading failure.
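
To make the idea concrete, here is a minimal sketch of dropping stale
reads at dequeue time. It is a toy model, not Cassandra's actual code:
the names ReadRequest, ReadStage, submit and process_next, and the
injectable clock, are all hypothetical and exist only for illustration.
The only value taken from the blog post is the 10-second client timeout.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Optional


@dataclass
class ReadRequest:
    key: str
    enqueued_at: float  # clock reading taken when the request was queued


class ReadStage:
    """Toy stand-in for a read stage (hypothetical, not Cassandra's API)."""

    def __init__(self, timeout_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s  # assumed equal to the client timeout
        self.clock = clock
        self.pending: Deque[ReadRequest] = deque()
        self.dropped = 0  # count of requests discarded as stale

    def submit(self, key: str) -> None:
        """Queue a read, stamping it with the current time."""
        self.pending.append(ReadRequest(key, self.clock()))

    def process_next(self) -> Optional[str]:
        """Return the key of the next live request, or None if none remain.

        Requests that have waited at least timeout_s are discarded without
        doing any work, since the client has already given up on them.
        """
        while self.pending:
            req = self.pending.popleft()
            if self.clock() - req.enqueued_at >= self.timeout_s:
                self.dropped += 1
                continue
            return req.key  # placeholder for performing the real read
        return None
```

The point of checking at dequeue time is that a stale request costs
only a timestamp comparison rather than a full read, so the pending
queue can drain even after a stall instead of snowballing.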

Should I open a ticket, or have I misunderstood something?

Re: Timed out reads still in queue

Posted by Jonathan Ellis <jb...@gmail.com>.
This is a slightly different way of describing
https://issues.apache.org/jira/browse/CASSANDRA-685

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com