Posted to dev@cassandra.apache.org by Dominic Chevalier <dc...@gmail.com> on 2016/01/08 22:41:31 UTC

Blocking behavior of StorageProxy.fetchRows

Hello,

tl;dr:

It looks like StorageProxy.fetchRows blocking on read responses can get
pretty bad during quorum reads that involve many geographically distant data
centers. If this is true, why doesn't the coordinator handle replies
asynchronously to keep overall throughput up?

Long version:

I'm running Apache Cassandra 2.0.16 with ~400 nodes total, spread
across 5 AWS regions globally.

I tried running many (hundreds of) simultaneous paged range scans over large
token ranges, 'select * from table where token(partition_key) >= ? and
token(partition_key) < ?', at consistency level QUORUM. Replication factor
is 3. Rows are small, a few hundred bytes at most.
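
For anyone who wants to picture the scan workload, here is a rough Java sketch of how a token range could be split into contiguous sub-ranges, one per scan. It is only an illustration: the class and method names are made up, and it assumes Murmur3-style signed 64-bit tokens. It does not use any driver or Cassandra API.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra or any driver.
class TokenRangeSketch {

    // Split [start, end] into `parts` contiguous sub-ranges {from, to}.
    // BigInteger avoids overflow when the range spans the full signed 64-bit space.
    static List<long[]> split(long start, long end, int parts) {
        BigInteger lo = BigInteger.valueOf(start);
        BigInteger width = BigInteger.valueOf(end).subtract(lo);
        BigInteger n = BigInteger.valueOf(parts);
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            long from = lo.add(width.multiply(BigInteger.valueOf(i)).divide(n)).longValueExact();
            long to = lo.add(width.multiply(BigInteger.valueOf(i + 1)).divide(n)).longValueExact();
            ranges.add(new long[] {from, to});
        }
        return ranges;
    }

    // The query shape from the message; table and key names are placeholders.
    static String cql(String table, String key) {
        return "select * from " + table
             + " where token(" + key + ") >= ? and token(" + key + ") < ?";
    }

    public static void main(String[] args) {
        // Assumption: Murmur3-style tokens covering the signed 64-bit range.
        System.out.println(cql("table", "partition_key"));
        for (long[] r : split(Long.MIN_VALUE, Long.MAX_VALUE, 4)) {
            // Each pair would be bound as the two '?' values of one scan.
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

Note that with a strict '<' upper bound, a row whose token equals the very last bound would be skipped, so the final sub-range needs separate handling in a real scan.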

This made the Cassandra nodes in the local data center hosting the
application quite sluggish for other queries. Looking at the code (and its
comments say the same), StorageProxy.fetchRows blocks while waiting for read
responses, even when the response comes from a remote node.

Based on the behavior I observed, and the impact on other queries, I
suspected the quorum reads were blocking the read stage executor pool of
the coordinator nodes.

If I've drawn the correct conclusions, why does the read stage block on
reads from other nodes, especially nodes in remote data centers where latency
is not small, rather than processing read replies asynchronously and
freeing up the read stage threads?
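
To make the question concrete, here is a minimal Java sketch of the two models. It is not Cassandra code: the class, the simulated "network" scheduler, and the pool sizes are all assumptions made for illustration. The blocking variant holds a stage thread for the whole round trip, while the asynchronous variant only uses a stage thread to send the request and to process the reply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Illustration only; none of these names exist in Cassandra.
class BlockingVsAsyncSketch {

    // Simulated network: replies arrive after a delay without using a stage thread.
    static final ScheduledExecutorService NETWORK =
        Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // do not keep the JVM alive
            return t;
        });

    static CompletableFuture<String> remoteRead(long latencyMs) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        NETWORK.schedule(() -> reply.complete("row"), latencyMs, TimeUnit.MILLISECONDS);
        return reply;
    }

    // Blocking model: each request occupies a stage thread until its reply arrives.
    static long runBlocking(int stageThreads, int requests, long latencyMs) throws Exception {
        ExecutorService stage = Executors.newFixedThreadPool(stageThreads);
        long start = System.nanoTime();
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            Callable<String> task = () -> remoteRead(latencyMs).get(); // blocks here
            results.add(stage.submit(task));
        }
        for (Future<String> f : results) f.get();
        stage.shutdown();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Asynchronous model: stage threads send the request and later process the
    // reply, but are free while the reply is in flight.
    static long runAsync(int stageThreads, int requests, long latencyMs) throws Exception {
        ExecutorService stage = Executors.newFixedThreadPool(stageThreads);
        long start = System.nanoTime();
        List<CompletableFuture<String>> results = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            results.add(CompletableFuture
                .supplyAsync(() -> "request sent", stage)
                .thenCompose(sent -> remoteRead(latencyMs))
                .thenApplyAsync(row -> row, stage));
        }
        CompletableFuture.allOf(results.toArray(new CompletableFuture[0])).get();
        stage.shutdown();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // 2 stage threads, 8 requests, 50 ms simulated cross-DC latency.
        System.out.println("blocking ms: " + runBlocking(2, 8, 50));
        System.out.println("async ms:    " + runAsync(2, 8, 50));
    }
}
```

In the blocking variant the total time grows with requests divided by pool size, times the latency; in the asynchronous variant the waits overlap, which is the behavior I was hoping the coordinator could have.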

I came across https://issues.apache.org/jira/browse/CASSANDRA-10989 which
seems to target performance improvements in the threading model, which made
me more curious about the above question.

Thoughts and info are greatly appreciated.

Kindly,
Dominic

Re: Blocking behavior of StorageProxy.fetchRows

Posted by Dominic Chevalier <dc...@gmail.com>.

Yes, LOCAL_QUORUM is usually sufficient and I use this consistency level
often, but at times a stronger consistency level is desired. Even within a
single DC, blocking in a fixed-size thread pool on multiple network calls
is less than ideal.

If lightweight transactions are being adopted warmly by the user base,
then the need for strongly consistent reads increases.

Operationally, we like running background jobs in low-traffic regions, or
where most of our users are in bed for the night, so being sure we're
getting the most accurate data globally is of very high importance. Some of
these jobs might slip in before other functions like read-repair and
nodetool repair have been able to clean things up.

So my question to the community is: if my assessment of the current threading
during the read stage is correct, are there plans or thoughts on moving to a
more asynchronous model for gathering read responses?

Also, this seems like a blind spot in the current Cassandra metrics (at least
in my 2.0.x world): we do not have metrics on how long queries are
queued before being processed. The recent read latency metric continues
to look normal even when overall system latency is high.
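
As a rough idea of the kind of metric I mean, here is a small Java sketch that records how long a task sat in an executor's queue before a thread picked it up. The wrapper and its names are hypothetical and are not an existing Cassandra metric or API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a queue-wait metric; not an existing Cassandra class.
class QueueWaitSketch {

    static final AtomicLong maxQueueWaitNanos = new AtomicLong();

    // Wrap a task so the time between submission and start of execution is recorded.
    static Runnable timed(Runnable task) {
        final long enqueuedAt = System.nanoTime();
        return () -> {
            long waited = System.nanoTime() - enqueuedAt;
            maxQueueWaitNanos.accumulateAndGet(waited, Math::max);
            task.run();
        };
    }

    static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submit `tasks` tasks of `taskMillis` each to a pool of `threads` threads
    // and return the longest time any task spent waiting in the queue.
    static long measureMaxWaitMillis(int threads, int tasks, long taskMillis)
            throws InterruptedException {
        maxQueueWaitNanos.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(timed(() -> sleep(taskMillis)));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return TimeUnit.NANOSECONDS.toMillis(maxQueueWaitNanos.get());
    }

    public static void main(String[] args) throws InterruptedException {
        // One thread, three 40 ms tasks: each task still runs for about 40 ms,
        // but the last one waits in the queue behind the first two.
        System.out.println("max queue wait ms: " + measureMaxWaitMillis(1, 3, 40));
    }
}
```

The per-task execution time stays flat here while the queue wait grows, which matches what I saw: read latency looked normal while clients experienced high latency.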

On Fri, Jan 8, 2016 at 7:03 PM, Russell Bradberry <rb...@gmail.com>
wrote:

> While using LOCAL_QUORUM may be a solution in a lot of use cases, there
> are definitely use cases where reading at full QUORUM is required (think
> finance, medical, military). I think for these types of use cases,
> non-blocking behavior would be an incredible improvement in performance.
> Even for LOCAL_QUORUM it would be a great improvement.
> Simply stating to use LQ makes it seem like this use case is meritless,
> when it surely is not.
> Plus anything that improves performance is, in my opinion, a good idea.
> Whether or not it is worth the development investment is not something I
> can speak on.
>
> On Fri, Jan 8, 2016 at 4:48 PM -0800, "Jonathan Haddad" <jo...@jonhaddad.com>
> wrote:
>
> Use local quorum, don't talk to remote dcs.

Re: Blocking behavior of StorageProxy.fetchRows

Posted by Russell Bradberry <rb...@gmail.com>.

While using LOCAL_QUORUM may be a solution in a lot of use cases, there are definitely use cases where reading at full QUORUM is required (think finance, medical, military). I think for these types of use cases, non-blocking behavior would be an incredible improvement in performance. Even for LOCAL_QUORUM it would be a great improvement.
Simply saying to use LQ makes it seem like this use case is meritless, when it surely is not.
Plus, anything that improves performance is, in my opinion, a good idea.
Whether or not it is worth the development investment is not something I can speak to.

On Fri, Jan 8, 2016 at 4:48 PM -0800, "Jonathan Haddad" <jo...@jonhaddad.com> wrote:

Use local quorum, don't talk to remote dcs.

Re: Blocking behavior of StorageProxy.fetchRows

Posted by Jonathan Haddad <jo...@jonhaddad.com>.

Use local quorum, don't talk to remote dcs.