Posted to user@cassandra.apache.org by Philippe <wa...@gmail.com> on 2011/12/10 17:05:22 UTC

Meaning of values in tpstats

Hello,

Here's an example of tpstats output from one node in my cluster. I only issue
multigetslice reads to counter columns.
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                        27      2166     3565927301         0                 0
MutationStage                     1         1       55802973         0                 0

What exactly does ReadStage.Pending mean?

   1. the number of keys left to query (because I batch)?
   2. or the number of multigetslice requests issued to that node for
   execution?

Same question for MutationStage (mutating counter columns only)

Thanks

Re: Meaning of values in tpstats

Posted by Philippe <wa...@gmail.com>.
>
> Took me a while to figure out that // == "parallel" :)
>
Sorry, that's left over from Math classes :)


> I'm pretty sure (but not entirely, I'd have to check the code) that
> the request is forwarded as one request to the necessary node(s); what
>
Hmm... I hadn't even thought of that forwarding aspect.
So can someone knowledgeable confirm that a multigetslice on counter super
columns (assume CL=QUORUM):

   - gets sent whole to the coordinator
   - the coordinator resends it whole to the RF replicas and waits for
   QUORUM identical responses (and how does that work in a multiget? Does it
   hash the whole multiget or key by key?). Does this use one thread? Which
   JMX counter does it show up in? (See the sketch after this list for what
   I imagine the quorum wait looks like.)
   - each replica runs the multiget in parallel, so a multiget with a lot of
   keys will saturate the ReadStage.Active counter
   - the coordinator sends the result back. Does this use one thread? Which
   JMX counter does it show up in?
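
To make the question concrete, here is the kind of thing I imagine the
coordinator doing for the quorum wait. This is purely a hypothetical sketch on
my part, not Cassandra's actual code; the class and method names are made up:

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    // Hypothetical sketch of a coordinator waiting for a QUORUM of replica
    // responses. Not Cassandra's implementation; names are invented.
    class QuorumWaiter {
        private final CountDownLatch latch;

        QuorumWaiter(int replicationFactor) {
            // QUORUM = RF/2 + 1, e.g. 2 responses when RF = 3
            this.latch = new CountDownLatch(replicationFactor / 2 + 1);
        }

        // Called once per replica response, e.g. from a message-handling thread.
        void onResponse() {
            latch.countDown();
        }

        // The coordinator blocks here until enough replicas have answered.
        void awaitQuorum(long timeoutMs) throws InterruptedException, TimeoutException {
            if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("quorum not reached within " + timeoutMs + " ms");
            }
        }
    }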



> I was saying rather was that the individual gets get queued up as
> individual tasks to be executed internally in the different stages.
> That does lead to parallelism locally on the node (subject to the
> concurrent reader setting).
>
OK. So in my case, the average ping is 0.2 ms and stable, while an average
multigetslice takes 40 ms for 512 keys at a time. So network time accounts for
about 0.5% of query time: I can really lower the batch size without it
hurting too much. And given the better results I've seen on my workload
with smaller batches, I'm going to do just that.
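
For the record, here is the back-of-envelope model behind that 0.5% figure
(just my own rough math: each batch costs roughly one round trip plus a
per-key server cost derived from the 40 ms / 512 keys measurement above; it
ignores server-side queueing and parallelism):

    // Rough model: time(B) ~= ceil(N / B) * (rtt + B * perKey)
    public class BatchEstimate {
        public static void main(String[] args) {
            double rttMs = 0.2;              // average ping measured above
            double perKeyMs = 40.0 / 512;    // ~0.08 ms/key from 40 ms per 512-key multigetslice
            long totalKeys = 5000;

            for (int batch : new int[] {5000, 512, 128, 64}) {
                long batches = (totalKeys + batch - 1) / batch;
                double totalMs = batches * (rttMs + batch * perKeyMs);
                System.out.printf("batch=%4d -> %d requests, ~%.0f ms total%n",
                                  batch, batches, totalMs);
            }
        }
    }

The extra round trips from smaller batches only add a few tenths of a
millisecond each, which is consistent with network time being around 0.5% of
the total.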

 Philippe

Re: Meaning of values in tpstats

Posted by Peter Schuller <pe...@infidyne.com>.
>> With the slicing, I'm not sure off the top of my head. I'm sure
>> someone else can chime in. For e.g. a multi-get, they end up as
>> independent tasks.
>
> So if I multiget 10 keys, they are fetched in //, consolidated by the
> coordinator and then sent back?

Took me a while to figure out that // == "parallel" :)

I'm pretty sure (but not entirely, I'd have to check the code) that
the request is forwarded as one request to the necessary node(s); what
I was saying rather was that the individual gets get queued up as
individual tasks to be executed internally in the different stages.
That does lead to parallelism locally on the node (subject to the
concurrent reader setting).
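
As a rough illustration of what I mean by "queued up as individual tasks"
(this is only an analogy for the stage mechanics, not the actual Cassandra
code), think of each stage as a bounded thread pool: what the worker threads
are running right now is "active", anything waiting behind them is "pending":

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Analogy only: a multiget of N keys turns into N independent read tasks
    // on a pool whose size plays the role of the concurrent reader setting
    // (concurrent_reads in cassandra.yaml, if I remember right).
    class ReadStageSketch {
        // e.g. 32 concurrent readers; tasks beyond that wait in the queue ("pending")
        private final ExecutorService readStage = Executors.newFixedThreadPool(32);

        void submitMultiget(List<String> keys) {
            for (String key : keys) {
                // each key becomes its own task, run when a reader thread frees up
                readStage.submit(() -> readOneRow(key));
            }
        }

        private void readOneRow(String key) {
            // placeholder for the actual local read
        }
    }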


>  Agreed, I followed someone's suggestion some time ago to reduce my batch
> sizes and it has helped tremendously. I'm now doing multigetslices in
> batches of 512 instead of 5000 and I find I no longer have Pendings up so
> high. The most I see now is a couple hundred.

In general, the best balance will depend on the situation. For example,
the benefit of batching increases as the latency to the cluster (and
within it) increases, and the negative effects increase as you have
stricter low-latency demands on other traffic to the cluster.


-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Meaning of values in tpstats

Posted by Philippe <wa...@gmail.com>.
Answer below


> > Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> > ReadStage                        27      2166     3565927301         0                 0
> With the slicing, I'm not sure off the top of my head. I'm sure
> someone else can chime in. For e.g. a multi-get, they end up as
> independent tasks.
>
So if I multiget 10 keys, they are fetched in //, consolidated by the
coordinator and then sent back?
Can anyone confirm this for multigetslice? I want to know whether batching is
really counterproductive.

> Typically, having pending persistently above 0 for ReadStage or
> MutationStage, especially if more than a handful, means that you are
> having a performance issue - either a capacity problem or something
> else - as incoming requests will have to wait to be serviced. The
> most common effect is that you are bottlenecking on I/O and
> ReadStage pending shoots through the roof.

> In general, batching is good - but don't overdo it, especially for
> reads, and especially if you're going to disk for the workload.
>
Agreed, I followed someone's suggestion some time ago to reduce my batch
sizes and it has helped tremendously. I'm now doing multigetslices in
batches of 512 instead of 5000 and I find I no longer have Pendings up so
high. The most I see now is a couple hundred.
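
In case it helps anyone else, the batching itself is trivial on the client
side - something along these lines (a generic sketch; the fetchBatch hook is a
placeholder for whatever multigetslice call your client library provides):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Split a large key list into fixed-size chunks and issue one
    // multigetslice per chunk, merging the results.
    class BatchedMultiget<K, V> {

        interface BatchFetcher<K, V> {
            Map<K, V> fetchBatch(List<K> keys); // one multigetslice round trip
        }

        Map<K, V> fetchAll(List<K> keys, int batchSize, BatchFetcher<K, V> fetcher) {
            Map<K, V> results = new HashMap<K, V>();
            for (int i = 0; i < keys.size(); i += batchSize) {
                List<K> chunk = keys.subList(i, Math.min(i + batchSize, keys.size()));
                results.putAll(fetcher.fetchBatch(chunk));
            }
            return results;
        }
    }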

Re: Meaning of values in tpstats

Posted by Philippe <wa...@gmail.com>.
Wasn't that on the 1.0 branch? I'm still running 0.8.x.

@Peter: investigating a little more before answering. Thanks

2011/12/10 Edward Capriolo <ed...@gmail.com>

> There was a recent patch that fixed an issue where counters were hitting
> the same natural endpoint rather than being randomized across all of them.
>
>
> On Saturday, December 10, 2011, Peter Schuller <
> peter.schuller@infidyne.com> wrote:
> >> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> >> ReadStage                        27      2166     3565927301         0                 0
> >
> > In general, "active" refers to work that is being executed right now,
> > "pending" refers to work that is waiting to be executed (go into
> > "active"), and completed is the cumulative all-time (since node start)
> > count of the number of tasks executed.
> >
> > With the slicing, I'm not sure off the top of my head. I'm sure
> > someone else can chime in. For e.g. a multi-get, they end up as
> > independent tasks.
> >
> > Typically, having pending persistently above 0 for ReadStage or
> > MutationStage, especially if more than a handful, means that you are
> > having a performance issue - either a capacity problem or something
> > else - as incoming requests will have to wait to be serviced. The
> > most common effect is that you are bottlenecking on I/O and
> > ReadStage pending shoots through the roof.
> >
> > There are exceptions. If you e.g. submit a really large multi-get of
> > 5000, that will naturally lead to a spike (and if all 5000 of them
> > need to go down to disk, the spike will survive for a bit). If you are
> > ONLY doing these queries, that's not a problem per se. But if you are
> > also expecting other requests to have low latency, then you want to
> > avoid it.
> >
> > In general, batching is good - but don't overdo it, especially for
> > reads, and especially if you're going to disk for the workload.
> >
> > --
> > / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
> >
>

Re: Meaning of values in tpstats

Posted by Edward Capriolo <ed...@gmail.com>.
There was a recent patch that fixed an issue where counters were hitting
the same natural endpoint rather than being randomized across all of them.

On Saturday, December 10, 2011, Peter Schuller <pe...@infidyne.com>
wrote:
>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>> ReadStage                        27      2166     3565927301         0                 0
>
> In general, "active" refers to work that is being executed right now,
> "pending" refers to work that is waiting to be executed (go into
> "active"), and completed is the cumulative all-time (since node start)
> count of the number of tasks executed.
>
> With the slicing, I'm not sure off the top of my head. I'm sure
> someone else can chime in. For e.g. a multi-get, they end up as
> independent tasks.
>
> Typically, having pending persistently above 0 for ReadStage or
> MutationStage, especially if more than a handful, means that you are
> having a performance issue - either a capacity problem or something
> else - as incoming requests will have to wait to be serviced. The
> most common effect is that you are bottlenecking on I/O and
> ReadStage pending shoots through the roof.
>
> There are exceptions. If you e.g. submit a really large multi-get of
> 5000, that will naturally lead to a spike (and if all 5000 of them
> need to go down to disk, the spike will survive for a bit). If you are
> ONLY doing these queries, that's not a problem per se. But if you are
> also expecting other requests to have low latency, then you want to
> avoid it.
>
> In general, batching is good - but don't overdo it, especially for
> reads, and especially if you're going to disk for the workload.
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>

Re: Meaning of values in tpstats

Posted by Peter Schuller <pe...@infidyne.com>.
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                        27      2166     3565927301         0                 0

In general, "active" refers to work that is being executed right now,
"pending" refers to work that is waiting to be executed (go into
"active"), and completed is the cumulative all-time (since node start)
count of the number of tasks executed.
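
If you want to watch these counters programmatically rather than via nodetool
tpstats, the stage pools are also exposed over JMX. Something like the sketch
below should work, but I'm writing the MBean and attribute names from memory
(0.8-era), so treat them as assumptions to verify with jconsole on your node:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Reads the ReadStage counters over JMX. MBean/attribute names are from
    // memory - verify them against your own Cassandra version.
    public class TpStatsReader {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi"); // default JMX port
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                ObjectName readStage = new ObjectName("org.apache.cassandra.request:type=ReadStage");
                System.out.println("Active:    " + mbs.getAttribute(readStage, "ActiveCount"));
                System.out.println("Pending:   " + mbs.getAttribute(readStage, "PendingTasks"));
                System.out.println("Completed: " + mbs.getAttribute(readStage, "CompletedTasks"));
            } finally {
                connector.close();
            }
        }
    }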

With the slicing, I'm not sure off the top of my head. I'm sure
someone else can chime in. For e.g. a multi-get, they end up as
independent tasks.

Typically, having pending persistently above 0 for ReadStage or
MutationStage, especially if more than a handful, means that you are
having a performance issue - either a capacity problem or something
else - as incoming requests will have to wait to be serviced. The
most common effect is that you are bottlenecking on I/O and
ReadStage pending shoots through the roof.

There are exceptions. If you e.g. submit a really large multi-get of
5000, that will naturally lead to a spike (and if all 5000 of them
need to go down to disk, the spike will survive for a bit). If you are
ONLY doing these queries, that's not a problem per se. But if you are
also expecting other requests to have low latency, then you want to
avoid it.

In general, batching is good - but don't overdo it, especially for
reads, and especially if you're going to disk for the workload.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)