Posted to user@cassandra.apache.org by David Dabbs <dm...@gmail.com> on 2011/02/03 06:35:24 UTC

Tracking down read latency

Hello.

We’re encountering some high read latency issues. Our main Cassandra expert
is out of the office, so it falls to me.
We're more read-heavy than write-heavy, and there don't seem to be many
pending reads. I have seen the row-read active/pending counts at three or
four, though.

Pool Name                    Active   Pending      Completed
FILEUTILS-DELETE-POOL             0         0             46
STREAM-STAGE                      0         0              0
RESPONSE-STAGE                    0         0       17471880
ROW-READ-STAGE                    1         1       37652361
LB-OPERATIONS                     0         0              0
MISCELLANEOUS-POOL                0         0              0
GMFD                              0         0         154630
LB-TARGET                         0         0              0
CONSISTENCY-MANAGER               0         0        2993464
ROW-MUTATION-STAGE                0         0       16383305
MESSAGE-STREAMING-POOL            0         0              0
LOAD-BALANCER-STAGE               0         0              0
FLUSH-SORTER-POOL                 0         0              0
MEMTABLE-POST-FLUSHER             0         0            116
FLUSH-WRITER-POOL                 0         0            116
AE-SERVICE-STAGE                  0         0              0
HINTED-HANDOFF-POOL               0         0             16
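
For reference, the table above is from nodetool tpstats; to catch transient
backlog rather than a single snapshot, it can be polled on an interval
(localhost here is an assumption, and you may need -p if your JMX port isn't
the default):

$ watch -n 5 'nodetool -h localhost tpstats'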


Does the high IOPS on our data drive mean we need to tune the key cache
or other caches?
 
$ iostat
Linux 2.6.18-194.11.3.el5 02/03/2011

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.00    0.00    0.25    1.32    0.00   97.43

Device:            tps   Blk_read/s   Blk_wrtn/s     Blk_read    Blk_wrtn
sda               1.52         6.66        20.53     88886262   273904650
sda1              0.00         0.00         0.00         2484          18
sda2              1.46         5.68        18.95     75741946   252831120
sda3              0.06         0.98         1.58     13141400    21073512
# data here
sdb             103.06     13964.72      2718.28 186315436859 36266884800
sdb1            103.06     13964.72      2718.28 186315436235 36266884800
# commit logs here
sdc               1.47         1.71       309.36     22800725  4127423000
sdc1              1.47         1.71       309.36     22799901  4127423000
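
On the cache question above: a minimal check, assuming the node is reachable
locally, is nodetool cfstats, whose per-column-family output includes key/row
cache hit rates and read latencies:

$ nodetool -h localhost cfstats

If the key cache hit rate is low for the hot column families, that would point
at cache sizing rather than raw disk speed.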



We're running on a beefy 64-bit Nehalem box, so mmap should be available.
I need to check with our Cassandra lead, when he's back, as to why we're
not using mmap or auto.

From /opt/cassandra/conf/storage-conf.xml:

    <DiskAccessMode>standard</DiskAccessMode>
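
For reference (these are the values I recall from the 0.6-era
storage-conf.xml, so treat them as an assumption to verify against our
version), the alternatives would be:

    <!-- hypothetical alternatives; "auto" is supposed to pick mmap on 64-bit JVMs -->
    <DiskAccessMode>auto</DiskAccessMode>
    <DiskAccessMode>mmap</DiskAccessMode>
    <DiskAccessMode>mmap_index_only</DiskAccessMode>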



Heap size is 16gb.

JVM_OPTS=" \
        -ea \
        -Xms16G \
        -Xmx16G \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:SurvivorRatio=8 \
        -XX:MaxTenuringThreshold=1 \
        -XX:CMSInitiatingOccupancyFraction=75 \
        -XX:+UseCMSInitiatingOccupancyOnly \
        -XX:+HeapDumpOnOutOfMemoryError \
        -XX:+UseCompressedOops \
        -XX:+UseThreadPriorities \
        -XX:ThreadPriorityPolicy=42 \
        -Dcassandra.compaction.priority=1"
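
If GC turns out to be a factor, here is a minimal sketch of what I'd append to
JVM_OPTS to get pause visibility (the log path is just a placeholder):

        -verbose:gc \
        -XX:+PrintGCDetails \
        -XX:+PrintGCTimeStamps \
        -XX:+PrintGCApplicationStoppedTime \
        -Xloggc:/var/log/cassandra/gc.log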




If I've omitted any key info, please advise and I'll provide it.


Thanks,

David




Re: Tracking down read latency

Posted by Robert Coli <rc...@digg.com>.
On Wed, Feb 2, 2011 at 9:35 PM, David Dabbs <dm...@gmail.com> wrote:
> We’re encountering some high read latency issues.

What is reporting high read latency?

> We're more read-heavy than write-heavy, and there don't seem to be many
> pending reads. I have seen the row-read active/pending counts at three or
> four, though.

In general, if you were I/O bound on reads (the most common
pathological case), you would see a much higher row-read stage pending count.

> [ sane looking tpstats ]

Your tpstats output does not look like that of a node which is struggling.

> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.00    0.00    0.25    1.32    0.00   97.43

Your system also doesn't seem to be breaking a sweat.

> We're running on a beefy 64-bit Nehalem box, so mmap should be available.
> I need to check with our Cassandra lead, when he's back, as to why we're
> not using mmap or auto.

Probably because of:

https://issues.apache.org/jira/browse/CASSANDRA-1214

> Heap size is 16gb.

16gb out of how much total?

Do your GC logs seem to indicate reasonable GC performance? Do all
nodes generally have a complete view of the ring, and do all nodes
generally seem to be up?
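
A minimal way to check the ring view on each node (host is an assumption; add
-p if the JMX port isn't the default):

$ nodetool -h localhost ring

Each node should report the same set of tokens, with every node shown as Up.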

=Rob

Re: Tracking down read latency

Posted by sridhar basam <sr...@basam.org>.
On Fri, Feb 4, 2011 at 2:44 PM, David Dabbs <dm...@gmail.com> wrote:

>
> Our data is on sdb, commit logs on sdc.
> So do I read this correctly that we're 'await'ing 6+ ms on average for
> data drive (sdb) requests to be serviced?
>
>
That is right. Those numbers look pretty good for rotational media. What
sort of read latencies do you see? Have you also looked into GC?
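
If GC logging is already enabled, a rough first pass (the log path is an
assumption) is to scan for stop-the-world events:

$ grep -E 'Full GC|concurrent mode failure|promotion failed' gc.log

With CMS, 'concurrent mode failure' or 'promotion failed' entries usually
coincide with multi-second pauses.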

 Sridhar

RE: Tracking down read latency

Posted by David Dabbs <dm...@gmail.com>.
Thank you both for your advice. See my updated iostats below.


>From: sridhar.basam@gmail.com [mailto:sridhar.basam@gmail.com] On Behalf Of
>sridhar basam
>Sent: Thursday, February 03, 2011 10:58 AM
>To: user@cassandra.apache.org
>Subject: Re: Tracking down read latency
>
>The data provided is also an average value since boot time. Run the -x as
>suggested below, but run it with an interval of around 5 seconds. You very
>well could be having an I/O issue; it is hard to tell from the overall
>average value you provided. Collect "iostat -x 5" during the times when you
>see slow reads and see how busy the disks are.
>
>Sridhar

>On Thu, Feb 3, 2011 at 3:21 AM, Peter Schuller wrote:
>> $ iostat
>
>As rcoli already mentioned, you don't seem to have an I/O problem, but
>as a point of general recommendation: When determining whether you are
>blocking on disk I/O, pretty much *always* use "iostat -x" rather than
>the much less useful default mode of iostat. The %util and queue
>wait/average time columns are massively useful/important; without them
>one is much more blind as to whether or not storage devices are
>actually saturated.
>
> Peter Schuller


Our data is on sdb, commit logs on sdc.
So do I read this correctly that we're 'await'ing 6+ ms on average for
data drive (sdb) requests to be serviced?


$ iostat -x 5

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.59    0.00    0.22    0.94    0.00   98.25

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb              11.20     0.00 42.00  0.00  4993.60     0.00   118.90     0.28    6.77   5.22  21.92
sdb1             11.20     0.00 42.00  0.00  4993.60     0.00   118.90     0.28    6.77   5.22  21.92
sdc               0.00    31.00  0.00  1.40     0.00   259.20   185.14     0.00    0.14   0.14   0.02
sdc1              0.00    31.00  0.00  1.40     0.00   259.20   185.14     0.00    0.14   0.14   0.02

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.56    0.00    0.18    1.08    0.00   98.17

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               8.80     0.00 49.40  0.00  5936.00     0.00   120.16     0.33    6.62   5.22  25.78
sdb1              8.80     0.00 49.40  0.00  5936.00     0.00   120.16     0.33    6.62   5.22  25.78
sdc               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.99    0.00    0.22    1.08    0.00   97.71

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb              11.40     0.00 46.20  0.00  5147.20     0.00   111.41     0.30    6.55   5.58  25.80
sdb1             11.40     0.00 46.20  0.00  5147.20     0.00   111.41     0.30    6.55   5.58  25.80
sdc               0.00     7.40  0.00  0.80     0.00    65.60    82.00     0.00    0.25   0.25   0.02
sdc1              0.00     7.40  0.00  0.80     0.00    65.60    82.00     0.00    0.25   0.25   0.02

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.68    0.00    0.23    0.95    0.00   98.13

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.80  0.00  0.80     0.00    12.77    16.00     0.00    0.25   0.25   0.02
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.80  0.00  0.80     0.00    12.77    16.00     0.00    0.25   0.25   0.02
sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               5.19     0.00 38.12  0.00  4356.09     0.00   114.26     0.26    6.70   5.91  22.53
sdb1              5.19     0.00 38.12  0.00  4356.09     0.00   114.26     0.26    6.70   5.91  22.53
sdc               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
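
If it's useful, I can leave a one-liner running to log just the data drive's
await/%util over time (this assumes gawk, which provides strftime):

$ iostat -x 5 | awk '/^sdb /{print strftime("%T"), "await=" $10 " ms", "util=" $12 "%"}'
# note: the first report printed is still the since-boot average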




Re: Tracking down read latency

Posted by sridhar basam <sr...@basam.org>.
The data provided is also an average value since boot time. Run the -x as
suggested below, but run it with an interval of around 5 seconds. You very
well could be having an I/O issue; it is hard to tell from the overall
average value you provided. Collect "iostat -x 5" during the times when you
see slow reads and see how busy the disks are.


 Sridhar

On Thu, Feb 3, 2011 at 3:21 AM, Peter Schuller
<pe...@infidyne.com> wrote:

> > $ iostat
>
> As rcoli already mentioned, you don't seem to have an I/O problem, but
> as a point of general recommendation: When determining whether you are
> blocking on disk I/O, pretty much *always* use "iostat -x" rather than
> the much less useful default mode of iostat. The %util and queue
> wait/average time columns are massively useful/important; without them
> one is much more blind as to whether or not storage devices are
> actually saturated.
>
> --
> / Peter Schuller
>

Re: Tracking down read latency

Posted by Peter Schuller <pe...@infidyne.com>.
> $ iostat

As rcoli already mentioned, you don't seem to have an I/O problem, but
as a point of general recommendation: When determining whether you are
blocking on disk I/O, pretty much *always* use "iostat -x" rather than
the much less useful default mode of iostat. The %util and queue
wait/average time columns are massively useful/important; without them
one is much more blind as to whether or not storage devices are
actually saturated.

-- 
/ Peter Schuller