Posted to user@cassandra.apache.org by Octavian Rinciog <oc...@gmail.com> on 2018/01/17 17:40:58 UTC
High read rate on hard-disk
Hello!
I am using Cassandra 3.10 on Ubuntu 14.04 and I have a counter
table (RF=1) with the following schema:
CREATE TABLE edges (
    src_id text,
    src_type text,
    source text,
    weight counter,
    PRIMARY KEY ((src_id, src_type), source)
) WITH compaction = {
    'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
    'max_threshold': '32',
    'min_threshold': '4'
};
The ratio of SELECT to UPDATE requests is about 0.001 (Read Count:
3771000, Write Count: 3401236000, over one month).
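As a quick sanity check on these figures, here is a minimal sketch. It assumes that "one month" means 30 days and that traffic is spread uniformly; neither is stated in the message.

```python
# Request counts quoted in the message above.
read_count = 3_771_000
write_count = 3_401_236_000

# Assumption: "one month" is taken as 30 days of uniform traffic.
seconds_per_month = 30 * 24 * 60 * 60

ratio = read_count / write_count
updates_per_second = write_count / seconds_per_month
selects_per_second = read_count / seconds_per_month

print(f"SELECT/UPDATE ratio: {ratio:.4f}")
print(f"average UPDATEs per second: {updates_per_second:.0f}")
print(f"average SELECTs per second: {selects_per_second:.2f}")
```

Under those assumptions the node handles on the order of a thousand counter updates per second and only about one or two SELECTs per second, so client reads alone are unlikely to account for much disk traffic.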
We have Counter Cache enabled:
Counter Cache : entries 1018782, size 256 MiB, capacity 256 MiB, 2799913189 hits, 3469459479 requests, 0.807 recent hit rate, 7200 save period in seconds
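The lifetime hit rate can be derived from the hit and request counters on that line. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
# Counters copied from the "Counter Cache" line above.
hits = 2_799_913_189
requests = 3_469_459_479

misses = requests - hits
lifetime_hit_rate = hits / requests
miss_rate = 1 - lifetime_hit_rate

print(f"misses: {misses}")
print(f"lifetime hit rate: {lifetime_hit_rate:.3f}")
print(f"miss rate: {miss_rate:.3f}")
```

The lifetime figure agrees with the reported recent hit rate of 0.807, which means that roughly one counter cache lookup in five is not served from the cache. The cache is also full (size equals capacity at 256 MiB).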
The problem is that the read rate on our hard disk is constantly near
30 MB/s, while the write rate is only around 500 KB/s.
One sample of "iostat -x" output:
Device: rrqm/s wrqm/s    r/s  w/s    rkB/s  wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb       0.06   1.04 263.65 2.04 28832.42 572.53   146.07     0.36  1.35    0.74   81.16  1.27 33.81
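Two things can be read off this sample with simple arithmetic. The sketch below computes the average size of a read request, and compares the observed reads per second with a rough estimate of counter cache misses per second. The estimate assumes a 30-day month, a uniform update rate, and the 0.807 hit rate quoted earlier; treating one cache miss as roughly one disk read is also an assumption, not something this thread confirms.

```python
# Column names and values copied from the iostat sample above.
header = ("Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s "
          "avgrq-sz avgqu-sz await r_await w_await svctm %util").split()
values = ("sdb 0.06 1.04 263.65 2.04 28832.42 572.53 "
          "146.07 0.36 1.35 0.74 81.16 1.27 33.81").split()

# Skip the device name and convert the numeric columns to floats.
row = dict(zip(header[1:], map(float, values[1:])))

# Average amount of data fetched per read request, in kB.
avg_read_kb = row["rkB/s"] / row["r/s"]

# Figures quoted earlier in the message.
write_count = 3_401_236_000
cache_hit_rate = 0.807
seconds_per_month = 30 * 24 * 60 * 60  # assumption: 30-day month

updates_per_s = write_count / seconds_per_month
est_cache_misses_per_s = updates_per_s * (1 - cache_hit_rate)

print(f"average read request size: {avg_read_kb:.1f} kB")
print(f"estimated cache misses/s: {est_cache_misses_per_s:.0f}")
print(f"observed reads/s: {row['r/s']}")
```

Under these assumptions the estimated number of cache misses per second is of the same order as the observed r/s, and each read request moves on the order of 100 kB, far more than a single counter value. That would be consistent with the reads coming from the update path rather than from SELECT queries, although the sample alone does not prove it.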
Also, with iotop we saw about 8 threads, each reading at around
3 MB/s.
Total DISK READ : 22.73 M/s | Total DISK WRITE : 494.35 K/s
Actual DISK READ: 22.62 M/s | Actual DISK WRITE: 528.57 K/s
  TID  PRIO  USER       DISK READ>  DISK WRITE  SWAPIN       IO  COMMAND
14793  be/4  cassandra   3.061 M/s  0.0010 B/s  0.00 %  93.27 %  java -Dcassandra.fd_max_interval_ms=400
The strace output for one of these threads is:
strace -cp 14793
Process 14793 attached
^CProcess 14793 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.85   32.118518          57    567288    256251 futex
  0.15    0.048822           3     15339           write
  0.00    0.000000           0         1           rt_sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00   32.167340                582628    256251 total
Although iotop shows this thread reading at 3 MB/s, there is no read
syscall in the strace output.
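One possible explanation, which this thread does not confirm, is memory-mapped I/O. Cassandra can access SSTable files through mmap (commonly the default on 64-bit systems, but the configuration of this node is not given). With a mapped file, the kernel loads data on page faults, so disk reads are charged to the faulting thread without any read syscall showing up in strace. The sketch below illustrates the mechanism in Python on a scratch file; it is not Cassandra code.

```python
import mmap
import os
import tempfile

# Create a small scratch file standing in for a data file on disk.
payload = b"counter-shard-" * 1024
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    with open(path, "rb") as f:
        # Map the whole file; from here on, access is by plain memory loads.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing the mapping issues no read()/pread() syscall. If a page
            # is not in the page cache, the kernel fetches it from disk while
            # handling the page fault.
            first = mapped[:14]
            last = mapped[-14:]
finally:
    os.unlink(path)

print(first, last)
```

Running a script like this under "strace -c" should show mmap and munmap calls for the file access but no read calls for the sliced bytes, which is the same pattern as in the output above.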
Is the futex actually responsible for the read rate, and how can we
debug this problem further?
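One way to investigate further, offered as a sketch rather than a confirmed diagnosis, is to watch the major page fault counter of the busy thread in /proc/<pid>/task/<tid>/stat. A major fault is one that had to be served from disk, so a counter that climbs quickly on thread 14793 would point at page-fault-driven reads. The helper names and the sample line below are hypothetical.

```python
from pathlib import Path
from typing import Tuple


def parse_fault_counts(stat_line: str) -> Tuple[int, int]:
    """Return (minflt, majflt) from one /proc/<pid>/task/<tid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ')' before tokenizing the remainder.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); minflt is field 10 and majflt is field 12.
    return int(rest[7]), int(rest[9])


def read_fault_counts(pid: int, tid: int) -> Tuple[int, int]:
    """Read the fault counters of one thread of a live process (Linux only)."""
    stat_path = Path(f"/proc/{pid}/task/{tid}/stat")
    return parse_fault_counts(stat_path.read_text())


# Hypothetical sample line; the values are invented for illustration only.
sample = "14793 (java) S 1 14700 14700 0 -1 4194304 5000 0 1234 0 10 20 0 0 20 0 90 0"
print(parse_fault_counts(sample))
```

In practice one would call read_fault_counts twice, a few seconds apart, with the JVM process id and the thread id reported by iotop, and compare the two majflt values.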
By the way, no compaction tasks and no SELECT queries are in
progress.
Also, I know that a lock is obtained for each update [1].
Thank you,
[1] https://apache.googlesource.com/cassandra/+/refs/heads/trunk/src/java/org/apache/cassandra/db/CounterMutation.java#121
--
Octavian Rinciog
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org
Re: High read rate on hard-disk
Posted by Octavian Rinciog <oc...@gmail.com>.
Hi Alain,
Thank you for your response.
> - Other than the 'lock', Counters perform an implicit read before the write
> operation.
As far as I know, there is a counter cache [1] that is used to read
the old values of the counters. According to [2], it is used only for
UPDATE requests.
> I would say what you are seeing is expected with this use case. Also, I have
> never seen a use case where using RF = 1 is a good idea (except for some
> testing, maybe). Be aware that this data is not replicated and can easily be
> lost (if it's a deliberate choice, ignore my comment). On the bright side, you
> have no entropy / consistency issues or need for repairs with RF = 1 :D.
Yes, RF=1 is indeed our choice (basically because we did not manage
to scale the counter writes very well, and we accepted that we can
lose some data).
[1] https://apache.googlesource.com/cassandra/+/refs/heads/trunk/src/java/org/apache/cassandra/db/CounterMutation.java#193
[2] https://issues.apache.org/jira/browse/CASSANDRA-12500?focusedCommentId=15464023&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15464023
2018-01-18 12:51 GMT+02:00 Alain RODRIGUEZ <ar...@gmail.com>:
> Hello Octavian,
>
>>
>> I have a counter table(RF=1)
>>
>> SELECT vs UPDATE requests ratio is 0.001. ( Read Count: 3771000, Write
>> Count: 3401236000, in one month)
>
>
>> The problem is that our read rate limit on our hard-disk is always near
>> 30MBps and our write rate limit is near 500KBps.
>
>
> I did not read all your numbers, but here are the internal details you could
> be missing:
>
> - Other than the 'lock', counters perform an implicit read before the write
> operation: to increment a value, you need to know its previous value. This was
> true the last time I used them; I believe there is no real workaround and it
> is still the case today.
> - Writes do not hit the disk synchronously. Instead, they are stored in the
> memtable and flushed only once, sequentially and efficiently. Compactions then
> merge partitions afterwards, asynchronously.
>
> I would say what you are seeing is expected with this use case. Also, I have
> never seen a use case where using RF = 1 is a good idea (except for some
> testing, maybe). Be aware that this data is not replicated and can easily be
> lost (if it's a deliberate choice, ignore my comment). On the bright side, you
> have no entropy / consistency issues or need for repairs with RF = 1 :D.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - alain@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
--
Octavian Rinciog
Re: High read rate on hard-disk
Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hello Octavian,
> I have a counter table(RF=1)
>
> SELECT vs UPDATE requests ratio is 0.001. ( Read Count: 3771000, Write
> Count: 3401236000, in one month)
>
> The problem is that our read rate limit on our hard-disk is always near
> 30MBps and our write rate limit is near 500KBps.
I did not read all your numbers, but here are the internal details you
could be missing:
- Other than the 'lock', counters perform an implicit read before the write
operation: to increment a value, you need to know its previous value. This
was true the last time I used them; I believe there is no real workaround
and it is still the case today.
- Writes do not hit the disk synchronously. Instead, they are stored in the
memtable and flushed only once, sequentially and efficiently. Compactions
then merge partitions afterwards, asynchronously.
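The read-before-write behaviour described in the first point can be illustrated with a toy model. This is a deliberately simplified sketch and not Cassandra's actual implementation: the class name, the no-eviction cache policy, and the dictionary standing in for on-disk data are all assumptions made for illustration.

```python
class ToyCounterTable:
    """Toy model of a counter update path; not Cassandra's real code."""

    def __init__(self, cache_capacity: int) -> None:
        self.cache_capacity = cache_capacity
        self.cache = {}      # small in-memory cache of current values
        self.on_disk = {}    # stands in for values stored in SSTables
        self.disk_reads = 0  # reads caused purely by updates

    def increment(self, key: str, delta: int = 1) -> int:
        # Read-before-write: the current value is needed to apply the delta.
        if key in self.cache:
            current = self.cache[key]
        else:
            self.disk_reads += 1  # cache miss, so the value comes from disk
            current = self.on_disk.get(key, 0)
        new_value = current + delta
        # Real writes are buffered in memory and flushed later; the toy
        # model stores them directly.
        self.on_disk[key] = new_value
        # Toy policy: cache a key only while there is room, never evict.
        if key in self.cache or len(self.cache) < self.cache_capacity:
            self.cache[key] = new_value
        return new_value


table = ToyCounterTable(cache_capacity=2)
for key in ["a", "b", "c", "a", "c", "c"]:
    table.increment(key)

print(table.disk_reads)
```

No SELECT is ever issued in this run, yet every update to a key that is not cached triggers a read. With a cache that is too small for the set of active counters, an update-only workload can therefore generate steady read traffic.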
I would say what you are seeing is expected with this use case. Also, I
have never seen a use case where using RF = 1 is a good idea (except for
some testing, maybe). Be aware that this data is not replicated and can
easily be lost (if it's a deliberate choice, ignore my comment). On the
bright side, you have no entropy / consistency issues or need for repairs
with RF = 1 :D.
C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain
The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com