Posted to user@cassandra.apache.org by 曹志富 <ca...@gmail.com> on 2015/10/23 13:12:22 UTC

terrible read/write latency fluctuation

Some nodes seem normal, but some nodes don't. Here are the metrics from
OpsCenter:
[image: inline image 1]

[image: inline image 2]

[image: inline image 3]


--------------------------------------
Ranger Tsao

Re: terrible read/write latency fluctuation

Posted by 曹志富 <ca...@gmail.com>.
Thanks to all of you.

--------------------------------------
Ranger Tsao


Re: terrible read/write latency fluctuation

Posted by Anishek Agarwal <an...@gmail.com>.
If it's some sort of time series, DTCS might turn out to be better for
compaction. Also, some disk monitoring might help to understand whether the disk
is the bottleneck.
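
If it helps, a minimal sketch of that change for the k1.c1 table described elsewhere
in this thread (the option values below are only illustrative, not tuned recommendations):

-- Hypothetical example: switch k1.c1 to DateTieredCompactionStrategy (DTCS),
-- which groups SSTables by write time so whole SSTables can be dropped once
-- the 60-day TTL has expired.
ALTER TABLE k1.c1
    WITH compaction = {'class': 'DateTieredCompactionStrategy',
                       'base_time_seconds': '3600',
                       'max_sstable_age_days': '60',
                       'min_threshold': '4',
                       'max_threshold': '32'};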


Re: terrible read/write latency fluctuation

Posted by 曹志富 <ca...@gmail.com>.
I will try to trace a read that takes > 20 msec.

Just HDD. No deletes, just a 60-day TTL. The value size is small; the max length is 140.

My data is like a time series: 90% of reads are for data with a timestamp < 7 days.
The data is almost all inserts, with a few updates.
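
For reference, a minimal cqlsh sketch of how such a trace can be captured (the table
and key are just the example values from the trace elsewhere in this thread):

TRACING ON;
-- run the suspect query; cqlsh then prints a step-by-step trace like the one
-- shown elsewhere in this thread
SELECT * FROM weibo.content1 WHERE key = 'tw650663253366837248';
TRACING OFF;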

Re: terrible read/write latency fluctuation

Posted by Dor Laor <do...@scylladb.com>.
This read request took ~4 ms, which isn't too bad.
Do you have a trace of the reads that take > 20 msec?

It's also worth checking how much performance improves after a major compaction.
1 TB per node is relatively high, and if compaction hasn't caught up, you'll need
to read lots of entries.
More questions to ask:
  - Are you using spinning disks or SSDs?
  - What's the deletes ratio?
  - What's the size of the 'value'?


Re: terrible read/write latency fluctuation

Posted by 曹志富 <ca...@gmail.com>.
A nicer-looking trace log:
[image: inline image 1]

--------------------------------------
Ranger Tsao


Re: terrible read/write latency fluctuation

Posted by 曹志富 <ca...@gmail.com>.
The schema is simple:
CREATE TABLE k1.c1 (
    key text,
    column1 text,
    value text,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4',
'unchecked_tombstone_compaction': 'false', 'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32'}
    AND compression = {'sstable_compression':
'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';


One row key has 25 columns.

25 nodes, RF 3, about 1 TB per node.

Without a superuser login, cqlsh can't print the trace log:


Tracing session: 20a7fe00-7ac3-11e5-ad2a-0bbf95bc6f42

 activity | timestamp | source | source_elapsed
------------------------------------------------
 Execute CQL3 query | 2015-10-25 10:50:18.976000 | 172.19.105.9 | 0
 READ message received from /172.19.105.9 [MessagingService-Incoming-/172.19.105.9] | 2015-10-25 10:50:18.977000 | 172.19.105.53 | 50
 Parsing select * from weibo.content1 where key = 'tw650663253366837248'; [SharedPool-Worker-5] | 2015-10-25 10:50:18.977000 | 172.19.105.9 | 75
 Executing single-partition query on users [SharedPool-Worker-1] | 2015-10-25 10:50:18.978000 | 172.19.105.53 | 1179
 Preparing statement [SharedPool-Worker-5] | 2015-10-25 10:50:18.978000 | 172.19.105.9 | 113
 Acquiring sstable references [SharedPool-Worker-1] | 2015-10-25 10:50:18.978000 | 172.19.105.53 | 1218
 reading data from /172.19.105.53 [SharedPool-Worker-5] | 2015-10-25 10:50:18.978000 | 172.19.105.9 | 542
 Sending READ message to /172.19.105.53 [MessagingService-Outgoing-/172.19.105.53] | 2015-10-25 10:50:18.978000 | 172.19.105.9 | 619
 Merging memtable tombstones [SharedPool-Worker-1] | 2015-10-25 10:50:18.979000 | 172.19.105.53 | 1249
 READ message received from /172.19.105.9 [MessagingService-Incoming-/172.19.105.9] | 2015-10-25 10:50:18.979000 | 172.19.105.58 | 37
 speculating read retry on /172.19.105.71 [SharedPool-Worker-5] | 2015-10-25 10:50:18.979000 | 172.19.105.9 | 2794
 Key cache hit for sstable 1 [SharedPool-Worker-1] | 2015-10-25 10:50:18.979000 | 172.19.105.53 | 1291
 Sending READ message to /172.19.105.71 [MessagingService-Outgoing-/172.19.105.71] | 2015-10-25 10:50:18.979000 | 172.19.105.9 | 2862
 Seeking to partition beginning in data file [SharedPool-Worker-1] | 2015-10-25 10:50:18.979000 | 172.19.105.53 | 1301
 REQUEST_RESPONSE message received from /172.19.105.53 [MessagingService-Incoming-/172.19.105.53] | 2015-10-25 10:50:18.979000 | 172.19.105.9 | 3028
 Key cache hit for sstable 2 [SharedPool-Worker-1] | 2015-10-25 10:50:18.980000 | 172.19.105.53 | 1346
 Executing single-partition query on content1 [SharedPool-Worker-2] | 2015-10-25 10:50:18.980000 | 172.19.105.58 | 125
 Processing response from /172.19.105.53 [SharedPool-Worker-1] | 2015-10-25 10:50:18.980000 | 172.19.105.9 | 3081
 Seeking to partition beginning in data file [SharedPool-Worker-1] | 2015-10-25 10:50:18.980000 | 172.19.105.53 | 1356
 Acquiring sstable references [SharedPool-Worker-2] | 2015-10-25 10:50:18.980000 | 172.19.105.58 | 131
 reading data from /172.19.105.58 [SharedPool-Worker-5] | 2015-10-25 10:50:18.980000 | 172.19.105.9 | 3196
 Merging memtable tombstones [SharedPool-Worker-2] | 2015-10-25 10:50:18.980000 | 172.19.105.58 | 140
 Sending READ message to /172.19.105.58 [MessagingService-Outgoing-/172.19.105.58] | 2015-10-25 10:50:18.980000 | 172.19.105.9 | 3252
 Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2015-10-25 10:50:18.981000 | 172.19.105.53 | 1385
 Key cache hit for sstable 66818 [SharedPool-Worker-2] | 2015-10-25 10:50:18.981000 | 172.19.105.58 | 185
 REQUEST_RESPONSE message received from /172.19.105.58 [MessagingService-Incoming-/172.19.105.58] | 2015-10-25 10:50:18.981000 | 172.19.105.9 | 4343
 Merging data from memtables and 2 sstables [SharedPool-Worker-1] | 2015-10-25 10:50:18.981000 | 172.19.105.53 | 1392
 Seeking to partition beginning in data file [SharedPool-Worker-2] | 2015-10-25 10:50:18.981000 | 172.19.105.58 | 189
 Processing response from /172.19.105.58 [SharedPool-Worker-3] | 2015-10-25 10:50:18.981000 | 172.19.105.9 | 4444
 Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2015-10-25 10:50:18.981000 | 172.19.105.53 | 1442
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-10-25 10:50:18.981000 | 172.19.105.58 | 394
 Enqueuing response to /172.19.105.9 [SharedPool-Worker-1] | 2015-10-25 10:50:18.982000 | 172.19.105.53 | 1548
 Merging data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-10-25 10:50:18.982000 | 172.19.105.58 | 406
 Sending REQUEST_RESPONSE message to /172.19.105.9 [MessagingService-Outgoing-/172.19.105.9] | 2015-10-25 10:50:18.982000 | 172.19.105.53 | 1640
 Read 23 live and 0 tombstone cells [SharedPool-Worker-2] | 2015-10-25 10:50:18.982000 | 172.19.105.58 | 438
 Enqueuing response to /172.19.105.9 [SharedPool-Worker-2] | 2015-10-25 10:50:18.982000 | 172.19.105.58 | 509
 Sending REQUEST_RESPONSE message to /172.19.105.9 [MessagingService-Outgoing-/172.19.105.9] | 2015-10-25 10:50:18.983000 | 172.19.105.58 | 557
 Request complete | 2015-10-25 10:50:18.980520 | 172.19.105.9 | 4520




--------------------------------------
Ranger Tsao


Re: terrible read/write latency fluctuation

Posted by Dor Laor <do...@scylladb.com>.
Can you provide more details: the schema, the config (especially the compaction
policy), the amount of data, and ideally a CQL trace of the reads?

Best,
Dor
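
Most of that (apart from the total data volume) can be pulled straight from cqlsh;
a minimal sketch, assuming the weibo.content1 table and key named elsewhere in this thread:

-- Hypothetical sketch: DESCRIBE prints the full schema, including the
-- compaction settings; TRACING ON makes cqlsh print a per-step read trace.
DESCRIBE TABLE weibo.content1;
TRACING ON;
SELECT * FROM weibo.content1 WHERE key = 'tw650663253366837248';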


Re: terrible read/write latency fluctuation

Posted by 曹志富 <ca...@gmail.com>.
Additional info:
25 nodes, 4,000 write requests and 900 read requests across the whole cluster.

--------------------------------------
Ranger Tsao
