Posted to user@cassandra.apache.org by Alain RODRIGUEZ <ar...@gmail.com> on 2015/08/31 15:48:03 UTC

Network / GC / Latency spike

Hi,

Running a 2.0.16 C* on AWS (private VPC, 2 DC).

I am facing an issue on our EU DC where I see network bursts (along with
GC and latency increases).

My first thought was a sudden application burst, but I see no
corresponding change in reads/writes or even CPU.

So I thought this might come from the nodes themselves, as inbound network
almost equals outbound. I tried lowering stream throughput on the whole DC
to 1 Mbps per node: with ~30 nodes --> 30 Mbps --> ~4 MB/s max for the DC.
Yet the network went a lot higher, about 30 M in both directions (see
screenshots attached).
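
(For reference, a rough sketch of how such a cap can be applied and kept
across restarts; host names below are made up:)

    # applied live on each node:
    for h in cass-eu-01 cass-eu-02; do
        ssh "$h" nodetool setstreamthroughput 1     # cap streaming at 1 Mbit/s
    done
    # or persisted in cassandra.yaml on every node:
    #   stream_throughput_outbound_megabits_per_sec: 1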

I have tried to use iftop to see where this traffic is headed, but I was
not able to do it because the bursts are very short.
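
(One way around the short bursts might be a small watcher that only fires
iftop or tcpdump when throughput crosses a threshold. A rough sketch, with
interface name and threshold as assumptions:)

    #!/bin/sh
    # Sample interface counters every second; grab a short capture on a burst.
    IFACE=eth0
    THRESHOLD=$((20 * 1024 * 1024))     # bytes per second considered a "burst"
    total() {
        rx=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
        tx=$(cat /sys/class/net/$IFACE/statistics/tx_bytes)
        echo $((rx + tx))
    }
    prev=$(total)
    while true; do
        sleep 1
        cur=$(total)
        rate=$((cur - prev)); prev=$cur
        if [ "$rate" -gt "$THRESHOLD" ]; then
            iftop -t -s 5 -i "$IFACE" >> "/tmp/burst-$(date +%s).txt" 2>&1
            # or: tcpdump -i "$IFACE" -c 2000 -w "/tmp/burst-$(date +%s).pcap"
        fi
    done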

So, questions are:

- Has anyone experienced something similar already? If so, any clue would
be appreciated :).
- How can I monitor or capture where this large amount of traffic is
headed, or what it is due to?
- Am I right to try to figure out what this traffic is, or should I follow
another lead?

Note: I also noticed that neither CPU nor reads/writes spike, but disk
reads do spike!

C*heers,

Alain

Re: Network / GC / Latency spike

Posted by Otis Gospodnetić <ot...@gmail.com>.
Hi Alain,

Nice charts! ;)  (attachments came through the list).

Since you're using SPM for monitoring Cassandra, you may want to have a
look at https://sematext.atlassian.net/wiki/display/PUBSPM/Network+Map
which I think would have shown which nodes were talking to which nodes and
how much. Don't have a screenshot to share, but it looks a bit like the one
on http://blog.sematext.com/2015/08/06/introducing-appmap/

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Network / GC / Latency spike

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi, just wanted to drop the follow-up here.

I finally figured out that the big data guys were basically hammering the
cluster by reading 2 months of data from one table as fast as possible at
boot time, to cache it. As this table stores 12 MB blobs (Bloom filters),
even though the number of reads was not very high, each row is really big,
so reads + read repairs were putting too much pressure on Cassandra. Those
reads were mixed with much higher workloads, so I was not seeing any burst
in reads, which made this harder to troubleshoot. Local read metrics (from
Sematext / OpsCenter) helped find this out.
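
(For the record, one way such oversized rows can usually be spotted from a
node is nodetool's column family statistics; keyspace/table names below
are made up:)

    # if cfstats does not take a keyspace.table argument on this version,
    # grep the full output instead
    nodetool cfstats ks.blobs | grep -i "compacted row"   # mean / max row size
    nodetool cfhistograms ks blobs                        # row size / column count distribution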

Given the use case (no random reads, write once, no updates) and the data
size of each element, we will basically move this data out of Cassandra to
HDFS or S3 storage. We do not need a database for this kind of job.
Meanwhile, we have simply disabled this feature as it is not critical.

@Fabien, Thank you for your help.

C*heers,

Alain


Re: Network / GC / Latency spike

Posted by Fabien Rousseau <fa...@gmail.com>.
Hi Alain,

Maybe it's possible to confirm this by testing on a small cluster:
- create a cluster of 2 nodes (using https://github.com/pcmanus/ccm for
example)
- create a fake wide row of a few MB (using the python driver for example)
- drain and stop one of the two nodes
- remove the sstables of the stopped node (to provoke inconsistencies)
- start it again
- select a small portion of the wide row (many times, use nodetool tpstats
to know when a read repair has been triggered)
- nodetool flush (on the previously stopped node)
- check the size of the sstable (if a few KB, then only the selected slice
was repaired, but if a few MB then the whole row was repaired)

The wild guess was: if a read repair is triggered when reading a small
portion of a wide row, and it results in streaming the whole wide row, it
could explain a network burst. (But on second thought, it makes more sense
to only repair the small portion being read...)
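
(A rough transcription of those steps, assuming ccm defaults and made-up
keyspace/table names; paths and options may need adjusting:)

    ccm create rrtest -v 2.0.16 -n 2 -s          # 2-node local cluster, started
    cqlsh 127.0.0.1 <<'EOF'
    CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
    CREATE TABLE ks.wide (pk int, ck int, payload blob, PRIMARY KEY (pk, ck))
      WITH read_repair_chance = 1.0;
    EOF
    # ... load a few MB into partition pk=1, e.g. with a short python-driver script ...
    ccm node2 nodetool drain && ccm node2 stop
    rm -rf ~/.ccm/rrtest/node2/data/ks/wide/*    # provoke the inconsistency
    ccm node2 start
    echo "SELECT ck FROM ks.wide WHERE pk = 1 LIMIT 10;" | cqlsh 127.0.0.1   # repeat a few times
    ccm node2 nodetool tpstats | grep -i readrepair   # did a read repair fire?
    ccm node2 nodetool flush
    du -sh ~/.ccm/rrtest/node2/data/ks/wide/     # a few KB => slice repaired, a few MB => whole row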




Re: Network / GC / Latency spike

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Fabien, thanks for your help.

I did not mention it, but I indeed saw a correlation between latency and
read repair spikes. Though this is only going from about 5 read repairs
per second to 10 per second cluster-wide, according to OpsCenter:
http://img42.com/L6gx1

I do have some wide rows, and this explanation looks reasonable to me.
Yet isn't this amount of read repair too low to induce such a "shitstorm"
(even if it spikes x2, the network goes x10)? Also, the wide rows are on
heavily used tables (sadly...), so I should be using more network all the
time. Why only a few spikes per day (2 or 3 max)?

How could I confirm this without removing read repair and waiting a week?
Is there a way to see the size of the data being repaired through this
mechanism?
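
(There does not seem to be a per-repair byte counter exposed in 2.0; a
crude workaround might be sampling read-repair activity and network
counters side by side during a spike, so they can be lined up afterwards.
Interface name below is an assumption:)

    while true; do
        printf '%s %s rx_bytes=%s\n' \
            "$(date +%s)" \
            "$(nodetool tpstats | grep -i readrepair)" \
            "$(cat /sys/class/net/eth0/statistics/rx_bytes)" \
            >> /var/tmp/rr-vs-net.log
        sleep 1
    done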

C*heers

Alain


Re: Network / GC / Latency spike

Posted by Fabien Rousseau <fa...@gmail.com>.
Hi Alain,

Could it be wide rows + read repair? (Let's suppose the read repair
repairs the full row; it may not be subject to the stream throughput limit.)

Best Regards
Fabien


Re: Network / GC / Latency spike

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
I just realised that I have no idea how this mailing list handles
attached files.

Please find screenshots there --> http://img42.com/collection/y2KxS

Alain
