Posted to user@cassandra.apache.org by shimi <sh...@gmail.com> on 2011/04/11 15:28:05 UTC

Read time gets worse during dynamic snitch reset

I finally upgraded from 0.6.x to 0.7.4. The nodes have been running with the
new version for several days across 2 data centers.
I noticed that the read time on some of the nodes increases by 50-60x every
ten minutes.
There is no indication in the logs of anything happening at the same time.
The only thing that I know of that runs every 10 minutes is the dynamic
snitch reset.
So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes, and now I
have the problem once every 20 minutes.

I am running all nodes with:
replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
      strategy_options:
        DC1 : 2
        DC2 : 2
      replication_factor: 4

(DC1 and DC2 are taken from the IPs)
Is anyone familiar with this kind of behavior?

Shimi

Re: Read time gets worse during dynamic snitch reset

Posted by shimi <sh...@gmail.com>.
I finally found some time to get back to this issue.
I turned on DEBUG logging on the StorageProxy, and it shows that all of these
requests are served by the other data center.
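
For anyone who wants to reproduce this: assuming the stock 0.7 logging setup,
that is a one-line change in conf/log4j-server.properties:

    log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG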

Shimi

On Tue, Apr 12, 2011 at 2:31 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Something feels odd.
>
> From Peter's nice write-up of the dynamic snitch
> (http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html), the
> RackInferringSnitch (and the PropertyFileSnitch) derive from
> AbstractNetworkTopologySnitch and should...
> "
> In the case of the NetworkTopologyStrategy, it inherits the
> implementation in AbstractNetworkTopologySnitch which sorts by
> AbstractNetworkTopologySnitch.compareEndPoints(), which:
>
> (1) Always prefers itself to any other node. So "myself" is always
> "closest", no matter what.
> (2) Else, always prefers a node in the same rack, to a node in a different
> rack.
> (3) Else, always prefers a node in the same dc, to a node in a different
> dc.
> "
>
> AFAIK the (data) request should be going to the local DC even after the
> DynamicSnitch has reset the scores, because the underlying
> RackInferringSnitch should prefer local nodes.
>
> Just for fun, check that the rack and DC assignments are what you thought
> using the operations on the o.a.c.db.EndpointSnitchInfo bean in JConsole.
> Pass in the IP address for the nodes in each DC. If possible, can you
> provide some info on the IPs in each DC?
>
> Aaron
>
> On 12 Apr 2011, at 18:24, shimi wrote:
>
> On Tue, Apr 12, 2011 at 12:26 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> The reset interval clears the latencies tracked for each node, so a bad
>> node will be read from again. The scores for each node are then updated
>> every 100ms (by default) using the last 100 responses from each node.
>>
>> How long does the bad performance last for?
>>
> Only a few seconds, but there are a lot of read requests during this
> time.
>
>>
>> What CL are you reading at? At QUORUM with RF 4 the read request will be
>> sent to 3 nodes, ordered by proximity and wellness according to the dynamic
>> snitch. (For background, see the recent discussion on the dynamic snitch:
>> http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)
>>
> I am reading at CL ONE, with read_repair_chance=0.33, the
> RackInferringSnitch, and keys_cached = rows_cached = 0.
>
>>
>> You can take a look at the weights and timings used by the DynamicSnitch
>> in JConsole under o.a.c.db.DynamicEndpointSnitch. Also, at DEBUG log level
>> you will be able to see which nodes the request is sent to.
>>
> Everything looks OK. The weights are around 3 for the nodes in the same
> data center and around 5 for the others. I will turn on DEBUG logging to
> see if I can find more info.
>
>>
>> My guess is the DynamicSnitch is doing the right thing, and the slowdown
>> is a node with a problem getting back into the list of nodes used for your
>> read. It's then moved down the list as its bad performance is noticed.
>>
> Looking at the DynamicSnitch MBean, I don't see any problems with any of
> the nodes. My guess is that during the reset there are reads that are sent
> to the other data center.
>
>>
>> Hope that helps
>> Aaron
>>
>
> Shimi
>
>
>>
>> On 12 Apr 2011, at 01:28, shimi wrote:
>>
>> I finally upgraded from 0.6.x to 0.7.4. The nodes have been running with
>> the new version for several days across 2 data centers.
>> I noticed that the read time on some of the nodes increases by 50-60x
>> every ten minutes.
>> There is no indication in the logs of anything happening at the same time.
>> The only thing that I know of that runs every 10 minutes is the dynamic
>> snitch reset.
>> So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes, and now I
>> have the problem once every 20 minutes.
>>
>> I am running all nodes with:
>> replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>>       strategy_options:
>>         DC1 : 2
>>         DC2 : 2
>>       replication_factor: 4
>>
>> (DC1 and DC2 are taken from the IPs)
>> Is anyone familiar with this kind of behavior?
>>
>> Shimi
>>
>>
>>
>
>

Re: Read time gets worse during dynamic snitch reset

Posted by aaron morton <aa...@thelastpickle.com>.
Something feels odd. 

From Peter's nice write-up of the dynamic snitch (http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html), the RackInferringSnitch (and the PropertyFileSnitch) derive from AbstractNetworkTopologySnitch and should...
"
In the case of the NetworkTopologyStrategy, it inherits the
implementation in AbstractNetworkTopologySnitch which sorts by
AbstractNetworkTopologySnitch.compareEndPoints(), which:

(1) Always prefers itself to any other node. So "myself" is always
"closest", no matter what.
(2) Else, always prefers a node in the same rack, to a node in a different rack.
(3) Else, always prefers a node in the same dc, to a node in a different dc.
"

AFAIK the (data) request should be going to the local DC even after the DynamicSnitch has reset the scores, because the underlying RackInferringSnitch should prefer local nodes.

Just for fun, check that the rack and DC assignments are what you thought using the operations on the o.a.c.db.EndpointSnitchInfo bean in JConsole. Pass in the IP address for the nodes in each DC. If possible, can you provide some info on the IPs in each DC?
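
The same check can also be scripted over JMX, since JConsole is only a front
end for the MBean operations. A minimal sketch; the JMX port (8080 was the
0.7 default) and the node IPs below are placeholder assumptions:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class SnitchCheck
    {
        public static void main(String[] args) throws Exception
        {
            // Connect to a node's JMX endpoint (host and port are assumptions).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName snitchInfo =
                    new ObjectName("org.apache.cassandra.db:type=EndpointSnitchInfo");

            // Ask the snitch where it thinks each node lives
            // (the IPs below are hypothetical).
            for (String ip : new String[] { "10.1.0.1", "10.2.0.1" })
            {
                Object dc = mbs.invoke(snitchInfo, "getDatacenter",
                        new Object[] { ip }, new String[] { "java.lang.String" });
                Object rack = mbs.invoke(snitchInfo, "getRack",
                        new Object[] { ip }, new String[] { "java.lang.String" });
                System.out.println(ip + " -> DC " + dc + ", rack " + rack);
            }
            jmxc.close();
        }
    }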

Aaron
 
On 12 Apr 2011, at 18:24, shimi wrote:

> On Tue, Apr 12, 2011 at 12:26 AM, aaron morton <aa...@thelastpickle.com> wrote:
> The reset interval clears the latencies tracked for each node, so a bad node will be read from again. The scores for each node are then updated every 100ms (by default) using the last 100 responses from each node.
> 
> How long does the bad performance last for?
> Only a few seconds, but there are a lot of read requests during this time.
> 
> What CL are you reading at? At QUORUM with RF 4 the read request will be sent to 3 nodes, ordered by proximity and wellness according to the dynamic snitch. (For background, see the recent discussion on the dynamic snitch: http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)
> I am reading at CL ONE, with read_repair_chance=0.33, the RackInferringSnitch, and keys_cached = rows_cached = 0.
> 
> You can take a look at the weights and timings used by the DynamicSnitch in JConsole under o.a.c.db.DynamicEndpointSnitch. Also, at DEBUG log level you will be able to see which nodes the request is sent to.
> Everything looks OK. The weights are around 3 for the nodes in the same data center and around 5 for the others. I will turn on DEBUG logging to see if I can find more info.
> 
> My guess is the DynamicSnitch is doing the right thing, and the slowdown is a node with a problem getting back into the list of nodes used for your read. It's then moved down the list as its bad performance is noticed.
> Looking at the DynamicSnitch MBean, I don't see any problems with any of the nodes. My guess is that during the reset there are reads that are sent to the other data center.
> 
> Hope that helps
> Aaron
> 
> Shimi
>  
> 
> On 12 Apr 2011, at 01:28, shimi wrote:
> 
>> I finally upgraded from 0.6.x to 0.7.4. The nodes have been running with the new version for several days across 2 data centers.
>> I noticed that the read time on some of the nodes increases by 50-60x every ten minutes.
>> There is no indication in the logs of anything happening at the same time. The only thing that I know of that runs every 10 minutes is the dynamic snitch reset.
>> So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes, and now I have the problem once every 20 minutes.
>> 
>> I am running all nodes with:
>> replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>>       strategy_options:
>>         DC1 : 2
>>         DC2 : 2
>>       replication_factor: 4
>> 
>> (DC1 and DC2 are taken from the IPs)
>> Is anyone familiar with this kind of behavior?
>> 
>> Shimi
>> 
> 
> 


Re: Read time gets worse during dynamic snitch reset

Posted by shimi <sh...@gmail.com>.
On Tue, Apr 12, 2011 at 12:26 AM, aaron morton <aa...@thelastpickle.com> wrote:

> The reset interval clears the latencies tracked for each node, so a bad
> node will be read from again. The scores for each node are then updated
> every 100ms (by default) using the last 100 responses from each node.
>
> How long does the bad performance last for?
>
Only a few seconds, but there are a lot of read requests during this time.

>
> What CL are you reading at? At QUORUM with RF 4 the read request will be
> sent to 3 nodes, ordered by proximity and wellness according to the dynamic
> snitch. (For background, see the recent discussion on the dynamic snitch:
> http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)
>
I am reading at CL ONE, with read_repair_chance=0.33, the RackInferringSnitch,
and keys_cached = rows_cached = 0.
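
For context on the topology side: the RackInferringSnitch infers location
straight from the address octets (second octet = data center, third octet =
rack). A small illustration, with a made-up IP:

    import java.net.InetAddress;

    public class OctetTopology
    {
        public static void main(String[] args) throws Exception
        {
            // 10.20.30.40 is hypothetical; only the 2nd and 3rd octets matter
            byte[] octets = InetAddress.getByName("10.20.30.40").getAddress();
            System.out.println("datacenter = " + (octets[1] & 0xff)); // 20
            System.out.println("rack       = " + (octets[2] & 0xff)); // 30
        }
    }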

>
> You can take a look at the weights and timings used by the DynamicSnitch in
> JConsole under o.a.c.db.DynamicEndpointSnitch. Also, at DEBUG log level you
> will be able to see which nodes the request is sent to.
>
Everything looks OK. The weights are around 3 for the nodes in the same data
center and around 5 for the others. I will turn on DEBUG logging to see if
I can find more info.

>
> My guess is the DynamicSnitch is doing the right thing, and the slowdown is
> a node with a problem getting back into the list of nodes used for your
> read. It's then moved down the list as its bad performance is noticed.
>
Looking at the DynamicSnitch MBean, I don't see any problems with any of the
nodes. My guess is that during the reset there are reads that are sent
to the other data center.

>
> Hope that helps
> Aaron
>

Shimi


>
> On 12 Apr 2011, at 01:28, shimi wrote:
>
> I finally upgraded from 0.6.x to 0.7.4. The nodes have been running with
> the new version for several days across 2 data centers.
> I noticed that the read time on some of the nodes increases by 50-60x
> every ten minutes.
> There is no indication in the logs of anything happening at the same time.
> The only thing that I know of that runs every 10 minutes is the dynamic
> snitch reset.
> So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes, and now I
> have the problem once every 20 minutes.
>
> I am running all nodes with:
> replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>       strategy_options:
>         DC1 : 2
>         DC2 : 2
>       replication_factor: 4
>
> (DC1 and DC2 are taken from the IPs)
> Is anyone familiar with this kind of behavior?
>
> Shimi
>
>
>

Re: Read time gets worse during dynamic snitch reset

Posted by aaron morton <aa...@thelastpickle.com>.
The reset interval clears the latencies tracked for each node, so a bad node will be read from again. The scores for each node are then updated every 100ms (by default) using the last 100 responses from each node.
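
As a toy sketch of that mechanism (illustrative only, not Cassandra's
implementation; all names and structure here are assumptions): keep a window
of the last 100 latency samples per node, recompute scores every 100ms, and
wipe everything on each reset interval.

    import java.net.InetAddress;
    import java.util.ArrayDeque;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class ToyDynamicSnitch
    {
        private static final int WINDOW = 100; // last 100 responses per node
        private final Map<InetAddress, ArrayDeque<Long>> samples = new ConcurrentHashMap<>();
        private final Map<InetAddress, Double> scores = new ConcurrentHashMap<>();

        public ToyDynamicSnitch(long resetIntervalMs)
        {
            ScheduledExecutorService timer = Executors.newScheduledThreadPool(1);
            // recompute scores every 100ms from the sampled latencies
            timer.scheduleWithFixedDelay(this::updateScores, 100, 100, TimeUnit.MILLISECONDS);
            // periodically forget everything, so a "bad" node gets retried
            timer.scheduleWithFixedDelay(this::reset, resetIntervalMs, resetIntervalMs,
                                         TimeUnit.MILLISECONDS);
        }

        // called whenever a response arrives from a replica
        public void receiveTiming(InetAddress host, long latencyMs)
        {
            ArrayDeque<Long> window = samples.computeIfAbsent(host, h -> new ArrayDeque<>());
            synchronized (window)
            {
                window.addLast(latencyMs);
                if (window.size() > WINDOW)
                    window.removeFirst();
            }
        }

        private void updateScores()
        {
            for (Map.Entry<InetAddress, ArrayDeque<Long>> e : samples.entrySet())
            {
                ArrayDeque<Long> window = e.getValue();
                synchronized (window)
                {
                    if (!window.isEmpty())
                        scores.put(e.getKey(), window.stream()
                                                     .mapToLong(Long::longValue)
                                                     .average().getAsDouble());
                }
            }
        }

        private void reset()
        {
            // after this, every node looks equally good until fresh samples arrive
            samples.clear();
            scores.clear();
        }
    }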

How long does the bad performance last for?

What CL are you reading at? At QUORUM with RF 4 the read request will be sent to 3 nodes, ordered by proximity and wellness according to the dynamic snitch. (For background, see the recent discussion on the dynamic snitch: http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)
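
(Quorum here is floor(RF/2) + 1 = floor(4/2) + 1 = 3, which is where the
3 nodes come from.)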

You can take a look at the weights and timings used by the DynamicSnitch in JConsole under o.a.c.db.DynamicEndpointSnitch. Also, at DEBUG log level you will be able to see which nodes the request is sent to.

My guess is the DynamicSnitch is doing the right thing, and the slowdown is a node with a problem getting back into the list of nodes used for your read. It's then moved down the list as its bad performance is noticed.

Hope that helps
Aaron
 

On 12 Apr 2011, at 01:28, shimi wrote:

> I finally upgraded from 0.6.x to 0.7.4. The nodes have been running with the new version for several days across 2 data centers.
> I noticed that the read time on some of the nodes increases by 50-60x every ten minutes.
> There is no indication in the logs of anything happening at the same time. The only thing that I know of that runs every 10 minutes is the dynamic snitch reset.
> So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes, and now I have the problem once every 20 minutes.
> 
> I am running all nodes with:
> replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>       strategy_options:
>         DC1 : 2
>         DC2 : 2
>       replication_factor: 4
> 
> (DC1 and DC2 are taken from the IPs)
> Is anyone familiar with this kind of behavior?
> 
> Shimi
>