Posted to user@cassandra.apache.org by Anuj Wadehra <an...@yahoo.co.in> on 2015/11/11 22:05:31 UTC

Repair Hangs while requesting Merkle Trees

Hi,
We have 2 DCs at remote locations with 10 Gbps connectivity. We are able to complete repair (-par -pr) on 5 nodes. On only one node in DC2, repair never completes because it always hangs. The node sends Merkle tree requests, but one or more nodes in DC1 (remote) never show that they sent the Merkle tree reply to the requesting node.
Repair hangs indefinitely.

After increasing request_timeout_in_ms on the affected node, we were able to run repair successfully on one of two occasions.

Any comments on why this is happening on just one node? In OutboundTcpConnection.java, the isTimeOut method always returns false for a non-droppable verb such as a Merkle tree request (verb=REPAIR_MESSAGE), so why did increasing the request timeout solve the problem on one occasion?
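
The behavior described in the question can be modeled in a few lines. This is a simplified Python sketch of the check as the post describes it, not Cassandra's actual Java source. The class name QueuedMessage, its fields, and the READ example are assumptions made for illustration; only REPAIR_MESSAGE being non-droppable comes from the post.

```python
import time
from dataclasses import dataclass


@dataclass
class QueuedMessage:
    """Hypothetical model of a message waiting in an outbound queue."""
    verb: str
    droppable: bool        # False for verbs such as REPAIR_MESSAGE
    enqueued_at_ms: float  # wall-clock time when the message was queued
    timeout_ms: float      # e.g. the configured request timeout

    def is_timed_out(self, now_ms: float) -> bool:
        # A non-droppable message short-circuits to False, so the
        # timeout value never affects it in this model.
        return self.droppable and self.enqueued_at_ms < now_ms - self.timeout_ms


now = time.time() * 1000
# Both messages were queued 60 s ago with a 10 s timeout (assumed values).
repair = QueuedMessage("REPAIR_MESSAGE", droppable=False,
                       enqueued_at_ms=now - 60_000, timeout_ms=10_000)
read = QueuedMessage("READ", droppable=True,
                     enqueued_at_ms=now - 60_000, timeout_ms=10_000)
print(repair.is_timed_out(now))  # False, regardless of timeout_ms
print(read.is_timed_out(now))    # True
```

In this model, raising the timeout can only change the outcome for droppable messages, which is why the observed effect on repair is puzzling.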

Thanks
Anuj Wadehra

Re: Repair Hangs while requesting Merkle Trees

Posted by Anuj Wadehra <an...@yahoo.co.in>.
Thanks Bryan!

The connection is in ESTABLISHED state on one end and completely missing at the other end (in the other DC).

Yes, we can revisit TCP tuning. But the problem is node-specific, so I am not sure whether tuning is the culprit.

Thanks

Anuj


Re: Repair Hangs while requesting Merkle Trees

Posted by Bryan Cheng <br...@blockcypher.com>.
Ah OK, I might have misunderstood you. Streaming sockets should not be in
play during merkle tree generation (validation compaction). They may come
into play during merkle tree exchange; that I'm not sure about. You can read
a bit more here: https://issues.apache.org/jira/browse/CASSANDRA-8611.

Regardless, you should have streaming_socket_timeout_in_ms set. 1 hr is
usually a good conservative estimate, but you can go much lower safely.

What state is the connection in, the one that only shows on one side? Is it
ESTABLISHED, or something like CLOSE_WAIT?

Here's a good place to start for tuning, though it doesn't have as much
about network tuning:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html. More
generally, TCP tuning usually revolves around a balance between latency and
bandwidth. Over long connections (we're talking tens of ms, instead of the
sub-1 ms you usually see in a good DC network), your expectations will shift
greatly. Stuff like NODELAY on TCP is very nice for cutting your latencies
when you're inside a DC, but will generate lots of small packets that will
hurt your bandwidth over longer connections due to the need to wait for
acks. otc_coalescing_strategy is in a similar vein, bundling together
nearby messages to trade latency for throughput. You'll also probably want
to tune your TCP buffers and window sizes, since those determine how much
data can be in flight between acknowledgements, and the default size is
pitiful for any decent network size. Google around for TCP tuning/buffer
tuning and you should find some good resources.
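
To make the buffer-sizing point concrete, here is a rough Python sketch of the bandwidth-delay product, the amount of data that must be in flight to keep a link full. The 10 Gbps figure comes from this thread; the round-trip times are assumed values for illustration, since the actual inter-DC latency was never given.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bytes that must be unacknowledged in flight to saturate the link."""
    return bandwidth_bits_per_s * rtt_s / 8


LINK_BPS = 10e9  # 10 Gbps, as reported in the thread

# Assumed round-trip times: same-DC (0.5 ms) versus cross-DC (40 ms).
for label, rtt_s in [("intra-DC", 0.0005), ("cross-DC", 0.040)]:
    bdp = bandwidth_delay_product_bytes(LINK_BPS, rtt_s)
    print(f"{label}: RTT {rtt_s * 1000:g} ms -> {bdp / 2**20:.1f} MiB in flight")
```

A single TCP connection cannot carry more than its window size divided by the round-trip time, so if the socket buffers are smaller than this figure, one connection cannot fill the link.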


Re: Repair Hangs while requesting Merkle Trees

Posted by Anuj Wadehra <an...@yahoo.co.in>.
Hi Bryan,


Thanks for the reply!

I didn't mean streaming_socket_timeout_in_ms. I meant that when you run netstat (the Linux command) on node A in DC1, you see a connection in ESTABLISHED state with node B in DC2. But when you run netstat on node B, you won't find any connection with node A. Such connections exist across DCs. Is that a problem?
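
The check described above can be automated. The following Python sketch compares netstat output captured on both nodes and reports connections that are ESTABLISHED on one side with no mirror on the other. It assumes the usual six-column `netstat -tn` layout and no NAT between the DCs; the function names and the sample addresses (taken from documentation IP ranges) are hypothetical.

```python
def established_pairs(netstat_output: str) -> set:
    """Return (local, remote) address pairs for ESTABLISHED TCP sockets."""
    pairs = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "ESTABLISHED":
            pairs.add((fields[3], fields[4]))
    return pairs


def one_sided(side_a: str, side_b: str, peer_ip: str) -> set:
    """Connections node A holds to peer_ip that node B does not mirror."""
    a, b = established_pairs(side_a), established_pairs(side_b)
    return {(local, remote) for local, remote in a
            if remote.rsplit(":", 1)[0] == peer_ip and (remote, local) not in b}


# Made-up sample captures; 7000 is Cassandra's default storage_port.
NODE_A_OUTPUT = """\
tcp        0      0 192.0.2.10:7000      198.51.100.20:51234   ESTABLISHED
tcp        0      0 192.0.2.10:48211     198.51.100.20:7000    ESTABLISHED
"""
NODE_B_OUTPUT = """\
tcp        0      0 198.51.100.20:7000   192.0.2.10:48211      ESTABLISHED
"""

for local, remote in sorted(one_sided(NODE_A_OUTPUT, NODE_B_OUTPUT, "198.51.100.20")):
    print(f"one-sided: {local} -> {remote}")
```

Both captures should be taken at roughly the same moment, since connections that open or close between the two snapshots would show up as false positives.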


We haven't set streaming_socket_timeout_in_ms, which I know must be set. But I am not sure whether setting this property has any impact on Merkle tree requests. I thought it applies only to data streaming, when a mismatch is found and data needs to be streamed. Please confirm. What value do you use for the streaming socket timeout?


Moreover, if the socket timeout were the issue, it should happen on other nodes too. Repair fails on just one node, because its Merkle tree request gets lost and is not transmitted to one or more nodes in the remote DC.


I am not sure about the exact distance, but they are connected by a very high speed 10 Gbps link.


When you say different TCP stack tuning, do you have any document, blog, or link describing recommendations for a multi-DC Cassandra setup? Can you elaborate on which settings need to be different?



Thanks

Anuj

Re: Repair Hangs while requesting Merkle Trees

Posted by Bryan Cheng <br...@blockcypher.com>.
Hi Anuj,

Did you mean streaming_socket_timeout_in_ms? If not, then you definitely
want that set. Even the best network connections will break occasionally,
and in Cassandra < 2.1.10 (I believe) this would leave those connections
hanging indefinitely on one end.

How far apart are your two DCs from a network perspective, out of
curiosity? You'll almost certainly want different TCP stack tuning for
cross-DC traffic, notably your buffer sizes, window params, and
Cassandra-specific settings like otc_coalescing_strategy,
inter_dc_tcp_nodelay, etc.


Re: Repair Hangs while requesting Merkle Trees

Posted by Anuj Wadehra <an...@yahoo.co.in>.
One more observation: there are a few TCP connections that one node shows as ESTABLISHED, but when we check the node at the other end, the connection is not there. I guess these are called "phantom" connections. Could this be a possible cause?


Thanks

Anuj

Re: Repair Hangs while requesting Merkle Trees

Posted by Anuj Wadehra <an...@yahoo.co.in>.
Thanks Daemeon!

I will capture the netstat output and share it in the next few days. We were also thinking of taking TCP dumps. If it's a network issue and increasing the request timeout worked, I am not sure how Cassandra is dropping messages based on a timeout. Repair messages are non-droppable and are not supposed to time out.

2 of the 3 nodes in the DC are able to complete repair without any issue. Just one node is problematic.

I also observed frequent messages in the logs of other nodes saying that hints replay timed out, and the node to which hints were being replayed is always a remote DC node. Is this related somehow?


Thanks

Anuj


From:"daemeon reiydelle" <da...@gmail.com>
Date:Thu, 12 Nov, 2015 at 10:34 am
Subject:Re: Repair Hangs while requesting Merkle Trees

Have you checked the network statistics on that machine (netstat -tas) while attempting to repair? If netstat shows ANY issues, you have a problem. Could you run the command in a loop every 60 seconds for maybe 15 minutes and post back?

Out of curiosity, how many remote DC nodes are getting successfully repaired?




Re: Repair Hangs while requesting Merkle Trees

Posted by daemeon reiydelle <da...@gmail.com>.
Have you checked the network statistics on that machine (netstat -tas)
while attempting to repair? If netstat shows ANY issues, you have a
problem. Could you run the command in a loop every 60 seconds for
maybe 15 minutes and post back?

Out of curiosity, how many remote DC nodes are getting successfully
repaired?
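
The sampling loop suggested above could be sketched roughly as follows. This is only an illustration: the helper name sample_command and its parameters are hypothetical, and it assumes netstat is installed on the node and that the snapshots are collected while the repair is hanging.

```python
import subprocess
import time
from datetime import datetime


def sample_command(cmd, samples=15, interval_s=60.0, sleep=time.sleep):
    """Run cmd `samples` times, pausing `interval_s` seconds between runs.

    Returns a list of (timestamp, stdout) pairs so that successive
    snapshots can be compared for growing error or retransmit counters.
    """
    results = []
    for i in range(samples):
        stamp = datetime.now().isoformat(timespec="seconds")
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((stamp, proc.stdout))
        # No need to wait after the final sample.
        if i < samples - 1:
            sleep(interval_s)
    return results
```

Calling sample_command(["netstat", "-tas"]) with the defaults would take 15 snapshots one minute apart, which matches the 15-minute window asked for; the collected output can then be printed or written to a file and posted back.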



.......
“Life should not be a journey to the grave with the intention of arriving safely in a
pretty and well preserved body, but rather to skid in broadside in a cloud of smoke,
thoroughly used up, totally worn out, and loudly proclaiming “Wow! What a Ride!”
- Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Nov 11, 2015 at 1:06 PM, Anuj Wadehra <an...@yahoo.co.in>
wrote:

> Hi,
>
> We are using 2.0.14. We have 2 DCs at remote locations with 10GBps
> connectivity. We are able to complete repair (-par -pr) on 5 nodes. On only
> one node in DC2, we are unable to complete repair, as it always hangs. The node
> sends Merkle Tree requests, but one or more nodes in DC1 (remote) never
> show that they sent the Merkle Tree reply to the requesting node.
> Repair hangs indefinitely.
>
> After increasing request_timeout_in_ms on the affected node, we were able to
> run repair successfully on one of two occasions.
>
> Any comments on why this is happening on just one node? In
> OutboundTcpConnection.java, the isTimeOut method always returns false for a
> non-droppable verb such as a Merkle Tree Request (verb=REPAIR_MESSAGE), so why
> did increasing the request timeout solve the problem on one occasion?
>
>
> Thanks
> Anuj Wadehra

Re: Repair Hangs while requesting Merkle Trees

Posted by Anuj Wadehra <an...@yahoo.co.in>.
Hi,
We are using 2.0.14. We have 2 DCs at remote locations with 10GBps connectivity. We are able to complete repair (-par -pr) on 5 nodes. On only one node in DC2, we are unable to complete repair, as it always hangs. The node sends Merkle Tree requests, but one or more nodes in DC1 (remote) never show that they sent the Merkle Tree reply to the requesting node.
Repair hangs indefinitely.

After increasing request_timeout_in_ms on the affected node, we were able to run repair successfully on one of two occasions.

Any comments on why this is happening on just one node? In OutboundTcpConnection.java, the isTimeOut method always returns false for a non-droppable verb such as a Merkle Tree Request (verb=REPAIR_MESSAGE), so why did increasing the request timeout solve the problem on one occasion?

Thanks
Anuj Wadehra 
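
To make the question above concrete, here is a toy model of the timeout behaviour as I understand it. It is not the Cassandra source: the class, field and function names are hypothetical, and treating READ as a droppable verb is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueuedMessage:
    verb: str
    droppable: bool
    enqueued_at_ms: int

    def is_timed_out(self, now_ms: int, timeout_ms: int) -> bool:
        # Only droppable verbs can expire; a non-droppable verb
        # never reports a timeout, however long it has waited.
        return self.droppable and now_ms - self.enqueued_at_ms > timeout_ms


def messages_to_send(queue, now_ms, timeout_ms):
    """Return the queued messages that would still be written out."""
    return [m for m in queue if not m.is_timed_out(now_ms, timeout_ms)]
```

In this model a REPAIR_MESSAGE stays in the queue whatever value the timeout has, so a larger request_timeout_in_ms should make no difference to it. That is why the one successful run after raising the timeout is puzzling.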

