Posted to user@cassandra.apache.org by Vasileios Vlachos <va...@gmail.com> on 2014/08/05 10:28:46 UTC

Node stuck during nodetool rebuild

Hello All,

We are on 1.2.18 (running on Ubuntu 12.04) and we recently tried to add a
second DC to our demo environment, before trying it on live. The
existing DC1 has two nodes which hold approximately 10G of data (RF=2). In
order to add the second DC, DC2, we followed this procedure:

On DC1 nodes:
1. Changed the snitch in cassandra.yaml from the default to
GossipingPropertyFileSnitch.
2. Configured cassandra-rackdc.properties (DC1, RAC1).
3. Performed a rolling restart.
4. Updated the replication strategy for each keyspace, for example: ALTER
KEYSPACE <keyspace> WITH REPLICATION =
{'class':'NetworkTopologyStrategy','DC1':2};

On DC2 nodes:
5. Edited cassandra.yaml with: auto_bootstrap: false, seeds (one IP from
DC1), cluster name matching the one on the DC1 nodes, correct IP
settings, num_tokens, initial_token left unset and finally the snitch
(GossipingPropertyFileSnitch, as in DC1).
6. Changed cassandra-rackdc.properties (DC2, RAC1).

On the Application:
7. Changed the C# DataStax driver load balancing policy to
DCAwareRoundRobinPolicy.
8. Changed the application consistency level from QUORUM to LOCAL_QUORUM.

On DC2 nodes:
9. After deleting the data, commitlog and saved_caches directories, we
started Cassandra on both nodes in the new DC, DC2. According to the logs,
at this point all nodes were able to see all other nodes, and nodetool
status showed the expected output.

On DC1 nodes:
10. After Cassandra was running on DC2, we changed the keyspace RF to
include the new DC as follows: ALTER KEYSPACE <keyspace> WITH REPLICATION
= {'class':'NetworkTopologyStrategy','DC1':2, 'DC2':2};
11. As a last step, in order to stream the data across to the second DC,
we ran this on node1 of DC2: nodetool rebuild DC1. After the successful
completion of this, we were planning to run the same on node2 of DC2.
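
The replication changes in steps 4 and 10 have to be repeated for every
keyspace. A minimal sketch of generating those statements is shown here;
the helper name and the keyspace names are hypothetical, and the output
still has to be run through cqlsh by hand.

```python
def alter_keyspace_cql(keyspace, replicas_per_dc):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a data centre name (as reported by the snitch)
    to the replication factor wanted in that data centre.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data centre is required")
    options = ["'class':'NetworkTopologyStrategy'"]
    # Sort so the generated statement is stable between runs.
    for dc, rf in sorted(replicas_per_dc.items()):
        options.append("'%s':%d" % (dc, rf))
    return "ALTER KEYSPACE %s WITH REPLICATION = {%s};" % (
        keyspace, ", ".join(options))


if __name__ == "__main__":
    # Hypothetical keyspace names, for illustration only.
    for ks in ["app_data", "app_logs"]:
        print(alter_keyspace_cql(ks, {"DC1": 2}))            # step 4
        print(alter_keyspace_cql(ks, {"DC1": 2, "DC2": 2}))  # step 10
```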

The problem is that nodetool rebuild seems to be stuck, and nodetool
netstats on node1 of DC2 appears to be stuck at 10% while streaming a 5G
file from node2 at DC1. This doesn't tally with nodetool netstats when
running it against either of the DC1 nodes. The DC1 nodes don't think they
are streaming anything to DC2.
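
One way to make "stuck" concrete is to note the progress figure from
nodetool netstats at intervals and check how long it has been unchanged.
The sketch below assumes the samples are recorded by hand as
(seconds, percent) pairs; it does not parse nodetool output, and the
function name and the one-hour limit are illustrative choices only.

```python
def is_stalled(samples, min_idle_seconds):
    """Return True if the latest progress value has not changed for at
    least min_idle_seconds.

    samples is a time-ordered list of (timestamp_seconds, percent) pairs.
    """
    if not samples:
        return False
    last_time, last_progress = samples[-1]
    idle_since = last_time
    # Walk backwards to the earliest consecutive sample with the same value.
    for ts, progress in reversed(samples):
        if progress != last_progress:
            break
        idle_since = ts
    return (last_time - idle_since) >= min_idle_seconds


if __name__ == "__main__":
    # Progress reached 10% after ten minutes and never moved again.
    observed = [(0, 5), (600, 10), (1200, 10), (4200, 10)]
    print(is_stalled(observed, min_idle_seconds=3600))
```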

It is worth pointing out that initially we tried to run 'nodetool rebuild
DC1' on both nodes at DC2, given the small amount of data to be streamed in
total (approximately 10G as I explained above). We experienced the same
problem, with the only difference being that 'nodetool rebuild DC1' got
stuck on both nodes at DC2 very soon after starting, whereas now it
happened only after running for an hour or so. We thought the problem was
that we had run nodetool against both nodes at the same time. So, we tried
running it only against node1, after we deleted all the data, commitlog and
caches on both nodes and started from step (9) again. Now nodetool rebuild
has been running against node1 at DC2 for more than 12 hours with no
luck... The weird thing is that the Cassandra logs appear to be clean and
the VPN between the two DCs has no problems at all.

Any thoughts? Have we missed something in the steps I described? Is
anything wrong in the procedure? Any help would be much appreciated.

Thanks,

Vasilis

Re: Node stuck during nodetool rebuild

Posted by Vasileios Vlachos <va...@gmail.com>.
Hello Mark and Rob,

Thank you very much for your input. I will increase the phi threshold and
report back any progress.

Vasilis
On 5 Aug 2014 21:52, "Mark Reddy" <ma...@boxever.com> wrote:

> Hi Vasilis,
>
> To further on what Rob said
>
> I believe you might be able to tune the phi detector threshold to help
>> this operation complete, hopefully someone with direct experience of same
>> will chime in.
>
>
> I have been through this operation where streams break due to a node
> falsely being marked down (flapping). In an attempt to  mitigate this I
> increase the phi_convict_threshold in cassandra.yaml from 8 to 10, after
> which the rebuild was able to successfully complete. The default value for
> phi_convict_threshold is 8 with 12 being the maximum recommended value.
>
>
> Mark
>
>
> On Tue, Aug 5, 2014 at 7:22 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Tue, Aug 5, 2014 at 1:28 AM, Vasileios Vlachos <
>> vasileiosvlachos@gmail.com> wrote:
>>
>>> The problem is that the nodetool seems to be stuck, and nodetool
>>> netstats on node1 of DC2 appears to be stuck at 10% streaming a 5G file
>>> from node2 at DC1. This doesn't tally with nodetool netstats when running
>>> it against either of the DC1 nodes. The DC1 nodes don't think they stream
>>> anything to DC2.
>>>
>>
>> Yes, streaming is fragile and breaks and hangs forever and your only
>> option in most cases is to stop the rebuilding node, nuke its data, and
>> start again.
>>
>> I believe you might be able to tune the phi detector threshold to help
>> this operation complete, hopefully someone with direct experience of same
>> will chime in.
>>
>> =Rob
>>
>>
>
>

Re: Node stuck during nodetool rebuild

Posted by Vasileios Vlachos <va...@gmail.com>.
Actually, there is something else I would like to ask... Do you know if phi
is related to streaming_socket_timeout_in_ms? It seems to be set to
infinity by default. Could that be related to the hang behaviour of
rebuild? Would you recommend changing the default, or have I completely
misinterpreted its meaning?

Many thanks,

Vasilis

Re: Node stuck during nodetool rebuild

Posted by Mark Reddy <ma...@boxever.com>.
Hi Vasilis,

To add to what Rob said:

> I believe you might be able to tune the phi detector threshold to help this
> operation complete, hopefully someone with direct experience of same will
> chime in.


I have been through this operation where streams break due to a node
falsely being marked down (flapping). In an attempt to mitigate this I
increased the phi_convict_threshold in cassandra.yaml from 8 to 10, after
which the rebuild completed successfully. The default value for
phi_convict_threshold is 8, with 12 being the maximum recommended value.
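
To give a feel for what raising the threshold buys, here is a simplified
sketch of a phi accrual failure detector. It assumes heartbeat gaps follow
an exponential distribution and a mean gap of one second; both are
assumptions made for illustration, and Cassandra's own implementation may
differ in its details.

```python
import math


def phi(seconds_since_last_heartbeat, mean_interval_seconds):
    """Suspicion level for a silent node.

    Under an exponential model, the chance of a gap longer than t is
    exp(-t / mean); phi is minus the base-10 logarithm of that chance.
    """
    return seconds_since_last_heartbeat / (
        mean_interval_seconds * math.log(10))


def silence_before_conviction(threshold, mean_interval_seconds):
    """Seconds of silence after which phi reaches the given threshold."""
    return threshold * mean_interval_seconds * math.log(10)


if __name__ == "__main__":
    mean_gap = 1.0  # assumed mean heartbeat gap, in seconds
    for threshold in (8, 10, 12):
        seconds = silence_before_conviction(threshold, mean_gap)
        print("threshold %d: about %.1f s of silence" % (threshold, seconds))
```

Under these assumptions, going from 8 to 10 lets a peer stay silent for
roughly 23 seconds instead of roughly 18 before it is marked down, which
gives a congested inter-DC link more slack during a long stream.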


Mark


On Tue, Aug 5, 2014 at 7:22 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Aug 5, 2014 at 1:28 AM, Vasileios Vlachos <
> vasileiosvlachos@gmail.com> wrote:
>
>> The problem is that the nodetool seems to be stuck, and nodetool netstats
>> on node1 of DC2 appears to be stuck at 10% streaming a 5G file from node2
>> at DC1. This doesn't tally with nodetool netstats when running it against
>> either of the DC1 nodes. The DC1 nodes don't think they stream anything to
>> DC2.
>>
>
> Yes, streaming is fragile and breaks and hangs forever and your only
> option in most cases is to stop the rebuilding node, nuke its data, and
> start again.
>
> I believe you might be able to tune the phi detector threshold to help
> this operation complete, hopefully someone with direct experience of same
> will chime in.
>
> =Rob
>
>

Re: Node stuck during nodetool rebuild

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Aug 5, 2014 at 1:28 AM, Vasileios Vlachos <
vasileiosvlachos@gmail.com> wrote:

> The problem is that the nodetool seems to be stuck, and nodetool netstats
> on node1 of DC2 appears to be stuck at 10% streaming a 5G file from node2
> at DC1. This doesn't tally with nodetool netstats when running it against
> either of the DC1 nodes. The DC1 nodes don't think they stream anything to
> DC2.
>

Yes, streaming is fragile and breaks and hangs forever and your only option
in most cases is to stop the rebuilding node, nuke its data, and start
again.

I believe you might be able to tune the phi detector threshold to help this
operation complete, hopefully someone with direct experience of same will
chime in.

=Rob