Posted to user@cassandra.apache.org by Paulo Ricardo Motta Gomes <pa...@chaordicsystems.com> on 2014/03/14 22:18:51 UTC

Cannot bootstrap replacement node

Hello,

I'm having some trouble bootstrapping a replacement node, and I suspect it
could be a bug in Cassandra. I'm using C* 1.2.13, RF=2, with Vnodes
disabled. Below is a simplified version of my ring:

* n1 : token 100
* n2 : token 200 (DEAD)
* n3 : token 300
* n4 : token 0

n2 has died, so I tried bootstrapping a new replacement node:

* x : token 199 (n2.token-1)

Even though n2 was terminated and was seen as DOWN by n1, n3 and n4, the
replacement node x saw n2 as UP and immediately tried to stream data from
it during bootstrap. After about 10 minutes, when x detected n2 as DOWN,
the bootstrap failed for obvious reasons.

Since that procedure did not work, I tried the following procedure for
replacing n2:

- Remove n2 from the ring. This makes n3 stream n2's data to n1.
- After the leave is complete, try to bootstrap x again.

Ideally, x would stream data from both n1 and n3, but it always streams
data only from n3. The problem is that at some point x sees n3 as DOWN,
which makes the bootstrap fail again.
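
To make the question of stream sources concrete, here is a minimal sketch of
which existing nodes hold the ranges that x would take over. It is an
illustration only, not Cassandra code: it assumes SimpleStrategy-like
placement (the range owner plus the next RF-1 nodes clockwise), which the
thread does not state, and the helper names replicas() and ranges_gained()
are hypothetical. A cluster using NetworkTopologyStrategy or several racks
would give different results.

```python
from bisect import bisect_left


def replicas(ring, token, rf):
    """Nodes holding `token`, ASSUMING SimpleStrategy-like placement.

    `ring` maps token -> node name. The owner is the first node whose token
    is >= `token` (wrapping around), followed by the next rf-1 nodes.
    """
    tokens = sorted(ring)
    start = bisect_left(tokens, token) % len(tokens)
    count = min(rf, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


def ranges_gained(ring, new_token, new_name, rf):
    """Map each (start, end] range the new node will replicate to the
    nodes that hold that range in the current ring (candidate sources)."""
    after = dict(ring)
    after[new_token] = new_name
    tokens = sorted(after)
    gained = {}
    for i, end in enumerate(tokens):
        start = tokens[i - 1]  # i == 0 wraps to the highest token
        if new_name in replicas(after, end, rf):
            gained[(start, end)] = replicas(ring, end, rf)
    return gained


# Simplified ring from this message, with n2 still present.
with_n2 = {0: "n4", 100: "n1", 200: "n2", 300: "n3"}
# The same ring after n2 has been removed.
without_n2 = {0: "n4", 100: "n1", 300: "n3"}

print(ranges_gained(with_n2, 199, "x", rf=2))
# {(0, 100): ['n1', 'n2'], (100, 199): ['n2', 'n3']}
print(ranges_gained(without_n2, 199, "x", rf=2))
# {(0, 100): ['n1', 'n3'], (100, 199): ['n3', 'n4']}
```

Under these assumptions, once n2 is gone n3 appears among the holders of both
ranges that x needs, so seeing n3 as the only stream source would not by
itself point to inconsistent gossip state. The sketch says nothing about how
Cassandra actually picks among the candidates.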

I suspect there is some kind of inconsistency in the gossip information of
n2 that is preventing x from streaming data from both n1 and n3. I tried
purging n2 from gossip, using Gossiper.unsafeAssassinateEndpoint() via JMX,
but I'm getting the following error:

*"Problem invoking unsafeAssassinateEndpoint :
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0"*

My next and last approach is to manually copy the sstables via rsync from
n3 and start x with auto_bootstrap=false, but I would really prefer to
avoid it. Is it really this hard to bootstrap a new node when not using
Vnodes in C* 1.2, or could this be hiding some kind of bug? Any feedback
would be greatly appreciated.

Cheers,

-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br <http://www.chaordic.com.br/>*

Re: Cannot bootstrap replacement node

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Mar 14, 2014 at 2:18 PM, Paulo Ricardo Motta Gomes <
paulo.motta@chaordicsystems.com> wrote:

> My next and last approach is to manually copy the sstables via rsync from
> n3 and start x with auto_bootstrap=false, but I would really prefer to
> avoid it. Is it really this hard to bootstrap a new node when not using
> Vnodes in C* 1.2, or could this be hiding some kind of bug? Any feedback
> would be greatly appreciated.
>

I've heard some reports of this in the era of 1.2.x you are referring to.

I'd probably try the auto_bootstrap:false process, but would consider
upgrading to 1.2.15 first. There are issues within 1.2.x with
auto_bootstrap:false under certain circumstances. I can't find the JIRA
right now, but check CHANGES.txt. Note that 1.2.15 will not coalesce a
cluster without a loaded schema.

=Rob