Posted to user@cassandra.apache.org by John Pyeatt <jo...@singlewire.com> on 2014/02/11 17:22:22 UTC

1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete

I am trying to bring up a 6-node cluster in AWS on 1.2.15: 3 seed nodes and
3 non-seed nodes, one of each in each availability zone. My non-seed nodes
never join the cluster. If I run 1.2.14, everything works fine. We are not
using vnodes, and all of the initial_token values are assigned based on the
Murmur3 calculations.
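
For readers unfamiliar with the token calculation mentioned above, here is a
minimal sketch of how evenly spaced initial_token values are commonly derived
for Murmur3Partitioner, whose token range runs from -2**63 to 2**63 - 1. The
function name is hypothetical, and this is not necessarily the exact
calculation used for the cluster in this thread (it ignores any per-zone
token layout).

```python
def murmur3_initial_tokens(node_count):
    """Return evenly spaced tokens across the Murmur3Partitioner range.

    Hypothetical helper for illustration: splits the 2**64 wide ring into
    node_count equal slices, starting at the minimum token -2**63.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 64
    return [i * ring_size // node_count - 2 ** 63 for i in range(node_count)]


if __name__ == "__main__":
    # One line per node, in the form used by cassandra.yaml.
    for index, token in enumerate(murmur3_initial_tokens(6)):
        print(f"node {index}: initial_token: {token}")
```

Each value would go into the initial_token setting of one node's
cassandra.yaml.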

This isn't a data migration from a previous version. It is a completely
clean cluster which I am starting from scratch.

The seed nodes come up and join the cluster just fine. But none of my
non-seed nodes are joining the cluster. In the logs I am seeing the
following from one of my non-seed nodes. Note the repeats of the last lines
that never go away.

 INFO 15:58:54,729 Handshaking version with /10.0.12.13
 INFO 15:58:55,724 Handshaking version with /10.0.32.126
 INFO 15:58:56,726 Handshaking version with /10.0.22.230
 INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
 INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
 INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
 INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
 INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
 INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
 INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
 INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
 INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
 INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
 INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio is 12.87932647333957 (just-counted was 12.87932647333957).  calculation took 19ms for 38 columns
 INFO 15:58:57,679 Handshaking version with /10.0.22.206
 INFO 15:58:57,726 Handshaking version with /10.0.22.230
 INFO 15:58:58,728 Handshaking version with /10.0.12.13
 INFO 15:58:59,730 Handshaking version with /10.0.12.103
 INFO 15:59:06,090 Handshaking version with /10.0.32.126

 INFO 15:59:23,932 JOINING: waiting for schema information to complete
 INFO 15:59:24,932 JOINING: waiting for schema information to complete
 INFO 15:59:25,933 JOINING: waiting for schema information to complete
 INFO 15:59:26,933 JOINING: waiting for schema information to complete
 INFO 15:59:27,934 JOINING: waiting for schema information to complete
 INFO 15:59:28,934 JOINING: waiting for schema information to complete
 INFO 15:59:29,935 JOINING: waiting for schema information to complete
 INFO 15:59:30,935 JOINING: waiting for schema information to complete
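
To check how long a node has been repeating that message, a small script can
count the occurrences in a log excerpt and measure the time between the first
and last one. This is only a sketch: the function name is hypothetical, the
timestamp format is assumed to match the excerpt above (HH:MM:SS,mmm with no
date), and it does not handle a log that crosses midnight.

```python
import re

# Matches the repeated message even if the mail client wrapped the line,
# because every gap between words is allowed to be any whitespace.
JOINING_RE = re.compile(
    r"INFO\s+(\d{2}):(\d{2}):(\d{2}),(\d{3})\s+JOINING:\s+waiting\s+for\s+"
    r"schema\s+information\s+to\s+complete"
)


def joining_wait(log_text):
    """Return (message_count, elapsed_ms) for the JOINING schema-wait lines.

    elapsed_ms is the time between the first and last matching timestamp.
    Assumption: all timestamps fall on the same day.
    """
    stamps_ms = []
    for match in JOINING_RE.finditer(log_text):
        hours, minutes, seconds, millis = (int(part) for part in match.groups())
        stamps_ms.append(((hours * 60 + minutes) * 60 + seconds) * 1000 + millis)
    if not stamps_ms:
        return 0, 0
    return len(stamps_ms), stamps_ms[-1] - stamps_ms[0]


if __name__ == "__main__":
    sample = (
        " INFO 15:59:23,932 JOINING: waiting for schema information to complete\n"
        " INFO 15:59:30,935 JOINING: waiting for schema information to complete\n"
    )
    count, elapsed_ms = joining_wait(sample)
    print(f"{count} messages over {elapsed_ms} ms")
```

A count that keeps growing for minutes suggests the node is stuck rather than
just slow to receive the schema.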

So I suspect it is some sort of bootstrapping issue. I checked the
CHANGES.txt and noticed this for 1.2.15.
*Move handling of migration event source to solve bootstrap race
(CASSANDRA-6648)*
I looked at 6648 and, based on some of the comments, there seems to be a
lack of confidence around this problem.

Has anyone else seen this problem?
-- 
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
------------------
608.661.1184
john.pyeatt@singlewire.com

Re: 1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 02/11/2014 10:34 AM, sankalp kohli wrote:
> If you don't have a schema, you are probably hitting this
> https://issues.apache.org/jira/browse/CASSANDRA-6685

Looks like #6685 was committed to the cassandra-1.2 branch, yesterday.

SNAPSHOT artifacts can be grabbed for the latest build of each branch, 
if anyone's watching for something to be committed.  I just finished 
setting this up in jenkins, yesterday.

   http://cassci.datastax.com/job/cassandra-1.2/lastSuccessfulBuild/
   http://cassci.datastax.com/job/cassandra-2.0/lastSuccessfulBuild/
   http://cassci.datastax.com/job/trunk/lastSuccessfulBuild/

<Insert "these are unreleased snapshots" disclaimer, here>

-- 
Kind regards,
Michael

Re: 1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete

Posted by sankalp kohli <ko...@gmail.com>.
If you don't have a schema, you are probably hitting this
https://issues.apache.org/jira/browse/CASSANDRA-6685

