Posted to user@cassandra.apache.org by Jason Harvey <al...@gmail.com> on 2012/09/05 13:29:34 UTC

Cannot bootstrap new nodes in 1.0.11 ring - schema issue

Hey folks,

I have a 1.0.11 ring running in production with 6 nodes. I'm trying to
bootstrap a new node into it, and I consistently get the following:

 INFO [main] 2012-09-05 04:24:13,317 StorageService.java (line 668) JOINING: waiting for schema information to complete


After waiting for over 30 minutes, I restarted the node to try again, and 
got the same thing. Tried wiping out the data dir on the new node, as well. 
Same result.

Turned on DEBUG, and got the following:

 INFO [main] 2012-09-05 03:58:55,205 StorageService.java (line 668) JOINING: waiting for schema information to complete
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,440 DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily from /10.140.128.218
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,440 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,631 DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily from /10.140.128.218
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,631 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.


The logs continue with a bunch of failed migration errors from each node in 
the ring.

So, I'm guessing that there is a schema history problem on one of my nodes? 
Any clues on how I can fix this? I had considered wiping out the schema on 
one of my running nodes and starting it back up, but I'm worried it might 
not come back if it gets the same errors.


Also as a random question: is there any way to 'merge' historical schema 
changes together?


Thanks,
Jason

Re: Cannot bootstrap new nodes in 1.0.11 ring - schema issue

Posted by Jason Harvey <al...@gmail.com>.
I attempted to manually load the Schema sstables onto the new node and 
bootstrap it. Unfortunately when doing so, the new node believed it was 
already bootstrapped, and just joined the ring with zero data.

To fix (read: hack) that, I removed the following logic from 
StorageService.java:523:

        if (DatabaseDescriptor.isAutoBootstrap()
            && !(SystemTable.isBootstrapped()
                 || DatabaseDescriptor.getSeeds().contains(FBUtilities.getBroadcastAddress())
                 || !Schema.instance.getNonSystemTables().isEmpty()))


I replaced everything after the && with just !SystemTable.isBootstrapped(). 
No idea why that logic was failing, as I had zero non-system tables.
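
As a rough illustration of the two conditions described above, here is a simplified model in Java. It is not the actual StorageService code: the class name, method names, and boolean parameters are hypothetical stand-ins for the real calls (isAutoBootstrap(), isBootstrapped(), the seed check, and the non-system tables check). The second case in main is a hypothetical scenario, not something confirmed in this thread.

```java
// Simplified model of the bootstrap condition quoted above.
// All names here are hypothetical; this is not Cassandra source code.
final class BootstrapDecisionSketch {

    // Models the original condition: bootstrap only when auto bootstrap is on
    // and the node is not already bootstrapped, is not a seed, and has no
    // non-system tables defined.
    static boolean originalShouldBootstrap(boolean autoBootstrap,
                                           boolean alreadyBootstrapped,
                                           boolean isSeed,
                                           boolean hasNonSystemTables) {
        return autoBootstrap
               && !(alreadyBootstrapped || isSeed || hasNonSystemTables);
    }

    // Models the patched condition: everything after the && is reduced to
    // the "not already bootstrapped" check.
    static boolean patchedShouldBootstrap(boolean autoBootstrap,
                                          boolean alreadyBootstrapped) {
        return autoBootstrap && !alreadyBootstrapped;
    }

    public static void main(String[] args) {
        // Fresh, non-seed node with no schema: the original condition bootstraps.
        System.out.println(originalShouldBootstrap(true, false, false, false));

        // Hypothetical case: the node sees non-system tables at startup,
        // so the original condition skips bootstrap.
        System.out.println(originalShouldBootstrap(true, false, false, true));

        // The patched condition ignores the seed and schema checks entirely.
        System.out.println(patchedShouldBootstrap(true, false));
    }
}
```

Under this model, any one of the three inner checks being true is enough to skip bootstrap, which is why dropping two of them changes the outcome.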


Of course, while this technically works, I'd rather not use a hacked build 
every time I need to bootstrap :/

Re: Cannot bootstrap new nodes in 1.0.11 ring - schema issue

Posted by Jason Harvey <al...@gmail.com>.
After a chat with driftx today, I tried wiping out my MigrationInfo on the
ring and doing a rolling restart. I then made a single change to the schema
so that at least one migration would exist. Unfortunately the same error
persists: "Previous version mismatch". Also, the node occasionally
bootstraps without applying any schema on startup. The behaviour is
inconsistent, despite wiping the entire data directory and commitlogs on
the bootstrapping node.

I added some debug statements to Migration.java to find exactly what the 
mismatch was. Here is what I have now:

DEBUG [MigrationStage:1] 2012-09-07 04:44:53,669 Migration.java (line 98) lastversion: ee323110-eedf-11e1-0000-5027269873df  getVersion: 00000000-0000-1000-0000-000000000000
DEBUG [MigrationStage:1] 2012-09-07 04:44:53,669 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2012-09-07 04:44:53,670 DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily from /10.140.129.18
DEBUG [MigrationStage:1] 2012-09-07 04:44:53,670 Migration.java (line 98) lastversion: ee323110-eedf-11e1-0000-5027269873df  getVersion: 00000000-0000-1000-0000-000000000000
DEBUG [MigrationStage:1] 2012-09-07 04:44:53,670 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.

The "Previous version mismatch" event happens when lastVersion !=
getVersion. That is clearly the case here, since getVersion is blank. Don't
all nodes bootstrap with a blank schema version? Why would the Migration
logic expect lastVersion to match the bootstrapping node's getVersion?
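
To make the comparison concrete, here is a minimal sketch of the check as described above, using the two version values from the debug output. It is a simplified model, not the real Migration.java: the class name, method name, and parameter names are hypothetical, and the only assumption is the one stated in this message, namely that a migration is rejected when its lastVersion differs from the local getVersion.

```java
import java.util.UUID;

// Simplified model of the "Previous version mismatch" check.
// All names here are hypothetical; this is not Cassandra source code.
final class SchemaVersionCheckSketch {

    // A migration is treated as applicable only if the version it was built
    // on top of equals the schema version the local node currently reports.
    static boolean canApply(UUID migrationLastVersion, UUID localSchemaVersion) {
        return migrationLastVersion.equals(localSchemaVersion);
    }

    public static void main(String[] args) {
        // Values copied from the debug log lines in this message.
        UUID lastVersion = UUID.fromString("ee323110-eedf-11e1-0000-5027269873df");
        UUID localVersion = UUID.fromString("00000000-0000-1000-0000-000000000000");

        // The versions differ, so the migration is rejected.
        System.out.println(canApply(lastVersion, localVersion)); // prints false

        // It would only be accepted by a node already at lastVersion.
        System.out.println(canApply(lastVersion, lastVersion)); // prints true
    }
}
```

Under this model, a node that starts at the blank version can only accept a migration whose lastVersion is also blank, which matches the rejections seen in the log.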


