Posted to commits@cassandra.apache.org by "Sergio Bossa (JIRA)" <ji...@apache.org> on 2014/02/04 13:38:09 UTC

[jira] [Created] (CASSANDRA-6648) Race condition during node bootstrapping

Sergio Bossa created CASSANDRA-6648:
---------------------------------------

             Summary: Race condition during node bootstrapping
                 Key: CASSANDRA-6648
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6648
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Sergio Bossa
            Priority: Critical


When bootstrapping a new node, data is "missing" as if the new node didn't actually bootstrap, which I tracked down to the following scenario:

1) The new node joins the token ring and waits for the schema to settle before actually bootstrapping.
2) The schema check somehow passes and the node starts bootstrapping.
3) Bootstrapping doesn't find the ks/cf that it should have received from the other node.
4) Queries at this point cause NPEs until they later "recover", but data is still missing.

The problem seems to be caused by a race condition between the migration manager and the bootstrapper, with the former running after the latter.
I think this is supposed to protect against such scenarios:
{noformat}
            while (!MigrationManager.isReadyForBootstrap())
            {
                setMode(Mode.JOINING, "waiting for schema information to complete", true);
                Uninterruptibles.sleepUninterruptibly(1, TimeUnit.SECONDS);
            }
{noformat}

But the MigrationManager.isReadyForBootstrap() implementation is quite fragile and doesn't take "slow" schema propagation into account.
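
To illustrate what a less fragile wait could look like, here is a minimal, self-contained sketch. It is not Cassandra code and not a proposed patch: StableReadiness, awaitStable and their parameters are hypothetical names made up for this example. The idea is only that a single passing check is not trusted; the check has to keep passing for several consecutive polls before bootstrap would proceed, and the wait gives up after a bounded number of polls.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration only; not part of Cassandra. */
public final class StableReadiness {
    private StableReadiness() {}

    /**
     * Polls {@code ready} until it has returned true for {@code requiredStreak}
     * consecutive polls. Returns false if {@code maxPolls} polls elapse first
     * or if the calling thread is interrupted while sleeping.
     */
    public static boolean awaitStable(BooleanSupplier ready,
                                      int requiredStreak,
                                      int maxPolls,
                                      long pollMillis) {
        int streak = 0;
        for (int i = 0; i < maxPolls; i++) {
            // A single "false" resets the streak, so a check that passes
            // once too early does not end the wait.
            streak = ready.getAsBoolean() ? streak + 1 : 0;
            if (streak >= requiredStreak) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

In the loop quoted above, the supplier would be something like () -> MigrationManager.isReadyForBootstrap(), with the streak length and poll interval chosen to cover the expected schema propagation delay. Whether consecutive passes are actually enough depends on what the readiness check itself observes, so this only narrows the window rather than closing it.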



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)