Posted to commits@cassandra.apache.org by "Ryan McGuire (JIRA)" <ji...@apache.org> on 2014/02/05 00:10:12 UTC
[jira] [Issue Comment Deleted] (CASSANDRA-6648) Race condition during node bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan McGuire updated CASSANDRA-6648:
------------------------------------
Comment: was deleted
(was: Test case for when there's a patch:
{code}
# Bring up test cluster:
ccm create bootstrap_bug
ccm populate -n 3
ccm start
ccm node1 stress -n 10000
# Bootstrap a new node:
ccm add -b node4 -t 127.0.0.4:9160 -l 127.0.0.4:7000 -j 7400 --binary-itf 127.0.0.4:9042
ccm node4 start
# Query data from the new node:
ccm node4 cqlsh
cqlsh> select * from "Keyspace1"."Standard1" limit 10;
Bad Request: Keyspace Keyspace1 does not exist
{code})
> Race condition during node bootstrapping
> ----------------------------------------
>
> Key: CASSANDRA-6648
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6648
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Sergio Bossa
> Assignee: Sergio Bossa
> Priority: Critical
> Attachments: 6648-v2.txt, 6648-v3.txt, CASSANDRA-6648.patch
>
>
> When bootstrapping a new node, data is "missing" as if the new node didn't actually bootstrap, which I tracked down to the following scenario:
> 1) The new node joins the token ring and waits for the schema to settle before actually bootstrapping.
> 2) The schema check somehow passes and the node starts bootstrapping.
> 3) Bootstrapping doesn't find the ks/cf that it should have received from the other node.
> 4) Queries at this point cause NPEs until they later "recover", but data is missing.
> The problem seems to be caused by a race condition between the migration manager and the bootstrapper, with the former running after the latter.
> I think this is supposed to protect against such scenarios:
> {noformat}
> while (!MigrationManager.isReadyForBootstrap())
> {
>     setMode(Mode.JOINING, "waiting for schema information to complete", true);
>     Uninterruptibles.sleepUninterruptibly(1, TimeUnit.SECONDS);
> }
> {noformat}
> But the MigrationManager.isReadyForBootstrap() implementation is quite fragile and doesn't take "slow" schema propagation into account.
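
For illustration only, one way a stricter readiness check might look is to wait until the local schema version matches the version reported by every known peer, and to give up after a timeout. This is a rough sketch, not the content of the attached patches and not Cassandra's actual API: the class SchemaAgreementWait, both methods, and the two suppliers (standing in for however the node would read its own and its peers' schema versions) are all hypothetical names.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: none of these names exist in Cassandra.
class SchemaAgreementWait
{
    // True only if a local version exists, at least one peer has reported,
    // and every reported peer version equals the local one.
    static boolean schemaAgrees(UUID local, Map<String, UUID> peerVersions)
    {
        if (local == null || peerVersions == null || peerVersions.isEmpty())
            return false;
        for (UUID peerVersion : peerVersions.values())
        {
            if (!local.equals(peerVersion))
                return false;
        }
        return true;
    }

    // Polls the two suppliers until the versions agree or the timeout elapses.
    // Returns true on agreement, false on timeout or interruption.
    static boolean awaitSchemaAgreement(Supplier<UUID> localVersion,
                                        Supplier<Map<String, UUID>> peerVersions,
                                        long timeoutMillis,
                                        long pollMillis)
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true)
        {
            if (schemaAgrees(localVersion.get(), peerVersions.get()))
                return true;
            if (System.nanoTime() - deadline >= 0)
                return false;
            try
            {
                Thread.sleep(pollMillis);
            }
            catch (InterruptedException e)
            {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Unlike the loop quoted above, this sketch treats "no peer has reported a version yet" as not ready, which is the case that slow schema propagation would hit. Whether a timeout should abort the bootstrap or only log a warning is a separate design choice that the sketch leaves to the caller.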
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)