Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/08/04 01:17:14 UTC

[jira] Commented: (CASSANDRA-195) Improve bootstrap algorithm

    [ https://issues.apache.org/jira/browse/CASSANDRA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738669#action_12738669 ] 

Jonathan Ellis commented on CASSANDRA-195:
------------------------------------------

Some implementation-level details:
 
 - add a "bootstrap mode" to Cassandra startup.  When started in bootstrap mode, the node would wait a minute or two to get the node/token map, then tell the node whose range it is moving into to send over the data.  When that is done, it will start answering requests.  We don't want the node to behave like a normal node at all until then, so we should take the bootstrap command out of nodeprobe.  If it can't complete bootstrap, it should abort.  (Bootstrap by definition requires operator intervention, so this is fair.)
 - the node whose range is being split (call it D) should continue receiving writes for the range in question during this process, and forward them to the bootstrapping node (call it Z)
 - if anticompact splits each existing SSTable in two (removing the old "big" one) and leaves both halves live during this process, we avoid the extra scan of the old SSTable that Cleanup needs later under the old model of copying the to-move data out to a special directory.
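
To make the flow in the first two points concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not Cassandra code: the Node class, the bootstrap() function, the integer keys doubling as tokens, and the "forwarded writes win over the streamed snapshot" rule are all made up for illustration (a real implementation would reconcile by timestamp, and the wait for the token map is reduced here to a non-empty check).

```python
import bisect


class BootstrapAborted(Exception):
    """Raised when the new node cannot complete bootstrap."""


class Node:
    """Hypothetical in-memory node; keys are integers that double as tokens."""

    def __init__(self, name):
        self.name = name
        self.data = {}           # key -> value held locally
        self.serving = True      # whether the node answers requests
        self.forward_to = None   # (target node, range predicate) during a bootstrap

    def write(self, key, value):
        self.data[key] = value
        # While a bootstrap is in flight, pass writes in the moving range on
        # to the bootstrapping node so it does not miss them.
        if self.forward_to is not None:
            target, in_range = self.forward_to
            if in_range(key):
                target.data[key] = value

    def read(self, key):
        if not self.serving:
            raise RuntimeError(f"{self.name} is still bootstrapping")
        return self.data.get(key)

    def start_stream(self, target, in_range):
        """Start forwarding writes and return a snapshot of the keys to send."""
        self.forward_to = (target, in_range)
        return {k: v for k, v in self.data.items() if in_range(k)}

    def finish_stream(self):
        self.forward_to = None


def bootstrap(new_node, new_token, ring, writes_during_stream=()):
    """ring is a sorted list of (token, Node); returns the updated ring."""
    if not ring:
        raise BootstrapAborted("no node/token map received")
    tokens = [t for t, _ in ring]
    if new_token in tokens:
        raise BootstrapAborted("token already taken")

    new_node.serving = False                  # not a normal node until done
    i = bisect.bisect_left(tokens, new_token)
    source = ring[i % len(ring)][1]           # node D: current owner of the range
    prev_token = tokens[i - 1]                # i == 0 wraps to the last token

    def in_range(key):
        # The new node takes over (prev_token, new_token], wrapping if needed.
        if prev_token < new_token:
            return prev_token < key <= new_token
        return key > prev_token or key <= new_token

    snapshot = source.start_stream(new_node, in_range)
    for key, value in writes_during_stream:   # writes that arrive mid-transfer
        source.write(key, value)
    for key, value in snapshot.items():
        new_node.data.setdefault(key, value)  # forwarded writes are newer
    source.finish_stream()

    new_node.serving = True                   # only now start answering requests
    return sorted(ring + [(new_token, new_node)], key=lambda entry: entry[0])
```

For example, with D at token 50 as the only node and Z bootstrapping at token 20, Z takes over the wrapping range (50, 20]; a write to key 10 that lands on D during the transfer is forwarded to Z, while a write to key 45 stays on D only.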

> Improve bootstrap algorithm
> ---------------------------
>
>                 Key: CASSANDRA-195
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-195
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: all
>            Reporter: Sandeep Tata
>             Fix For: 0.5
>
>
> When you add a node to an existing cluster and the map gets updated, the new node may respond to read requests by saying it doesn't have any of the data until it gets the data from the node(s) that previously owned this range (the load-balancing code, when working properly, can take care of this). While this behaviour is compatible with eventual consistency, it would be much friendlier for the new node not to "surface" in the EndPoint maps for reads until it has transferred the data over from the old nodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.