Posted to dev@cassandra.apache.org by Peter Schuller <pe...@infidyne.com> on 2011/01/20 14:05:34 UTC

Clarification on intended bootstrapping semantics

(I considered user@ but I don't want to cause confusion so limiting
myself to dev@)

I would like to clarify/confirm a few things so that I understand
correctly and so that I can update the documentation to be clearer. I
was unsure about some of these things until code inspection. (If I
have totally missed documentation that covers this at this level of
detail, please let me know.)

So to be clear: I'm making statements and asking for confirmation. I'm
not making claims, so don't assume this is true if you don't know.
There is some overlap between some of the points.

(1) Starting from scratch with N nodes, M of which are seed nodes (M
<= N), no initial token and auto_bootstrap false, starting them all up
with clean data directories should be supported without concern for
start-up order.  The cluster is expected to converge. As long as no
writes are happening, and there is no data in the cluster, there is no
problem. There will be no "divergent histories" that prevent gossip
from working.

(2) Specifically, given that one wants for example 2 seeds, there is
no requirement to join the "second" seed as a non-seed *first*, only
to then restart with it as seed after having joined the cluster.

(3) The critical invariant for the operator to maintain with respect
to seed nodes is that no node is ever listed as a seed node in another
node's configuration without said seed node first having joined the
cluster.
(3a) The exception being initial cluster start where there is no data
and no traffic, where it's fine to just start all nodes in arbitrary
order.

(4) It is always fine for a seed node to consider itself a seed even
during initial start-up and joining the ring.

(5) Enabling auto_bootstrap does not just affect the method by which
tokens are selected, but also affects *whether* the bootstrap process
includes streaming data from other nodes prior to becoming up in the
ring (i.e., whether StorageService.bootstrap() is going to be called
in initServer()).

(6) Having a node join a pre-existing cluster with data in it without
auto_bootstrap set to true would cause the node to join the ring
but be void of data, thus potentially violating consistency guarantees
(but recovery is possible by running repair).

(7) A consequence of (5)+(6) is that auto_bootstrap should *always* be
enabled on all nodes in a production cluster, except:
(7a) New nodes being brought in as seeds
(7b) During the very first initial cluster setup with no data

(8) The above is intended and on purpose, and it would be correct to
operate under these assumptions when updating/improving documentation.

-- 
/ Peter Schuller

Re: Clarification on intended bootstrapping semantics

Posted by Jonathan Ellis <jb...@gmail.com>.
On Thu, Jan 20, 2011 at 5:05 AM, Peter Schuller
<pe...@infidyne.com> wrote:
> (1) Starting from scratch with N nodes, M of which are seed nodes (M
> <= N), no initial token and auto_bootstrap false, starting them all up
> with clean data directories should be supported without concern for
> start-up order.  The cluster is expected to converge. As long as no
> writes are happening, and there is no data in the cluster, there is no
> problem. There will be no "divergent histories" that prevent gossip
> from working.

Right.

> (2) Specifically, given that one wants for example 2 seeds, there is
> no requirement to join the "second" seed as a non-seed *first*, only
> to then restart with it as seed after having joined the cluster.

Right.

> (3) The critical invariant for the operator to maintain with respect
> to seed nodes is that no node is ever listed as a seed node in another
> node's configuration without said seed node first having joined the
> cluster.

It's more forgiving than that.  The real critical invariant is that
nodes should not have disjoint sets of seed nodes.  (Which we commonly
interpret as "keep all the seed lists the same.")
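
As a rough illustration of that invariant, here is a minimal Python
sketch that flags pairs of nodes whose seed lists have no seed in
common. The function names and the dict-of-lists input are hypothetical
and not part of Cassandra; an operator would have to collect the seed
lists from each node's configuration by some other means.

```python
from itertools import combinations


def disjoint_seed_pairs(seed_lists):
    """Return pairs of node names whose seed lists share no seed.

    seed_lists maps a (hypothetical) node name to the seeds listed in
    that node's configuration.
    """
    pairs = []
    for a, b in combinations(sorted(seed_lists), 2):
        if set(seed_lists[a]).isdisjoint(seed_lists[b]):
            pairs.append((a, b))
    return pairs


def all_seed_lists_identical(seed_lists):
    """Stricter check: the common "keep all the seed lists the same" rule."""
    distinct = {frozenset(seeds) for seeds in seed_lists.values()}
    return len(distinct) <= 1


if __name__ == "__main__":
    config = {
        "node1": ["10.0.0.1", "10.0.0.2"],
        "node2": ["10.0.0.2"],
        "node3": ["10.0.0.3"],
    }
    print(disjoint_seed_pairs(config))
    print(all_seed_lists_identical(config))
```

An empty result from disjoint_seed_pairs means no two nodes have
disjoint seed sets; all_seed_lists_identical checks the simpler and
stricter convention.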

> (4) It is always fine for a seed node to consider itself a seed even
> during initial start-up and joining the ring.

Yes.

> (5) Enabling auto_bootstrap does not just affect the method by which
> tokens are selected, but also affects *whether* the bootstrap process
> includes streaming data from other nodes prior to becoming up in the
> ring (i.e., whether StorageService.bootstrap() is going to be called
> in initServer()).

Right, in fact, the second part is the primary effect since usually
you should specify initial_token when adding nodes.
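
For illustration only, one common way to pick initial_token values is
to space them evenly around the ring. The sketch below assumes
RandomPartitioner with a token space of 0 to 2**127; that assumption,
and the helper name balanced_tokens, are not stated anywhere in this
thread.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Return evenly spaced tokens, one per node.

    Assumes RandomPartitioner's token space of [0, 2**127); pass a
    different ring_size if that assumption does not hold.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(balanced_tokens(4)):
        print(f"node {index}: initial_token = {token}")
```

When growing an existing cluster, the new node's token would normally
be chosen relative to the tokens already in the ring rather than
recomputed from scratch.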

> (6) Having a node join a pre-existing cluster with data in it without
> auto_bootstrap set to true would cause the node to join the ring
> but be void of data, thus potentially violating consistency guarantees
> (but recovery is possible by running repair).

Right.

> (7) A consequence of (5)+(6) is that auto_bootstrap should *always* be
> enabled on all nodes in a production cluster, except:
> (7a) New nodes being brought in as seeds

No, this will break things as in (6).  The right way to add a new seed
is to first add it as a non-seed, then update the config files to add
it to the seed list later.

> (7b) During the very first initial cluster setup with no data

Yes.

> (8) The above is intended and on purpose, and it would be correct to
> operate under these assumptions when updating/improving documentation.

Yes. :)
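
The answers above can be condensed into a small decision sketch. It is
only a summary of this thread under the stated assumptions, not an
official procedure; the function name join_plan and its parameters are
hypothetical.

```python
def join_plan(cluster_has_data, will_be_seed):
    """Return the ordered steps for bringing up a node, per this thread.

    cluster_has_data: False only for the very first cluster setup,
        with no data and no traffic.
    will_be_seed: whether the node should end up in the seed lists.
    """
    if not cluster_has_data:
        # Initial setup: start-up order does not matter, and seeds may
        # list themselves as seeds from the start.
        return ["start with auto_bootstrap disabled"]

    # Existing data: the node must stream data before serving reads,
    # otherwise it joins the ring empty and needs a repair.
    steps = ["start as a non-seed with auto_bootstrap enabled"]
    if will_be_seed:
        steps.append("after it has joined, add it to the seed list "
                     "in the config files")
    return steps


if __name__ == "__main__":
    for step in join_plan(cluster_has_data=True, will_be_seed=True):
        print("-", step)
```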

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com