Posted to user@zookeeper.apache.org by Brian Tarbox <br...@gmail.com> on 2012/11/06 15:50:22 UTC

question about data replication

I'm working with both Cassandra and ZooKeeper, so please excuse me if this
is a dumb question. Does ZooKeeper allow/require me to specify the
number of copies of data (like Cassandra does), or is it simply the case
that if a majority of nodes are up then ALL of my data is available?

Thanks.  I'm guessing this should be obvious to me but searching the
various docs didn't yield a clear answer.

Thanks again.

Brian Tarbox

-- 
http://about.me/BrianTarbox

Re: question about data replication

Posted by Ted Dunning <te...@gmail.com>.
Implied by the number of nodes.
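
To make that concrete: zoo.cfg has no replication-factor setting. Under the
default majority-quorum configuration, the ensemble size is simply the number
of server.N entries, and every server holds a full copy of the data. A minimal
sketch of a three-node ensemble follows; the hostnames zk1, zk2, zk3 and the
dataDir path are hypothetical, and the remaining values are common
illustrative settings rather than recommendations.

```
# zoo.cfg (illustrative sketch; hostnames and paths are assumptions)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181

# One line per ensemble member. Three entries means N = 3,
# so a majority of 2 servers must acknowledge each write.
# Each server also needs a myid file in dataDir holding its number.
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```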

On Tue, Nov 6, 2012 at 8:56 AM, Brian Tarbox <br...@gmail.com> wrote:

> Ted,
> Thanks. Is it an actual value I set in zoo.cfg or is it just implied by the
> number of nodes in my cluster?
> Sorry for being dense :-)
>
> Brian

Re: question about data replication

Posted by Brian Tarbox <br...@gmail.com>.
Ted,
Thanks. Is it an actual value I set in zoo.cfg or is it just implied by the
number of nodes in my cluster?
Sorry for being dense :-)

Brian


On Tue, Nov 6, 2012 at 11:37 AM, Ted Dunning <te...@gmail.com> wrote:

> You specify the MINIMUM number of copies when you define the number of
> nodes in your ZK cluster.
>
> The idea is that ZK requires strong consistency and provides guarantees to
> that effect.  The only way to provide those guarantees is if a majority of
> the ZK cluster agree to and persist all changes.  That is in strong
> contrast to Cassandra which tries to provide availability instead of
> consistency.
>
> Since ZK requires a majority for every commit, a cluster defined with N
> nodes will require ceiling((N+1)/2) nodes to commit every change.
> Likewise, N is not flexible without some care to make sure that these
> guarantees are maintained.



-- 
http://about.me/BrianTarbox

Re: question about data replication

Posted by Ted Dunning <te...@gmail.com>.
You specify the MINIMUM number of copies when you define the number of
nodes in your ZK cluster.

The idea is that ZK requires strong consistency and provides guarantees to
that effect.  The only way to provide those guarantees is for a majority of
the ZK cluster to agree to and persist every change.  That is in strong
contrast to Cassandra, which tries to provide availability instead of
consistency.

Since ZK requires a majority for every commit, a cluster defined with N
nodes will require ceiling((N+1)/2) nodes to commit every change (2 of 3,
or 3 of 5, for example).  Likewise, N cannot be changed without some care
to make sure that these guarantees are maintained.
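
The arithmetic above can be checked with a short sketch. The function names
quorum_size and tolerated_failures are made up for illustration and are not
part of any ZooKeeper API; the sketch assumes the default majority quorum.

```python
def quorum_size(n: int) -> int:
    """Smallest strict majority of an n-node ensemble.

    Equal to ceiling((n + 1) / 2) for every positive integer n.
    """
    if n < 1:
        raise ValueError("an ensemble needs at least one node")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can be down while a majority can still commit."""
    return n - quorum_size(n)


if __name__ == "__main__":
    # Print quorum size and failure tolerance for small ensembles.
    for n in range(1, 8):
        print(f"N={n}: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

For 3 nodes this gives a quorum of 2, and for 5 nodes a quorum of 3. A
4-node ensemble needs 3 nodes and still tolerates only one failure, which is
why odd ensemble sizes are the usual choice.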

On Tue, Nov 6, 2012 at 6:50 AM, Brian Tarbox <br...@gmail.com> wrote:

> I'm working with both Cassandra and Zookeeper so please excuse me if this
> is a dumb question but does Zookeeper allow/require me to specify the
> number of copies of data (like Cassandra does) or is it simply the case
> that if a majority of nodes are up then ALL of my data is available?
>
> Thanks.  I'm guessing this should be obvious to me but searching the
> various docs didn't yield a clear answer.
>
> Thanks again.
>
> Brian Tarbox
>
> --
> http://about.me/BrianTarbox
>