You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@couchdb.apache.org by Alessio Pace <al...@gmail.com> on 2009/02/04 10:49:14 UTC

Questions about couchDB algorithms

Hi,

I have just discovered couchDB and I'm very interested in knowing more about
the internal details of it, because apart from reading that is multi-master
and that the system is eventually consistent, I don't see much informations
about other various key design things (I apologize if I wasn't able to find
them), like:

- update propagation through gossiping: based on any known algorithm?

- group membership among the various site: how is it done, through
gossiping? If so, based on any known algorithm?

- are sites required to accept incoming connections (-> can I replicate on
nodes behind NAT?)

- how are conflicts detected? Do you use explicit representation of
operations, vector clocks, ... or what?

- do you have datas on how much it scales with respect to number of
databases/documents/sites ? And, does its way of working adapt to changes in
the amount of those factors?


Thanks in advance for any clarification!

Best regards,
Alessio Pace.

Re: Questions about couchDB algorithms

Posted by Ulises <ul...@gmail.com>.
> I have just discovered couchDB and I'm very interested in knowing more about
> the internal details of it, because apart from reading that is multi-master

I don't know about your particular questions, some other folk will
probably answer those better than me, but this is a really good read:
http://horicky.blogspot.com/2008/10/couchdb-implementation.html

HTH,

U

Re: [user] Re: Questions about couchDB algorithms

Posted by Alessio Pace <al...@gmail.com>.
On Wed, Feb 4, 2009 at 3:30 PM, Wout Mertens <wm...@cisco.com> wrote:

> On Feb 4, 2009, at 3:18 PM, Alessio Pace wrote:
>
>  I mean: how does it deal with dynamic networks, where nodes join and
>> leave,
>> and you have to know onto which you can push/pull ? I am obviously talking
>> about cases in which you can't list the a priori list of few cluster
>> machines on text file and copy it on all the machines.
>>
>
> There is nothing like that in place already. There is only the replication
> mechanism and you're responsible for starting it, in both directions if
> needed, tunneling the HTTP if needed, keeping track of the nodes in the
> setup etc.
>
> CouchDB just makes sure that you can replicate no matter what happened
> between replication runs.


Ok, thanks a lot for this explanation.

--
Alessio Pace.


>
>
> Wout.
>

Re: [user] Re: Questions about couchDB algorithms

Posted by Wout Mertens <wm...@cisco.com>.
On Feb 4, 2009, at 3:18 PM, Alessio Pace wrote:

> I mean: how does it deal with dynamic networks, where nodes join and  
> leave,
> and you have to know onto which you can push/pull ? I am obviously  
> talking
> about cases in which you can't list the a priori list of few cluster
> machines on text file and copy it on all the machines.

There is nothing like that in place already. There is only the  
replication mechanism and you're responsible for starting it, in both  
directions if needed, tunneling the HTTP if needed, keeping track of  
the nodes in the setup etc.

CouchDB just makes sure that you can replicate no matter what happened  
between replication runs.

Wout.

Re: Questions about couchDB algorithms

Posted by Paul Davis <pa...@gmail.com>.
Don't forget the link that Ulises posted:

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

There's a nice description of Replication there too.

On Wed, Feb 4, 2009 at 9:46 AM, Alessio Pace <al...@gmail.com> wrote:
> Hi,
>
> On Wed, Feb 4, 2009 at 3:38 PM, Paul Davis <pa...@gmail.com>wrote:
>
>> On Wed, Feb 4, 2009 at 9:18 AM, Alessio Pace <al...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > On Wed, Feb 4, 2009 at 2:59 PM, Paul Davis <paul.joseph.davis@gmail.com
>> >wrote:
>> >
>> >> On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <al...@gmail.com>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > I have just discovered couchDB and I'm very interested in knowing more
>> >> about
>> >> > the internal details of it, because apart from reading that is
>> >> multi-master
>> >> > and that the system is eventually consistent, I don't see much
>> >> informations
>> >> > about other various key design things (I apologize if I wasn't able to
>> >> find
>> >> > them), like:
>> >> >
>> >> > - update propagation through gossiping: based on any known algorithm?
>> >> >
>> >>
>> >> The current state of affairs in terms of keeping multiple nodes in
>> >> sync uses the baked in replication mechanism. At the moment it is up
>> >> to the user to ensure that nodes are kept in sync via these
>> >> facilities.
>> >
>> >
>> > Could you formulate a bit more about this?
>> >
>>
>> Hmm. There doesn't appear to be much in the way of documentation on
>> replication beyond this:
>>
>> http://wiki.apache.org/couchdb/Frequently_asked_questions#how_replication
>
>
> Yes I saw that unfortunately there is not much, I would have liked some
> informations on how replication is done more in detail.
>
>
>>
>>
>> Replication is basically a triggered async mechanism for ensuring that
>> all updates on node A are on node B (assuming replication from A -> B
>> obviously). It's incremental in operation, so repeatedly replicating
>> will only send new changes etc.
>>
>> >
>> >> There is quite a bit of active development on this front
>> >> so it's best to stay tuned and see what comes out.
>> >>
>> >> > - group membership among the various site: how is it done, through
>> >> > gossiping? If so, based on any known algorithm?
>> >> >
>> >>
>> >> Not sure what you mean here.
>> >
>> >
>> > I mean: how does it deal with dynamic networks, where nodes join and
>> leave,
>> > and you have to know onto which you can push/pull ? I am obviously
>> talking
>> > about cases in which you can't list the a priori list of few cluster
>> > machines on text file and copy it on all the machines.
>> >
>>
>> Replication is asynchronous and triggered. There's no constant
>> connections or anything of that nature. If you did pull replication
>> for everything then there'd be no issues other than on rejoining the
>> network a node may require a bit of time to catch up with the current
>> state of things.
>>
>> >
>> >>
>> >>
>> >> > - are sites required to accept incoming connections (-> can I
>> replicate
>> >> on
>> >> > nodes behind NAT?)
>> >> >
>> >>
>> >> Replication is both push and pull. So you just need to be able to have
>> >> your nodes behind NAT know when to trigger replication.
>>
>
>> >
>> >
>> > You mean that if a public node generates an update, it can't be
>> replicated
>> > to a natted target node unless it pulls the replication itself?
>> >
>>
>> Either that or you'll need to setup port forwarding as in all things NAT.
>
>
>
> Yes, unless other less straighforeward techniques are employed.
>
> Thanks.
> Regards,
> --
> Alessio Pace
>

Re: Questions about couchDB algorithms

Posted by Alessio Pace <al...@gmail.com>.
Hi,

On Wed, Feb 4, 2009 at 3:38 PM, Paul Davis <pa...@gmail.com>wrote:

> On Wed, Feb 4, 2009 at 9:18 AM, Alessio Pace <al...@gmail.com>
> wrote:
> > Hi,
> >
> > On Wed, Feb 4, 2009 at 2:59 PM, Paul Davis <paul.joseph.davis@gmail.com
> >wrote:
> >
> >> On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <al...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > I have just discovered couchDB and I'm very interested in knowing more
> >> about
> >> > the internal details of it, because apart from reading that is
> >> multi-master
> >> > and that the system is eventually consistent, I don't see much
> >> informations
> >> > about other various key design things (I apologize if I wasn't able to
> >> find
> >> > them), like:
> >> >
> >> > - update propagation through gossiping: based on any known algorithm?
> >> >
> >>
> >> The current state of affairs in terms of keeping multiple nodes in
> >> sync uses the baked in replication mechanism. At the moment it is up
> >> to the user to ensure that nodes are kept in sync via these
> >> facilities.
> >
> >
> > Could you formulate a bit more about this?
> >
>
> Hmm. There doesn't appear to be much in the way of documentation on
> replication beyond this:
>
> http://wiki.apache.org/couchdb/Frequently_asked_questions#how_replication


Yes I saw that unfortunately there is not much, I would have liked some
informations on how replication is done more in detail.


>
>
> Replication is basically a triggered async mechanism for ensuring that
> all updates on node A are on node B (assuming replication from A -> B
> obviously). It's incremental in operation, so repeatedly replicating
> will only send new changes etc.
>
> >
> >> There is quite a bit of active development on this front
> >> so it's best to stay tuned and see what comes out.
> >>
> >> > - group membership among the various site: how is it done, through
> >> > gossiping? If so, based on any known algorithm?
> >> >
> >>
> >> Not sure what you mean here.
> >
> >
> > I mean: how does it deal with dynamic networks, where nodes join and
> leave,
> > and you have to know onto which you can push/pull ? I am obviously
> talking
> > about cases in which you can't list the a priori list of few cluster
> > machines on text file and copy it on all the machines.
> >
>
> Replication is asynchronous and triggered. There's no constant
> connections or anything of that nature. If you did pull replication
> for everything then there'd be no issues other than on rejoining the
> network a node may require a bit of time to catch up with the current
> state of things.
>
> >
> >>
> >>
> >> > - are sites required to accept incoming connections (-> can I
> replicate
> >> on
> >> > nodes behind NAT?)
> >> >
> >>
> >> Replication is both push and pull. So you just need to be able to have
> >> your nodes behind NAT know when to trigger replication.
>

> >
> >
> > You mean that if a public node generates an update, it can't be
> replicated
> > to a natted target node unless it pulls the replication itself?
> >
>
> Either that or you'll need to setup port forwarding as in all things NAT.



Yes, unless other less straighforeward techniques are employed.

Thanks.
Regards,
--
Alessio Pace

Re: Questions about couchDB algorithms

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Feb 4, 2009 at 9:18 AM, Alessio Pace <al...@gmail.com> wrote:
> Hi,
>
> On Wed, Feb 4, 2009 at 2:59 PM, Paul Davis <pa...@gmail.com>wrote:
>
>> On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <al...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > I have just discovered couchDB and I'm very interested in knowing more
>> about
>> > the internal details of it, because apart from reading that is
>> multi-master
>> > and that the system is eventually consistent, I don't see much
>> informations
>> > about other various key design things (I apologize if I wasn't able to
>> find
>> > them), like:
>> >
>> > - update propagation through gossiping: based on any known algorithm?
>> >
>>
>> The current state of affairs in terms of keeping multiple nodes in
>> sync uses the baked in replication mechanism. At the moment it is up
>> to the user to ensure that nodes are kept in sync via these
>> facilities.
>
>
> Could you formulate a bit more about this?
>

Hmm. There doesn't appear to be much in the way of documentation on
replication beyond this:

http://wiki.apache.org/couchdb/Frequently_asked_questions#how_replication

Replication is basically a triggered async mechanism for ensuring that
all updates on node A are on node B (assuming replication from A -> B
obviously). It's incremental in operation, so repeatedly replicating
will only send new changes etc.

>
>> There is quite a bit of active development on this front
>> so it's best to stay tuned and see what comes out.
>>
>> > - group membership among the various site: how is it done, through
>> > gossiping? If so, based on any known algorithm?
>> >
>>
>> Not sure what you mean here.
>
>
> I mean: how does it deal with dynamic networks, where nodes join and leave,
> and you have to know onto which you can push/pull ? I am obviously talking
> about cases in which you can't list the a priori list of few cluster
> machines on text file and copy it on all the machines.
>

Replication is asynchronous and triggered. There's no constant
connections or anything of that nature. If you did pull replication
for everything then there'd be no issues other than on rejoining the
network a node may require a bit of time to catch up with the current
state of things.

>
>>
>>
>> > - are sites required to accept incoming connections (-> can I replicate
>> on
>> > nodes behind NAT?)
>> >
>>
>> Replication is both push and pull. So you just need to be able to have
>> your nodes behind NAT know when to trigger replication.
>
>
> You mean that if a public node generates an update, it can't be replicated
> to a natted target node unless it pulls the replication itself?
>

Either that or you'll need to setup port forwarding as in all things NAT.

>
> Thank you.
>
> Best regards,
> --
> Alessio Pace
>

HTH,
Paul Davis

Re: Questions about couchDB algorithms

Posted by Alessio Pace <al...@gmail.com>.
Hi,

On Wed, Feb 4, 2009 at 2:59 PM, Paul Davis <pa...@gmail.com>wrote:

> On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <al...@gmail.com>
> wrote:
> > Hi,
> >
> > I have just discovered couchDB and I'm very interested in knowing more
> about
> > the internal details of it, because apart from reading that is
> multi-master
> > and that the system is eventually consistent, I don't see much
> informations
> > about other various key design things (I apologize if I wasn't able to
> find
> > them), like:
> >
> > - update propagation through gossiping: based on any known algorithm?
> >
>
> The current state of affairs in terms of keeping multiple nodes in
> sync uses the baked in replication mechanism. At the moment it is up
> to the user to ensure that nodes are kept in sync via these
> facilities.


Could you formulate a bit more about this?


> There is quite a bit of active development on this front
> so it's best to stay tuned and see what comes out.
>
> > - group membership among the various site: how is it done, through
> > gossiping? If so, based on any known algorithm?
> >
>
> Not sure what you mean here.


I mean: how does it deal with dynamic networks, where nodes join and leave,
and you have to know onto which you can push/pull ? I am obviously talking
about cases in which you can't list the a priori list of few cluster
machines on text file and copy it on all the machines.


>
>
> > - are sites required to accept incoming connections (-> can I replicate
> on
> > nodes behind NAT?)
> >
>
> Replication is both push and pull. So you just need to be able to have
> your nodes behind NAT know when to trigger replication.


You mean that if a public node generates an update, it can't be replicated
to a natted target node unless it pulls the replication itself?


Thank you.

Best regards,
--
Alessio Pace

Re: Questions about couchDB algorithms

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <al...@gmail.com> wrote:
> Hi,
>
> I have just discovered couchDB and I'm very interested in knowing more about
> the internal details of it, because apart from reading that is multi-master
> and that the system is eventually consistent, I don't see much informations
> about other various key design things (I apologize if I wasn't able to find
> them), like:
>
> - update propagation through gossiping: based on any known algorithm?
>

The current state of affairs in terms of keeping multiple nodes in
sync uses the baked in replication mechanism. At the moment it is up
to the user to ensure that nodes are kept in sync via these
facilities. There is quite a bit of active development on this front
so it's best to stay tuned and see what comes out.

> - group membership among the various site: how is it done, through
> gossiping? If so, based on any known algorithm?
>

Not sure what you mean here.

> - are sites required to accept incoming connections (-> can I replicate on
> nodes behind NAT?)
>

Replication is both push and pull. So you just need to be able to have
your nodes behind NAT know when to trigger replication.

> - how are conflicts detected? Do you use explicit representation of
> operations, vector clocks, ... or what?
>

The best reference for the actual details on this would be the code.
IIRC, Damien was doing a bit of work on this recently. The basic idea
is documents with the longest edit history become the 'winning' doc.
Then it's up to your app to resolve conflicts.

> - do you have datas on how much it scales with respect to number of
> databases/documents/sites ? And, does its way of working adapt to changes in
> the amount of those factors?
>

I have yet to see multi-node numbers. So far there are some pretty
impressive numbers floating around for single node setups. I don't
have anything handy but Google and the ML archives should help you out
there.

>
> Thanks in advance for any clarification!
>
> Best regards,
> Alessio Pace.
>

HTH,
Paul Davis