Posted to dev@couchdb.apache.org by Tiago Silva <tm...@gmail.com> on 2010/12/14 22:34:04 UTC

CouchDB partitioning proposal

Hi,

I want to contribute to the CouchDB partitioning proposal (
http://wiki.apache.org/couchdb/Partitioning_proposal) and I would like to
know if anyone can help me find the issues on this topic. Please tell me
which issues are currently being worked on, which are already closed,
and what you suggest I start developing now.


Thank you,
Tiago Silva

P.S. Please reply to all recipients of this message so that this
conversation is easier to filter on the CouchDB dev mailing list.

Re: CouchDB partitioning proposal

Posted by Robert Dionne <di...@dionne-associates.com>.
On Dec 18, 2010, at 8:00 PM, Klaus Trainer wrote:

> Hi guys!
> 
> 
> My two cents:
> 
> 
> If I had a few months to do some research in the area of Distributed
> Programming and CouchDB, I'd take the thread "How fast do CouchDB
> propagate changes to other nodes" on the user mailing list as an
> inspiration (which I've just read).
> 
> For instance, one could do some research about the challenges of having
> updates propagated in soft real time through a system of many loosely
> connected CouchDB nodes that get a lot of updates independently of each
> other. Maybe there's some room for optimizing CouchDB's replication, in
> particular for such scenarios.

Great suggestion. This is a challenging area, even within clusters of CouchDB nodes that aren't loosely coupled. Some information
needs to be maintained globally, i.e. at all nodes, for a healthy cluster, and it needs to be kept in sync. As Klaus mentioned
earlier, BigCouch addresses a lot of the needs of distribution (it puts the C back in CouchDB), and there are areas that need work, e.g. splitting/merging of partitions
dynamically while keeping the cluster up[1]. BigCouch has a well-defined architecture and layered approach that makes exploration
and experimentation easier[1,2,3]. The inter-node communication component[2] was built to be standalone and geared towards use with CouchDB.
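
To make the partition splitting problem a bit more concrete, here is a minimal sketch in Python. It assumes document IDs are hashed with CRC32 into a 32-bit space that is cut into equal ranges, one per partition. The function names (make_ranges, range_for, split_range) are hypothetical and chosen only for illustration; this is not the mem3 implementation, and it leaves out the hard part, which is moving data while the cluster stays up.

```python
import zlib
from typing import List, Tuple

HASH_SPACE = 2 ** 32  # assumption: 32-bit hash space (CRC32 output)
Range = Tuple[int, int]  # inclusive (low, high) bounds of one partition


def make_ranges(q: int) -> List[Range]:
    """Cut the hash space into q contiguous, roughly equal ranges."""
    step = HASH_SPACE // q
    return [
        (i * step, HASH_SPACE - 1 if i == q - 1 else (i + 1) * step - 1)
        for i in range(q)
    ]


def range_for(doc_id: str, ranges: List[Range]) -> Range:
    """Return the partition range that owns the given document ID."""
    h = zlib.crc32(doc_id.encode("utf-8"))
    for low, high in ranges:
        if low <= h <= high:
            return (low, high)
    raise ValueError("ranges do not cover the hash space")


def split_range(ranges: List[Range], target: Range) -> List[Range]:
    """Replace one partition range with its two halves."""
    low, high = target
    mid = (low + high) // 2
    result: List[Range] = []
    for r in ranges:
        if r == target:
            result.extend([(low, mid), (mid + 1, high)])
        else:
            result.append(r)
    return result


if __name__ == "__main__":
    ranges = make_ranges(4)
    owner = range_for("doc-1", ranges)
    print("before split:", owner)
    print("after split: ", range_for("doc-1", split_range(ranges, owner)))
```

After a split, every document still maps to a sub-range of its previous range, so only the documents of the split partition have to be redistributed.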

Cheers,

Bob


[1] https://github.com/cloudant/mem3
[2] https://github.com/cloudant/rexi
[3] https://github.com/cloudant/fabric



> 
> At first, in order to find out about different possible tradeoffs, one
> would have to start comparing and evaluating different concepts.
> 
> For instance, one could find out about how replication things work, e.g.
> in CouchDB and in Riak. In terms of finding common ancestors of subtrees
> and detecting conflicts, there might be even a few things one could
> learn from Git...
> 
> 
> Anyway, you are welcomed to present new ideas! Or if not, some paper
> that gives an in-depth description of an existing feature of CouchDB
> (e.g. replication) would be great as well, as that provided insights for
> people who are not familiar with that particular codebase.
> 
> 
> Cheers,
> Klaus
> 
> 
> On Tue, 2010-12-14 at 22:54 +0000, Iago Abal wrote:
>> Hi all,
>> 
>> Well, to be more specific we are a group of classmates that have decided to
>> work on CouchDB as MSc coursework (Tiago might want to be brief...). We have
>> the task of study CouchDB until February and then, the idea is to spend 4-5
>> months working on a contribution for CouchDB. Our main problem seems to be
>> that wiki stuff is very out-of-date, when we read that CouchDB lacks feature
>> A and we decide to focus in this problem we finally find that it is already
>> solved. We have spent some time learn the very basic about CouchDB but we
>> are having troubles to properly define the project so we would appreciate
>> commentaries about what kind of contribution (related with distributed
>> systems topic) is of the interest of th CouchDB community.
>> 
>> Thanks in advance,
>> 
>> On Tue, Dec 14, 2010 at 10:03 PM, Klaus Trainer <kl...@web.de>wrote:
>> 
>>> Hi Tiago,
>>> 
>>> check out BigCouch: https://github.com/cloudant/bigcouch. Most of it has
>>> been done by the guys at Cloudant. They're building a scalable CouchDB
>>> hosting platform (https://cloudant.com), in which BigCouch is more or
>>> less the core of it. If you've any questions regarding Cloudant or
>>> BigCouch, you maybe can find some help in the #cloudant IRC room at
>>> Freenode.
>>> 
>>> For a (quick) introduction to BigCouch you can check out e.g.:
>>> 
>>> - "Dynamo and CouchDB Clusters":
>>> http://blog.cloudant.com/dynamo-and-couchdb-clusters
>>> - "Scaling Out with BigCouch"—Webcast: http://is.gd/iKLwM
>>> 
>>> Cheers,
>>> Klaus
>>> 
>>> 
>>> On Tue, 2010-12-14 at 21:34 +0000, Tiago Silva wrote:
>>>> Hi,
>>>> 
>>>> I want to contribute on CouchDB partitioning proposal (
>>>> http://wiki.apache.org/couchdb/Partitioning_proposal) and I would like
>>> to
>>>> know if anyone can help me to find the issues on this topic. Please tell
>>> me
>>>> what issues are being developed currently, the ones that are already
>>> closed
>>>> and what you suggest to start to develop now.
>>>> 
>>>> 
>>>> Thank you,
>>>> Tiago Silva
>>>> 
>>>> P.S. Please reply to all recipients of this mail message in order to
>>> better
>>>> filter this conversation in CouchDB dev mailing list.
>>> 
>>> 
>>> 
>> 
>> 
> 
> 


Re: CouchDB partitioning proposal

Posted by Klaus Trainer <kl...@web.de>.
Hi guys!


My two cents:


If I had a few months to do some research in the area of Distributed
Programming and CouchDB, I'd take the thread "How fast do CouchDB
propagate changes to other nodes" on the user mailing list as an
inspiration (which I've just read).

For instance, one could do some research about the challenges of having
updates propagated in soft real time through a system of many loosely
connected CouchDB nodes that get a lot of updates independently of each
other. Maybe there's some room for optimizing CouchDB's replication, in
particular for such scenarios.

At first, in order to find out about different possible tradeoffs, one
would have to start comparing and evaluating different concepts.

For instance, one could find out how replication works, e.g.
in CouchDB and in Riak. In terms of finding common ancestors of subtrees
and detecting conflicts, there might even be a few things one could
learn from Git...
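
As a rough illustration of the common-ancestor idea, here is a minimal sketch in Python. It assumes each replica keeps a document's revision history as a simple root-to-leaf list of revision IDs; the IDs and the function names (common_ancestor, compare) are made up for the example, and this is not how CouchDB, Riak or Git actually store or compare histories.

```python
from typing import List, Optional


def common_ancestor(a: List[str], b: List[str]) -> Optional[str]:
    """Return the last revision shared by two root-to-leaf histories."""
    ancestor = None
    for rev_a, rev_b in zip(a, b):
        if rev_a != rev_b:
            break
        ancestor = rev_a
    return ancestor


def compare(a: List[str], b: List[str]) -> str:
    """Classify how two histories of the same document relate."""
    ancestor = common_ancestor(a, b)
    if ancestor is None:
        return "unrelated"   # no shared revision at all
    if ancestor == a[-1] and ancestor == b[-1]:
        return "same"        # both replicas are at the same revision
    if ancestor == a[-1]:
        return "b_ahead"     # b only extends a, no conflict
    if ancestor == b[-1]:
        return "a_ahead"     # a only extends b, no conflict
    return "conflict"        # both diverged after the common ancestor


if __name__ == "__main__":
    node_a = ["1-aaa", "2-bbb", "3-ccc"]
    node_b = ["1-aaa", "2-bbb", "3-ddd"]
    print(common_ancestor(node_a, node_b))
    print(compare(node_a, node_b))
```

A conflict here simply means that both replicas updated the document independently after the shared revision, which is the situation a replicator has to detect and record.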


Anyway, you are welcome to present new ideas! Or if not, a paper
that gives an in-depth description of an existing feature of CouchDB
(e.g. replication) would be great as well, as it would provide insights for
people who are not familiar with that particular codebase.


Cheers,
Klaus


On Tue, 2010-12-14 at 22:54 +0000, Iago Abal wrote:
> Hi all,
> 
> Well, to be more specific we are a group of classmates that have decided to
> work on CouchDB as MSc coursework (Tiago might want to be brief...). We have
> the task of study CouchDB until February and then, the idea is to spend 4-5
> months working on a contribution for CouchDB. Our main problem seems to be
> that wiki stuff is very out-of-date, when we read that CouchDB lacks feature
> A and we decide to focus in this problem we finally find that it is already
> solved. We have spent some time learn the very basic about CouchDB but we
> are having troubles to properly define the project so we would appreciate
> commentaries about what kind of contribution (related with distributed
> systems topic) is of the interest of th CouchDB community.
> 
> Thanks in advance,
> 
> On Tue, Dec 14, 2010 at 10:03 PM, Klaus Trainer <kl...@web.de>wrote:
> 
> > Hi Tiago,
> >
> > check out BigCouch: https://github.com/cloudant/bigcouch. Most of it has
> > been done by the guys at Cloudant. They're building a scalable CouchDB
> > hosting platform (https://cloudant.com), in which BigCouch is more or
> > less the core of it. If you've any questions regarding Cloudant or
> > BigCouch, you maybe can find some help in the #cloudant IRC room at
> > Freenode.
> >
> > For a (quick) introduction to BigCouch you can check out e.g.:
> >
> > - "Dynamo and CouchDB Clusters":
> > http://blog.cloudant.com/dynamo-and-couchdb-clusters
> > - "Scaling Out with BigCouch"—Webcast: http://is.gd/iKLwM
> >
> > Cheers,
> > Klaus
> >
> >
> > On Tue, 2010-12-14 at 21:34 +0000, Tiago Silva wrote:
> > > Hi,
> > >
> > > I want to contribute on CouchDB partitioning proposal (
> > > http://wiki.apache.org/couchdb/Partitioning_proposal) and I would like
> > to
> > > know if anyone can help me to find the issues on this topic. Please tell
> > me
> > > what issues are being developed currently, the ones that are already
> > closed
> > > and what you suggest to start to develop now.
> > >
> > >
> > > Thank you,
> > > Tiago Silva
> > >
> > > P.S. Please reply to all recipients of this mail message in order to
> > better
> > > filter this conversation in CouchDB dev mailing list.
> >
> >
> >
> 
> 



Re: CouchDB partitioning proposal

Posted by Iago Abal <ia...@gmail.com>.
Hi all,

Well, to be more specific, we are a group of classmates who have decided to
work on CouchDB as MSc coursework (Tiago probably wanted to be brief...). Our
task is to study CouchDB until February, and then the idea is to spend 4-5
months working on a contribution to CouchDB. Our main problem seems to be
that the wiki is very out of date: when we read that CouchDB lacks feature
A and decide to focus on that problem, we eventually find that it is already
solved. We have spent some time learning the very basics of CouchDB, but we
are having trouble properly defining the project, so we would appreciate
comments on what kind of contribution (related to distributed
systems) would be of interest to the CouchDB community.

Thanks in advance,

On Tue, Dec 14, 2010 at 10:03 PM, Klaus Trainer <kl...@web.de>wrote:

> Hi Tiago,
>
> check out BigCouch: https://github.com/cloudant/bigcouch. Most of it has
> been done by the guys at Cloudant. They're building a scalable CouchDB
> hosting platform (https://cloudant.com), in which BigCouch is more or
> less the core of it. If you've any questions regarding Cloudant or
> BigCouch, you maybe can find some help in the #cloudant IRC room at
> Freenode.
>
> For a (quick) introduction to BigCouch you can check out e.g.:
>
> - "Dynamo and CouchDB Clusters":
> http://blog.cloudant.com/dynamo-and-couchdb-clusters
> - "Scaling Out with BigCouch"—Webcast: http://is.gd/iKLwM
>
> Cheers,
> Klaus
>
>
> On Tue, 2010-12-14 at 21:34 +0000, Tiago Silva wrote:
> > Hi,
> >
> > I want to contribute on CouchDB partitioning proposal (
> > http://wiki.apache.org/couchdb/Partitioning_proposal) and I would like
> to
> > know if anyone can help me to find the issues on this topic. Please tell
> me
> > what issues are being developed currently, the ones that are already
> closed
> > and what you suggest to start to develop now.
> >
> >
> > Thank you,
> > Tiago Silva
> >
> > P.S. Please reply to all recipients of this mail message in order to
> better
> > filter this conversation in CouchDB dev mailing list.
>
>
>


-- 
Iago Abal Rivas

Re: CouchDB partitioning proposal

Posted by Klaus Trainer <kl...@web.de>.
Hi Tiago,

check out BigCouch: https://github.com/cloudant/bigcouch. Most of it has
been done by the guys at Cloudant. They're building a scalable CouchDB
hosting platform (https://cloudant.com), of which BigCouch is more or
less the core. If you have any questions regarding Cloudant or
BigCouch, you may be able to find some help in the #cloudant IRC channel on
Freenode.

For a (quick) introduction to BigCouch you can check out e.g.:

- "Dynamo and CouchDB Clusters":
http://blog.cloudant.com/dynamo-and-couchdb-clusters
- "Scaling Out with BigCouch"—Webcast: http://is.gd/iKLwM

Cheers,
Klaus


On Tue, 2010-12-14 at 21:34 +0000, Tiago Silva wrote:
> Hi,
> 
> I want to contribute on CouchDB partitioning proposal (
> http://wiki.apache.org/couchdb/Partitioning_proposal) and I would like to
> know if anyone can help me to find the issues on this topic. Please tell me
> what issues are being developed currently, the ones that are already closed
> and what you suggest to start to develop now.
> 
> 
> Thank you,
> Tiago Silva
> 
> P.S. Please reply to all recipients of this mail message in order to better
> filter this conversation in CouchDB dev mailing list.