Posted to dev@couchdb.apache.org by Kristopher Tate <kr...@bluebridge.jp> on 2008/04/22 12:00:37 UTC

Node Partitioning: Next Steps

Hi all, thank you for your work on this exciting project!

My colleagues and I over at Zooomr are wondering what has been
done with regard to Node Partitioning (mentioned in the FAQ), what
needs to be done, and perhaps how we can contribute to the project.

I understand that partitioning isn't in the immediate roadmap, but if  
someone could give us some direction on what has been done in this  
area, we would love to contribute to this project.

kristopher tate
cto & founder - zooomr.com

Re: Node Partitioning: Next Steps

Posted by Damien Katz <da...@gmail.com>.

On Apr 22, 2008, at 6:00 AM, Kristopher Tate wrote:
> Hi all, thank you for your work on this exciting project!
>
> My colleagues and I over at Zooomr are wondering what has been
> done with regard to Node Partitioning (mentioned in the FAQ), what
> needs to be done, and perhaps how we can contribute to the project.
>
> I understand that partitioning isn't in the immediate roadmap, but  
> if someone could give us some direction on what has been done in  
> this area, we would love to contribute to this project.
>

The current plan is to put partitioning off until post 1.0, but we'd  
be more than happy if you want to help out with something before then.  
Some thoughts:

For clustering and partitioning, ideally CouchDB would make things so
simple that you just add a machine to the pool or remove one, and
everything automatically rebalances and adjusts, giving you linear
scalability, and you never have to think about it except to replace bad
machines. But we'll never reach that ideal; the best we can do is come
asymptotically closer to it.

The thing about partitioning and clustering is that there are so many
different ways to slice it, and so many things to optimize for, that
it's hard to know at this early stage in CouchDB development which
approach to take. Several approaches may even be in order, depending on
what you want to optimize for, so which do we take first?

And at what layer will the clustering and partitioning be built? I had
assumed it would be in Erlang, with the nodes talking to each other
via Erlang messaging. But if we can optimize HTTP enough, it might be
worth doing it on top of HTTP instead, to take advantage of all the
existing HTTP servers, proxies and libraries out there (and to reduce
the custom Erlang code that needs to be written).

So I guess I don't really have answers, but those are my thoughts. If
you have some ideas about this, feel free to let them fly. Questions
too :)


> kristopher tate
> cto & founder - zooomr.com