You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2009/02/27 01:09:06 UTC

[Couchdb Wiki] Trivial Update of "Partitioning proposal" by BenBrowning

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The following page has been changed by BenBrowning:
http://wiki.apache.org/couchdb/Partitioning_proposal

------------------------------------------------------------------------------
Support a tree partition topology that can be as shallow or deep as the user needs. It should be possible to have a flat tree with only one root and many leaf nodes, a binary tree structure, or any variant in-between.

=== Consistent Hashing Algorithm ===
- The mapping of IDs to nodes should use a [http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/ Consistent Hashing Algorithm]. What hasn't been decided on fully (I don't think) is if a proxy node just maps IDs to its direct children of if a proxy node knows how to map IDs directly to a leaf node all the way down the tree. With this type of hashing algorithm, adding or removing a storage node just requires moving data around on its neighbors and not the entire system. Also, node failover (which is out of the scope of this document) becomes easier since you know exactly what data needs to be replicated to which servers to maintain a redundant copy of each node and the failed node's load gets spread among the remaining servers instead of just one.
+ The mapping of IDs to nodes should use a [http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/ Consistent Hashing Algorithm]. What hasn't been decided on fully (I don't think) is if a proxy node just maps IDs to its direct children or if a proxy node knows how to map IDs directly to a leaf node all the way down the tree. With this type of hashing algorithm, adding or removing a storage node just requires moving data around on its neighbors and not the entire system. Also, node failover (which is out of the scope of this document) becomes easier since you know exactly what data needs to be replicated to which servers to maintain a redundant copy of each node and the failed node's load gets spread among the remaining servers instead of just one.

=== Proxy and Storage Nodes ===
Allow a node to be a proxy node, a storage node, or both. Storage nodes contain the data and would typically be the leaf nodes of a tree. Proxy nodes combine results from multiple storage nodes before passing them up the tree (or back to the client). The distinction is entirely in configuration and only exists to simplify the mental model. If a node's ID hash points all requests to other nodes, that node is a proxy node. If a node's ID hash points all requests to itself, it is a storage node. If a node's ID hash points some requests to other nodes and some requests to itself, it is both.