Posted to user@couchdb.apache.org by Stanley Iriele <si...@gmail.com> on 2013/08/03 18:34:48 UTC

distributed map/reduce in BigCouch

Hello,

Let me preface my question with the fact that I saw that BigCouch uses
clustering techniques, like quorum, found in the Dynamo white paper, so I
read about half of it yesterday.

How then does distributed map/reduce work?

If not all nodes have replicas of everything, how does that coordination
happen during view building?
Also, it's sharded, right? So certain nodes have a certain range of keys.

My problem is this: I need a solution that can incrementally scale across
many hard disks. Does BigCouch do this, with views and such? If so, how,
and what are the drawbacks?

Thanks for any kind of response.

Regards,

Stanley

Re: distributed map/reduce in BigCouch

Posted by Adam Kocoloski <ko...@apache.org>.
I'm not a Couchbase expert so take this with a grain of salt.  XDCR in Couchbase is more analogous to running replication between BigCouch clusters than it is to the distributed Erlang goop that happens inside a BigCouch cluster.

I believe one significant difference between XDCR and CouchDB replication is that XDCR has no ability to preserve conflicting versions of documents that show up in the two clusters.  It deterministically chooses one version (i.e., both clusters will choose the same winner) and discards the other.  When faced with this situation a BigCouch cluster will also deterministically choose a "winner", but both edits are preserved and can be presented to clients upon request.
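
As a purely illustrative sketch, here is how a client could fetch those preserved conflict revisions over the normal CouchDB/BigCouch HTTP API (the host, database, and document ID below are made up):

    # Fetch the winning revision of a document plus any conflict revisions.
    import json
    import urllib.request

    BASE = "http://localhost:5984"      # assumed cluster endpoint
    DB, DOC_ID = "mydb", "some-doc"     # hypothetical names

    url = f"{BASE}/{DB}/{DOC_ID}?conflicts=true"
    with urllib.request.urlopen(url) as resp:
        doc = json.load(resp)

    print("winner:", doc["_rev"])
    # "_conflicts" only appears when losing revisions exist.
    for rev in doc.get("_conflicts", []):
        # Each losing revision can still be fetched explicitly by rev id.
        with urllib.request.urlopen(f"{BASE}/{DB}/{DOC_ID}?rev={rev}") as r:
            print("conflict:", json.load(r))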
 
Adam

On Aug 5, 2013, at 9:15 PM, Stanley Iriele <si...@breaktimestudios.com> wrote:

> Hey,
> 
> That was tremendously helpful. I assumed that was how it all worked, but
> in my head it seemed "too good to be true," so I had to really make sure.
> I saw the news about CouchDB merging the BigCouch code into the code base,
> so I take it it is safe for me to start my project using CouchDB 1.3.0 as
> I have been doing and just stand up BigCouch when I need to.
> 
> I have to play with BigCouch a little more. While I am here, what is the
> difference between BigCouch's clustering techniques and Couchbase's XDCR?
> I won't be using Couchbase because it doesn't have the flexibility and
> functionality that CouchDB does, but I was generally curious what the
> difference was.
> 
> Again, many thanks. I can now go full steam ahead with my project.
> 
> Regards,
> 
> Stanley
> 
> 
> 
> 
> On Sun, Aug 4, 2013 at 10:29 AM, Adam Kocoloski <ad...@gmail.com> wrote:
> 
>> The 'q' value is the number of unique shards that comprise the single
>> logical database.  BigCouch will keep 'n' replicas of each shard, so in
>> total a database will have q*n .couch files associated with it.
>> 
>> The number of nodes in the cluster is an independent concern; BigCouch
>> will happily store multiple shards on a single machine or leave some
>> machines in the cluster without any data from a particular clustered
>> database.  The number of nodes must only be greater than or equal to the
>> 'n' value for all databases (BigCouch will not store multiple copies of a
>> shard on the same node in the cluster).
>> 
>> Adam
>> 
>> On Aug 4, 2013, at 4:32 PM, Stanley Iriele <si...@breaktimestudios.com>
>> wrote:
>> 
>>> Thanks, Joan,
>>> 
>>> I worded part of that poorly. When I said hard drives I really meant
>>> physical machines, but the bottleneck was disk space, so I said hard
>>> disks. That completely answered my question, actually; I appreciate
>>> that. What is this 'q' value? Is that the number of nodes, or is that
>>> the r + w > n thing? Either way, that answers my questions. Thank you!
>>> On Aug 4, 2013 7:18 AM, "Joan Touzet" <wo...@apache.org> wrote:
>>> 
>>>> Hi Stanley,
>>>> 
>>>> Let me provide a simplistic explanation, and others can help refine it
>>>> as necessary.
>>>> 
>>>> On Sat, Aug 03, 2013 at 09:34:48AM -0700, Stanley Iriele wrote:
>>>>> How then does distributed map/reduce work?
>>>> 
>>>> Each BigCouch node with a shard of the database also keeps that shard of
>>>> the view. When a request is made for a view, sufficient nodes are
>>>> queried to retrieve the view result, with the reduce operation occurring
>>>> as part of the return.
>>>> 
>>>>> If not all nodes have replicas of everything, how does that
>>>>> coordination happen during view building?
>>>> 
>>>> It is not true that all nodes have replicas of everything. If
>>>> you ask a node for a view on a database it does not have at all, it will
>>>> use information in the partition map to ask that question of a node that
>>>> has at least one shard of the database in question, which will in turn
>>>> complete the scatter/gather request to other nodes participating in that
>>>> database.
>>>> 
>>>>> Also, it's sharded, right? So certain nodes have a certain range of keys.
>>>> 
>>>> Right, see above.
>>>> 
>>>>> My problem is this: I need a solution that can incrementally scale
>>>>> across many hard disks. Does BigCouch do this, with views and such?
>>>>> If so, how, and what are the drawbacks?
>>>> 
>>>> I wouldn't necessarily recommend running 1 BigCouch process per HD you
>>>> have on a single machine, but there's no reason that it wouldn't work.
>>>> 
>>>> The real challenge is that a database's partition map is determined
>>>> statically at the time of database creation. If you choose to add more HDs
>>>> after this creation time, you will have to create a new database with
>>>> more shards, then replicate data to this new database to use the new
>>>> disks. The other option would be to use a very high number for 'q', then
>>>> rebalance the shard map onto the new disks and BigCouch processes. There
>>>> is a StackOverflow answer from Robert Newson that describes the process
>>>> for performing this operation.
>>>> 
>>>> In short, neither is trivial nor automated. For a single-machine system,
>>>> you'd do far better to use some sort of Logical Volume Manager to deal
>>>> with expanding storage over time, such as Linux's LVM, some hardware
>>>> RAID cards, ZFS, or similar features in OS X and Windows.
>>>> 
>>>>> Thanks for any kind of response.
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> Stanley
>>>> 
>>>> --
>>>> Joan Touzet | joant@atypical.net | wohali everywhere else
>>>> 
>> 
>> 


Re: distributed map/reduce in BigCouch

Posted by Jens Alfke <je...@couchbase.com>.
On Aug 5, 2013, at 1:15 PM, Stanley Iriele <si...@breaktimestudios.com> wrote:

> I have to play with BigCouch a little more. While I am here, what is the
> difference between BigCouch's clustering techniques and Couchbase's XDCR?

They’re not the same type of thing. BigCouch clustering is a lot like Couchbase Server’s clustering. (I don’t know enough details of either one to describe the differences, but at a high level they both work the same way, by partitioning the keyspace and giving each node ownership of a subset of the keys.)

XDCR (Cross-Data-Center Replication) is a different protocol used to replicate changes between Couchbase clusters that are not in close contact, for example if you run an application in geographically dispersed data centers. It has a similar role to CouchDB replication, although the actual protocols are completely different, and XDCR is less sophisticated since Couchbase doesn’t track revision histories of documents.
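
As a toy illustration of what "partitioning the keyspace" means (this is not the actual hashing scheme used by BigCouch or Couchbase Server, and the node names are made up):

    # Hash each document ID to one of q partitions; a static map then says
    # which nodes own (hold replicas of) each partition.
    import hashlib

    Q = 8                                   # number of partitions
    NODES = ["node1", "node2", "node3"]     # hypothetical cluster members

    def partition(doc_id):
        digest = hashlib.md5(doc_id.encode()).digest()
        return int.from_bytes(digest[:4], "big") % Q

    def owners(doc_id, n=3):
        # Place n replicas of the partition on n consecutive nodes.
        p = partition(doc_id)
        return [NODES[(p + i) % len(NODES)] for i in range(n)]

    print(partition("user:42"), owners("user:42"))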

—Jens

Re: distributed map/reduce in BigCouch

Posted by Stanley Iriele <si...@breaktimestudios.com>.
Hey,

That was tremendously helpful. I assumed that was how it all worked, but
in my head it seemed "too good to be true," so I had to really make sure.
I saw the news about CouchDB merging the BigCouch code into the code base,
so I take it it is safe for me to start my project using CouchDB 1.3.0 as
I have been doing and just stand up BigCouch when I need to.

I have to play with BigCouch a little more. While I am here, what is the
difference between BigCouch's clustering techniques and Couchbase's XDCR?
I won't be using Couchbase because it doesn't have the flexibility and
functionality that CouchDB does, but I was generally curious what the
difference was.

Again, many thanks. I can now go full steam ahead with my project.

Regards,

Stanley




On Sun, Aug 4, 2013 at 10:29 AM, Adam Kocoloski <ad...@gmail.com> wrote:

> The 'q' value is the number of unique shards that comprise the single
> logical database.  BigCouch will keep 'n' replicas of each shard, so in
> total a database will have q*n .couch files associated with it.
>
> The number of nodes in the cluster is an independent concern; BigCouch
> will happily store multiple shards on a single machine or leave some
> machines in the cluster without any data from a particular clustered
> database.  The number of nodes must only be greater than or equal to the
> 'n' value for all databases (BigCouch will not store multiple copies of a
> shard on the same node in the cluster).
>
> Adam
>
> On Aug 4, 2013, at 4:32 PM, Stanley Iriele <si...@breaktimestudios.com>
> wrote:
>
> > Thanks, Joan,
> >
> > I worded part of that poorly. When I said hard drives I really meant
> > physical machines, but the bottleneck was disk space, so I said hard
> > disks. That completely answered my question, actually; I appreciate
> > that. What is this 'q' value? Is that the number of nodes, or is that
> > the r + w > n thing? Either way, that answers my questions. Thank you!
> > On Aug 4, 2013 7:18 AM, "Joan Touzet" <wo...@apache.org> wrote:
> >
> >> Hi Stanley,
> >>
> >> Let me provide a simplistic explanation, and others can help refine it
> >> as necessary.
> >>
> >> On Sat, Aug 03, 2013 at 09:34:48AM -0700, Stanley Iriele wrote:
> >>> How then does distributed map/reduce work?
> >>
> >> Each BigCouch node with a shard of the database also keeps that shard of
> >> the view. When a request is made for a view, sufficient nodes are
> >> queried to retrieve the view result, with the reduce operation occurring
> >> as part of the return.
> >>
> >>> If not all nodes have replicas of everything, how does that
> >>> coordination happen during view building?
> >>
> >> It is not true that all nodes have replicas of everything. If
> >> you ask a node for a view on a database it does not have at all, it will
> >> use information in the partition map to ask that question of a node that
> >> has at least one shard of the database in question, which will in turn
> >> complete the scatter/gather request to other nodes participating in that
> >> database.
> >>
> >>> Also, it's sharded, right? So certain nodes have a certain range of keys.
> >>
> >> Right, see above.
> >>
> >>> My problem is this: I need a solution that can incrementally scale
> >>> across many hard disks. Does BigCouch do this, with views and such?
> >>> If so, how, and what are the drawbacks?
> >>
> >> I wouldn't necessarily recommend running 1 BigCouch process per HD you
> >> have on a single machine, but there's no reason that it wouldn't work.
> >>
> >> The real challenge is that a database's partition map is determined
> >> statically at the time of database creation. If you choose to add more HDs
> >> after this creation time, you will have to create a new database with
> >> more shards, then replicate data to this new database to use the new
> >> disks. The other option would be to use a very high number for 'q', then
> >> rebalance the shard map onto the new disks and BigCouch processes. There
> >> is a StackOverflow answer from Robert Newson that describes the process
> >> for performing this operation.
> >>
> >> In short, neither is trivial nor automated. For a single-machine system,
> >> you'd do far better to use some sort of Logical Volume Manager to deal
> >> with expanding storage over time, such as Linux's LVM, some hardware
> >> RAID cards, ZFS, or similar features in OS X and Windows.
> >>
> >>> Thanks for any kind of response.
> >>>
> >>> Regards,
> >>>
> >>> Stanley
> >>
> >> --
> >> Joan Touzet | joant@atypical.net | wohali everywhere else
> >>
>
>

Re: distributed map/reduce in BigCouch

Posted by Adam Kocoloski <ad...@gmail.com>.
The 'q' value is the number of unique shards that comprise the single logical database.  BigCouch will keep 'n' replicas of each shard, so in total a database will have q*n .couch files associated with it.

The number of nodes in the cluster is an independent concern; BigCouch will happily store multiple shards on a single machine or leave some machines in the cluster without any data from a particular clustered database.  The number of nodes must only be greater than or equal to the 'n' value for all databases (BigCouch will not store multiple copies of a shard on the same node in the cluster).
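
To make the arithmetic concrete, here is a toy sketch (the node names and placement below are made up, not BigCouch's real on-disk layout):

    # A clustered database has q shards and n replicas of each shard, i.e.
    # q * n .couch files in total, and the n copies of any one shard must
    # live on distinct nodes -- hence the "at least n nodes" requirement.
    q, n = 4, 3
    nodes = ["node1", "node2", "node3", "node4"]    # hypothetical cluster
    assert len(nodes) >= n, "need at least n nodes for n distinct replicas"

    placement = {
        shard: [nodes[(shard + i) % len(nodes)] for i in range(n)]
        for shard in range(q)
    }
    print("total .couch files:", q * n)             # 12 for this database
    for shard, replicas in placement.items():
        print("shard", shard, "->", replicas)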

Adam

On Aug 4, 2013, at 4:32 PM, Stanley Iriele <si...@breaktimestudios.com> wrote:

> Thanks, Joan,
> 
> I worded part of that poorly. When I said hard drives I really meant
> physical machines, but the bottleneck was disk space, so I said hard
> disks. That completely answered my question, actually; I appreciate
> that. What is this 'q' value? Is that the number of nodes, or is that
> the r + w > n thing? Either way, that answers my questions. Thank you!
> On Aug 4, 2013 7:18 AM, "Joan Touzet" <wo...@apache.org> wrote:
> 
>> Hi Stanley,
>> 
>> Let me provide a simplistic explanation, and others can help refine it
>> as necessary.
>> 
>> On Sat, Aug 03, 2013 at 09:34:48AM -0700, Stanley Iriele wrote:
>>> How then does distributed map/reduce work?
>> 
>> Each BigCouch node with a shard of the database also keeps that shard of
>> the view. When a request is made for a view, sufficient nodes are
>> queried to retrieve the view result, with the reduce operation occurring
>> as part of the return.
>> 
>>> If not all nodes have replicas of everything, how does that
>>> coordination happen during view building?
>> 
>> It is not true that all nodes have replicas of everything. If
>> you ask a node for a view on a database it does not have at all, it will
>> use information in the partition map to ask that question of a node that
>> has at least one shard of the database in question, which will in turn
>> complete the scatter/gather request to other nodes participating in that
>> database.
>> 
>>> Also, it's sharded, right? So certain nodes have a certain range of keys.
>> 
>> Right, see above.
>> 
>>> My problem is this: I need a solution that can incrementally scale across
>>> many hard disks. Does BigCouch do this, with views and such? If so, how,
>>> and what are the drawbacks?
>> 
>> I wouldn't necessarily recommend running 1 BigCouch process per HD you
>> have on a single machine, but there's no reason that it wouldn't work.
>> 
>> The real challenge is that a database's partition map is determined
>> statically at the time of database creation. If you choose to add more HDs
>> after this creation time, you will have to create a new database with
>> more shards, then replicate data to this new database to use the new
>> disks. The other option would be to use a very high number for 'q', then
>> rebalance the shard map onto the new disks and BigCouch processes. There
>> is a StackOverflow answer from Robert Newson that describes the process
>> for performing this operation.
>> 
>> In short, neither is trivial nor automated. For a single-machine system,
>> you'd do far better to use some sort of Logical Volume Manager to deal
>> with expanding storage over time, such as Linux's LVM, some hardware
>> RAID cards, ZFS, or similar features in OS X and Windows.
>> 
>>> Thanks for any kind of response.
>>> 
>>> Regards,
>>> 
>>> Stanley
>> 
>> --
>> Joan Touzet | joant@atypical.net | wohali everywhere else
>> 


Re: distributed map/reduce in BigCouch

Posted by Stanley Iriele <si...@breaktimestudios.com>.
Thanks, Joan,

I worded part of that poorly. When I said hard drives I really meant
physical machines, but the bottleneck was disk space, so I said hard
disks. That completely answered my question, actually; I appreciate
that. What is this 'q' value? Is that the number of nodes, or is that
the r + w > n thing? Either way, that answers my questions. Thank you!
On Aug 4, 2013 7:18 AM, "Joan Touzet" <wo...@apache.org> wrote:

> Hi Stanley,
>
> Let me provide a simplistic explanation, and others can help refine it
> as necessary.
>
> On Sat, Aug 03, 2013 at 09:34:48AM -0700, Stanley Iriele wrote:
> > How then does distributed map/reduce work?
>
> Each BigCouch node with a shard of the database also keeps that shard of
> the view. When a request is made for a view, sufficient nodes are
> queried to retrieve the view result, with the reduce operation occurring
> as part of the return.
>
> > If not all nodes have replicas of everything, how does that
> > coordination happen during view building?
>
> It is not true that all nodes have replicas of everything. If
> you ask a node for a view on a database it does not have at all, it will
> use information in the partition map to ask that question of a node that
> has at least one shard of the database in question, which will in turn
> complete the scatter/gather request to other nodes participating in that
> database.
>
> > Also, it's sharded, right? So certain nodes have a certain range of keys.
>
> Right, see above.
>
> > My problem is this: I need a solution that can incrementally scale across
> > many hard disks. Does BigCouch do this, with views and such? If so, how,
> > and what are the drawbacks?
>
> I wouldn't necessarily recommend running 1 BigCouch process per HD you
> have on a single machine, but there's no reason that it wouldn't work.
>
> The real challenge is that a database's partition map is determined
> statically at the time of database creation. If you choose to add more HDs
> after this creation time, you will have to create a new database with
> more shards, then replicate data to this new database to use the new
> disks. The other option would be to use a very high number for 'q', then
> rebalance the shard map onto the new disks and BigCouch processes. There
> is a StackOverflow answer from Robert Newson that describes the process
> for performing this operation.
>
> In short, neither is trivial nor automated. For a single-machine system,
> you'd do far better to use some sort of Logical Volume Manager to deal
> with expanding storage over time, such as Linux's LVM, some hardware
> RAID cards, ZFS, or similar features in OS X and Windows.
>
> > Thanks for any kind of response.
> >
> > Regards,
> >
> > Stanley
>
> --
> Joan Touzet | joant@atypical.net | wohali everywhere else
>

Re: distributed map/reduce in BigCouch

Posted by Joan Touzet <wo...@apache.org>.
Hi Stanley,

Let me provide a simplistic explanation, and others can help refine it
as necessary.

On Sat, Aug 03, 2013 at 09:34:48AM -0700, Stanley Iriele wrote:
> How then does distributed map/reduce work?

Each BigCouch node with a shard of the database also keeps that shard of
the view. When a request is made for a view, sufficient nodes are
queried to retrieve the view result, with the reduce operation occurring
as part of the return.
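
Purely as an illustration (this is not BigCouch's actual code), a toy version of that per-shard map/reduce plus the coordinator's re-reduce might look like this:

    from collections import defaultdict

    # Hypothetical documents, already partitioned across three shards.
    shards = [
        [{"type": "sale", "amount": 10}, {"type": "sale", "amount": 5}],
        [{"type": "sale", "amount": 7}],
        [{"type": "refund", "amount": 3}],
    ]

    def map_fn(doc):            # emit(key, value), as in a CouchDB view
        yield doc["type"], doc["amount"]

    def reduce_fn(values):      # e.g. the equivalent of _sum
        return sum(values)

    def shard_view(docs):
        # Each node maps and reduces only the rows held by its own shard.
        rows = defaultdict(list)
        for doc in docs:
            for key, value in map_fn(doc):
                rows[key].append(value)
        return {key: reduce_fn(vals) for key, vals in rows.items()}

    # Coordinator: gather one partial result per shard, then re-reduce.
    partials = [shard_view(docs) for docs in shards]
    merged = defaultdict(list)
    for partial in partials:
        for key, value in partial.items():
            merged[key].append(value)
    print({key: reduce_fn(vals) for key, vals in merged.items()})
    # -> {'sale': 22, 'refund': 3}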

> If not all nodes have replicas of everything, how does that coordination
> happen during view building?

It is not true that all nodes have replicas of everything. If
you ask a node for a view on a database it does not have at all, it will
use information in the partition map to ask that question of a node that
has at least one shard of the database in question, which will in turn
complete the scatter/gather request to other nodes participating in that
database.
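
A toy sketch of that routing step (the partition map below is made up; BigCouch tracks this metadata itself):

    # For every key range, pick one node that holds a replica of that shard
    # and send it that part of the request; the replies are then gathered.
    PARTITION_MAP = {
        ("00000000", "3fffffff"): ["node1", "node2", "node3"],
        ("40000000", "7fffffff"): ["node2", "node3", "node4"],
        ("80000000", "bfffffff"): ["node3", "node4", "node1"],
        ("c0000000", "ffffffff"): ["node4", "node1", "node2"],
    }

    def plan_scatter(preferred="node1"):
        plan = {}
        for key_range, replicas in PARTITION_MAP.items():
            plan[key_range] = preferred if preferred in replicas else replicas[0]
        return plan

    print(plan_scatter())   # one target node per shard range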

> Also, it's sharded, right? So certain nodes have a certain range of keys.

Right, see above.

> My problem is this: I need a solution that can incrementally scale across
> many hard disks. Does BigCouch do this, with views and such? If so, how,
> and what are the drawbacks?

I wouldn't necessarily recommend running 1 BigCouch process per HD you
have on a single machine, but there's no reason that it wouldn't work.

The real challenge is that a database's partition map is determined
statically at the time of database creation. If you choose to add more HDs
after this creation time, you will have to create a new database with
more shards, then replicate data to this new database to use the new
disks. The other option would be to use a very high number for 'q', then
rebalance the shard map onto the new disks and BigCouch processes. There
is a StackOverflow answer from Robert Newson that describes the process
for performing this operation.
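
A rough sketch of the first option, assuming BigCouch's ?q=/?n= database-creation parameters and the standard /_replicate endpoint (host and database names are made up):

    import json
    import urllib.request

    BASE = "http://localhost:5984"

    def call(method, path, body=None):
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(BASE + path, data=data, method=method,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    # 1. Create the replacement database with a higher shard count.
    print(call("PUT", "/mydb_v2?q=16&n=3"))

    # 2. Copy the data over; switch clients once replication completes.
    print(call("POST", "/_replicate",
               {"source": BASE + "/mydb", "target": BASE + "/mydb_v2"}))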

In short, neither is trivial nor automated. For a single-machine system,
you'd do far better to use some sort of Logical Volume Manager to deal
with expanding storage over time, such as Linux's LVM, some hardware RAID
cards, ZFS, or similar features in OS X and Windows.

> Thanks for any kind of response.
> 
> Regards,
> 
> Stanley

-- 
Joan Touzet | joant@atypical.net | wohali everywhere else