You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@couchdb.apache.org by Christopher O'Connell <jw...@gmail.com> on 2009/11/09 22:41:45 UTC

Concurrent Trees

On Nov 9, 2009, at 14:01, Paul Davis <pa...@gmail.com>  
wrote:
>>
>>>
>>
> In trunk the computation and index writing are split to take up to two
> cores. It'd be theoretically possible to expand the mapping side to
> use all available cores but most reports say that the btree writing
> process is CPU bound, so any more than a single mapper would just be a
> waste of resources.
>
> Unfortunately, parallelizing the btree updates is rather non-trivial.
> I've contemplated a few different methods for attempting it, but
> nothing that has come of my limited self amusement in that respect.
>
> HTH,
> Paul Davis

Dear Paul,

I'm willing to help work on this, however, as you note, truely  
concurrent updates to a tree are hard. I've been playing with subtree  
locking, but 1) it doesn't work right now and 2) the root is still  
temporarily locked, although it does generally reduce bottlenecks  
somewhat.

I've been reading about highly concurrent trees, I'll send some links  
when I get home.

~ Christopher

Re: Concurrent Trees

Posted by Jan Lehnardt <ja...@apache.org>.
On 9 Nov 2009, at 23:59, Paul Davis wrote:

> Moved to dev@ so Jan doesn't yell at me.

GOOD JOB PAUL!

Cheers
Jan
--



> 
> On Mon, Nov 9, 2009 at 4:41 PM, Christopher O'Connell
> <jw...@gmail.com> wrote:
>> On Nov 9, 2009, at 14:01, Paul Davis <pa...@gmail.com> wrote:
>>>> 
>>>>> 
>>>> 
>>> In trunk the computation and index writing are split to take up to two
>>> cores. It'd be theoretically possible to expand the mapping side to
>>> use all available cores but most reports say that the btree writing
>>> process is CPU bound, so any more than a single mapper would just be a
>>> waste of resources.
>>> 
>>> Unfortunately, parallelizing the btree updates is rather non-trivial.
>>> I've contemplated a few different methods for attempting it, but
>>> nothing that has come of my limited self amusement in that respect.
>>> 
>>> HTH,
>>> Paul Davis
>> 
>> Dear Paul,
>> 
>> I'm willing to help work on this, however, as you note, truely concurrent
>> updates to a tree are hard. I've been playing with subtree locking, but 1)
>> it doesn't work right now and 2) the root is still temporarily locked,
>> although it does generally reduce bottlenecks somewhat.
>> 
>> I've been reading about highly concurrent trees, I'll send some links when I
>> get home.
>> 
>> ~ Christopher
>> 
> 
> Cool beans. I sure can't seem to find the paper I was reading last
> that had an implementable sounding algorithm. The hard part with lots
> of these algorithms that they appear to be incompatible with an append
> only update scheme like we've got going on.
> 
> Lately I've been contemplating is having a single PID for the upper
> nodes and writes come in and write leaves independently and then throw
> thew new bits at the btree pid which then accepts anything in its
> mailbox that is compatible with the current state and writes those to
> disk, any thing that's incompatible is rejected, and the writer has to
> retry that write.
> 
> Or something to that effect.
> 
> Paul Davis
> 


Re: Concurrent Trees

Posted by Paul Davis <pa...@gmail.com>.
Moved to dev@ so Jan doesn't yell at me.

On Mon, Nov 9, 2009 at 4:41 PM, Christopher O'Connell
<jw...@gmail.com> wrote:
> On Nov 9, 2009, at 14:01, Paul Davis <pa...@gmail.com> wrote:
>>>
>>>>
>>>
>> In trunk the computation and index writing are split to take up to two
>> cores. It'd be theoretically possible to expand the mapping side to
>> use all available cores but most reports say that the btree writing
>> process is CPU bound, so any more than a single mapper would just be a
>> waste of resources.
>>
>> Unfortunately, parallelizing the btree updates is rather non-trivial.
>> I've contemplated a few different methods for attempting it, but
>> nothing that has come of my limited self amusement in that respect.
>>
>> HTH,
>> Paul Davis
>
> Dear Paul,
>
> I'm willing to help work on this, however, as you note, truely concurrent
> updates to a tree are hard. I've been playing with subtree locking, but 1)
> it doesn't work right now and 2) the root is still temporarily locked,
> although it does generally reduce bottlenecks somewhat.
>
> I've been reading about highly concurrent trees, I'll send some links when I
> get home.
>
> ~ Christopher
>

Cool beans. I sure can't seem to find the paper I was reading last
that had an implementable sounding algorithm. The hard part with lots
of these algorithms that they appear to be incompatible with an append
only update scheme like we've got going on.

Lately I've been contemplating is having a single PID for the upper
nodes and writes come in and write leaves independently and then throw
thew new bits at the btree pid which then accepts anything in its
mailbox that is compatible with the current state and writes those to
disk, any thing that's incompatible is rejected, and the writer has to
retry that write.

Or something to that effect.

Paul Davis