Posted to user@couchdb.apache.org by Elf <el...@gmail.com> on 2009/05/04 13:50:26 UTC

couchdb on smp?

Hello.
I'm using couchdb-0.9 (erlang 13.2) on my linux server with 4 CPUs
(Core 2 Quad).
Every time couchdb needs to reindex views, I see 2 busy processes
in htop: beam and couchjs. Each of them uses < 100% of 1 CPU, and the
sum of their usage is about (but never more than) 100% of 1 CPU (80/20,
70/30 and so on).
Can somebody explain why these 2 different processes (they have
different PIDs) don't use different CPUs (one process per CPU)?


-- 
----------------
Best regards
Elf
mailto:elf2001@gmail.com

Re: couchdb on smp?

Posted by Elf <el...@gmail.com>.
> So if you're not seeing CPUs getting used, try recompiling erlang and
> turning on smp support, etc.  Oh, and I also had to recompile CouchDB
> after upgrading erlang with smp kpoll and sctp.

Oh sh*t, I have all of these flags off...
Thanks, I'll try that.




-- 
----------------
Best regards
Elf
mailto:elf2001@gmail.com

Re: couchdb on smp?

Posted by James Marca <jm...@translab.its.uci.edu>.
Hello,

I too am running couchdb-0.9 on erlang 13.2

I am running Gentoo.  My use flags for erlang are:

dev-lang/erlang-13.2-r1  
USE="emacs java kpoll sctp smp ssl -doc -hipe -odbc -tk -wxwindows"

I didn't turn on high performance erlang because the ebuild said
"you're on your own if you do".  But I turned on lots of other flags
that looked like they would help erlang use all 8 cores on my server
(dual 4-core Intel Xeon CPU E5420 @ 2.50GHz).

So, I have 12 databases (one per month) for storing some data, each
about a gigabyte or so in size.  I set up an identical view in each,
all at the same time, and erlang + couch did a good job using my CPUs.
In top (press 1 to see all CPUs), all 8 CPUs were under load, with
loads ranging from 49 to 61, and the main beam.smp process was running
at 340% CPU (obviously spreading its load across the cores).

So if you're not seeing CPUs getting used, try recompiling erlang and
turning on smp support, etc.  Oh, and I also had to recompile CouchDB
after upgrading erlang with smp kpoll and sctp.  
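
One way to check whether an installed Erlang runtime reports SMP
support before rebuilding anything is sketched below in Python. The
helper name erlang_smp_info is made up for illustration, and the sketch
assumes an erl binary on the PATH. It relies on the standard
erlang:system_info(smp_support) and erlang:system_info(schedulers)
calls; it says nothing about how CouchDB itself was built.

```python
import shutil
import subprocess

# Erlang expression: print SMP support and scheduler count, then exit.
ERL_EXPR = (
    'io:format("~p ~p", [erlang:system_info(smp_support), '
    'erlang:system_info(schedulers)]), halt().'
)


def erlang_smp_info(erl="erl"):
    """Return SMP details for the given erl binary, or None if it is missing.

    Hypothetical helper: the name and return shape are assumptions made
    for this sketch, not part of Erlang or CouchDB.
    """
    if shutil.which(erl) is None:
        return None
    done = subprocess.run(
        [erl, "-noshell", "-eval", ERL_EXPR],
        capture_output=True,
        text=True,
        timeout=30,
        check=True,
    )
    smp, schedulers = done.stdout.split()
    return {"smp_support": smp == "true", "schedulers": int(schedulers)}
```

Calling erlang_smp_info() on a machine like the one described above
would be expected to report SMP support only if the runtime was built
with the smp flag; a runtime without it typically shows up as a plain
beam process rather than beam.smp.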

Cheers
James Marca




Re: couchdb on smp?

Posted by Paul Davis <pa...@gmail.com>.
Also a point I forgot to make explicit is that the erlview engine
doesn't require any JSON serialization in either direction obviously.
So in the end it'll probably be what you want if you're going for pure
speed anyway.

Paul

On Mon, May 4, 2009 at 12:07 PM, Paul Davis <pa...@gmail.com> wrote:
> I'd also throw in that the biggest bottleneck for JS views is probably
> the JSON serialization. I've been asked to try getting eep0018 into a
> copy of trunk as a configuration parameter. I'll get a branch on
> github up by the end of the week to have a reference for that. My gut
> feeling is that once we get that sorted out, we'll probably see that
> disk I/O becomes the bottleneck. I could see maybe eking out some
> extra performance by streaming data through the map servers as opposed
> to the serialized delegation method we're using now, but then it
> becomes a cost/benefit question of added complexity. We'll see how it goes.
>
> Paul Davis

Re: couchdb on smp?

Posted by Paul Davis <pa...@gmail.com>.
On Mon, May 4, 2009 at 11:58 AM, Chris Anderson <jc...@apache.org> wrote:
> On Mon, May 4, 2009 at 7:23 AM, Zachary Zolton <za...@gmail.com> wrote:
>> @janl
>>
>> Perhaps he's asking why there's no activity on the other processor?
>>
>> I think his expectation here is that Map-Reduce would be parallelized.
>> Correct me if I'm wrong, but CouchDB does not yet exploit parallelism
>> in view indexing, right?
>>
>
> Correct. And the JavaScript view server doesn't make it easy to do so
> without starting multiple OS processes per map-function. I can see
> this being nifty down the road, but it's an optimization.
>
> However, interest is growing in an Erlang view server, and
> http://github.com/mmcdanie/erlview points out that the view process
> architecture could use some changes to make Erlang views parallel.
> Those changes will in turn make it more trivial to parallelize JS view
> computation, in the cases that need it. So I can see us getting to
> parallel JS views, but I think the cleaner way to get there is to
> start with proper Erlang views.
>

I'd also throw in that the biggest bottleneck for JS views is probably
the JSON serialization. I've been asked to try getting eep0018 into a
copy of trunk as a configuration parameter. I'll get a branch on
github up by the end of the week to have a reference for that. My gut
feeling is that once we get that sorted out, we'll probably see that
disk I/O becomes the bottleneck. I could see maybe eking out some
extra performance by streaming data through the map servers as opposed
to the serialized delegation method we're using now, but then it
becomes a cost/benefit question of added complexity. We'll see how it goes.
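
To make the serialization cost concrete, here is a rough Python sketch
that runs the same trivial map function twice: once directly, and once
with every document and result pushed through a JSON encode/decode
round trip, as it would be when crossing a process boundary. The
function names and the sample documents are invented for illustration,
and Python's json module is only a stand-in for whatever encoder
CouchDB uses, so the timings are not representative of CouchDB itself.

```python
import json
import time


def map_doc(doc):
    """Toy map function: emit one [key, value] row per document."""
    return [[doc["type"], 1]]


def run_in_process(docs):
    """Call the map function directly, with no serialization."""
    return [map_doc(doc) for doc in docs]


def run_with_json_round_trip(docs):
    """Serialize each document and each result, as an external view
    server would require in both directions."""
    results = []
    for doc in docs:
        wire_in = json.dumps(doc)                # database -> view server
        decoded = json.loads(wire_in)
        wire_out = json.dumps(map_doc(decoded))  # view server -> database
        results.append(json.loads(wire_out))
    return results


def timed(fn, docs):
    """Return (result, elapsed seconds) for one call of fn(docs)."""
    start = time.perf_counter()
    result = fn(docs)
    return result, time.perf_counter() - start


docs = [{"_id": str(i), "type": "post", "body": "x" * 200} for i in range(5000)]
direct, direct_s = timed(run_in_process, docs)
via_json, json_s = timed(run_with_json_round_trip, docs)
print(f"in-process: {direct_s:.4f}s, with JSON round trip: {json_s:.4f}s")
```

Both runs produce the same rows; only the time spent differs, and how
large the gap is depends on document size and the encoder in use.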

Paul Davis


Re: couchdb on smp?

Posted by Chris Anderson <jc...@apache.org>.
On Mon, May 4, 2009 at 7:23 AM, Zachary Zolton <za...@gmail.com> wrote:
> @janl
>
> Perhaps he's asking why there's no activity on the other processor?
>
> I think his expectation here is that Map-Reduce would be parallelized.
> Correct me if I'm wrong, but CouchDB does not yet exploit parallelism
> in view indexing, right?
>

Correct. And the JavaScript view server doesn't make it easy to do so
without starting multiple OS processes per map-function. I can see
this being nifty down the road, but it's an optimization.

However, interest is growing in an Erlang view server, and
http://github.com/mmcdanie/erlview points out that the view process
architecture could use some changes to make Erlang views parallel.
Those changes will in turn make it more trivial to parallelize JS view
computation, in the cases that need it. So I can see us getting to
parallel JS views, but I think the cleaner way to get there is to
start with proper Erlang views.
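
As a rough illustration of what "starting multiple OS processes per
map function" could look like, the Python sketch below splits the
documents into shards and hands each shard to its own worker process.
Everything here is hypothetical: the worker is a tiny Python script
standing in for couchjs, the one-JSON-value-per-line exchange is an
assumption rather than the real view server protocol, and CouchDB 0.9
does not do this.

```python
import json
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a view server process: reads one JSON
# document per line and prints one [key, value] row per document.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    doc = json.loads(line)
    print(json.dumps([doc["type"], 1]))
"""


def map_shard(shard):
    """Run one worker process over one shard and collect its rows."""
    payload = "".join(json.dumps(doc) + "\n" for doc in shard)
    done = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return [json.loads(line) for line in done.stdout.splitlines()]


def parallel_map(docs, workers=4):
    """Map documents using several OS processes, then merge by key."""
    shards = [docs[i::workers] for i in range(workers)]
    # Threads only wait on their child process, so the workers
    # themselves can run on separate CPUs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_shard = pool.map(map_shard, shards)
        rows = [row for shard_rows in per_shard for row in shard_rows]
    # Rows are sorted by key after merging the shards.
    return sorted(rows)
```

The extra work is visible even in this toy: the input has to be
partitioned, every worker pays its own startup and serialization cost,
and the rows have to be merged and re-sorted afterwards.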

> —zdzolton



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: couchdb on smp?

Posted by Zachary Zolton <za...@gmail.com>.
@janl

Perhaps he's asking why there's no activity on the other processor?

I think his expectation here is that Map-Reduce would be parallelized.
Correct me if I'm wrong, but CouchDB does not yet exploit parallelism
in view indexing, right?

—zdzolton

On Mon, May 4, 2009 at 7:46 AM, Jan Lehnardt <ja...@apache.org> wrote:
> Beam and couchjs pipe data back and forth during view generation. While
> one works, the other waits. The scheduler is smart enough to keep the
> processes local to a single CPU. Otherwise it'd be even more expensive.
>
> Cheers
> Jan

Re: couchdb on smp?

Posted by Jan Lehnardt <ja...@apache.org>.
Beam and couchjs pipe data back and forth during view generation.
While one works, the other waits. The scheduler is smart enough to
keep the processes local to a single CPU. Otherwise it'd be even more
expensive.
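
The lock-step exchange described above can be sketched as follows.
This is an illustration in Python, not CouchDB's actual view server
protocol: the worker script stands in for couchjs, the parent loop
stands in for beam, and the line-per-document JSON format and the
function name index_documents are assumptions made for the example.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for couchjs: reads one JSON document per line,
# applies a trivial map function, and writes one JSON reply per line.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    doc = json.loads(line)
    sys.stdout.write(json.dumps([[doc["type"], 1]]) + "\n")
    sys.stdout.flush()
"""


def index_documents(docs):
    """Send each document to the worker and block until it replies."""
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    rows = []
    try:
        for doc in docs:
            worker.stdin.write(json.dumps(doc) + "\n")
            worker.stdin.flush()
            # The parent is idle here while the worker computes, and
            # the worker is idle while the parent prepares the next
            # document, so the two never run at the same time.
            rows.extend(json.loads(worker.stdout.readline()))
    finally:
        worker.stdin.close()
        worker.wait()
    return rows
```

Because each side blocks on the other, the combined CPU usage of the
two processes stays at roughly one core, which matches the 80/20 and
70/30 splits reported in the original question.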

Cheers
Jan
--
