Posted to user@nutch.apache.org by Marek Bachmann <m....@uni-kassel.de> on 2011/06/10 14:41:30 UTC

Using multi cores on local machines

Hello again,

I noticed that the reduce phase only uses one CPU core. This process 
takes a very long time at 100% usage, but only on one core. Is there a 
possibility to parallelise this process across multiple cores on one local 
machine? Could using Hadoop help in some way? I have no experience with 
Hadoop at all. :-/

11/06/10 14:38:21 INFO mapred.JobClient:  map 100% reduce 94%
11/06/10 14:38:23 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:26 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:29 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:32 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:35 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:38 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:41 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:44 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:47 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:50 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:53 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:56 INFO mapred.LocalJobRunner: reduce > reduce
11/06/10 14:38:57 INFO mapred.JobClient:  map 100% reduce 95%


Here is a copy of top's output while running a reduce:

top - 14:30:53 up 12 days, 33 min,  3 users,  load average: 0.81, 0.38, 0.35
Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.1%us,  0.2%sy,  0.0%ni, 74.8%id,  0.0%wa,  0.0%hi,  0.0%si, 
0.0%st
Mem:   8003904k total,  5762520k used,  2241384k free,   120180k buffers
Swap:   418808k total,        4k used,   418804k free,  3713236k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 25835 root      20   0 4371m 1.6g  10m S  101 21.3   5:18.69 java

Thank you

Re: Using multi cores on local machines

Posted by MilleBii <mi...@gmail.com>.
@Julien & all: thanks for the correction.

@Ken, you know what, I just got the book last week and I'm in the process
of reading it. And while I was reading it, I said: oops, my answer is wrong.

You guys corrected it, fine.

I came to that conclusion because I have only ever used a pseudo-distributed
setup or a two-server cluster, and in those cases there is only one reducer.

The book recommends having fewer reducers than nodes for
optimisation reasons.

@Marek,
Although I have tried in the past, I never succeeded in getting more reducers






-- 
-MilleBii-

Re: Using multi cores on local machines

Posted by Ken Krugler <kk...@transpac.com>.
On Jun 10, 2011, at 7:50am, Marek Bachmann wrote:

> Thanks to you all,
> 
> so to get to the point: is it possible to speed up the map/reduce task (whatever it exactly does) on a single quad-core machine, and if so, does anyone know a resource where I can get a little documentation? :-)

Get "Hadoop: The Definitive Guide" by Tom White.

And then set up your machine to run in pseudo-distributed mode.

-- Ken


--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom data mining solutions







Re: Using multi cores on local machines

Posted by Marek Bachmann <m....@uni-kassel.de>.
Thanks to you all,

so to get to the point: is it possible to speed up the map/reduce 
task (whatever it exactly does) on a single quad-core machine, and if 
so, does anyone know a resource where I can get a little documentation? :-)

Thank you once again.

Greetings,

Marek



Re: Using multi cores on local machines

Posted by Julien Nioche <li...@gmail.com>.
Raymond,

Hadoop is using a map/reduce algorithm, the reduce phase is that phase which
> collects the results from // execution.
> It is inherently not possible to parallelize that phase.
>

Sorry to contradict you Raymond but this is incorrect. You can specify the
number of reducers to use e.g.

-D mapred.reduce.tasks=$numTasks

but obviously this will work only in (pseudo-)distributed mode, i.e. with the
various Hadoop services running independently of Nutch









-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

Re: Using multi cores on local machines

Posted by Andrzej Bialecki <ab...@getopt.org>.
On 6/10/11 3:57 PM, MilleBii wrote:
> Hadoop is using a map/reduce algorithm, the reduce phase is that phase which
> collects the results from // execution.
> It is inherently not possible to parallelize that phase.

Actually, this is not true at all - it's perfectly ok to have multiple 
reduce tasks and have them run in parallel.

The only gotcha, and the reason it didn't work in this case, is the 
LocalJobRunner: it is limited to running only one map and one reduce 
task at a time, because it is not meant to be used for anything serious.

In order to have multiple tasks running in parallel you need to use the 
distributed JobTracker/TaskTracker, even if it's just a single node.
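
For anyone following along, here is a minimal sketch of what that might look
like in mapred-site.xml for the Hadoop 0.20.x line that matches the mapred.*
names in these logs. The host, port, and slot counts below are illustrative
assumptions, not recommendations; check your own distribution's defaults.

```xml
<!-- mapred-site.xml: illustrative pseudo-distributed sketch.
     Any value other than "local" for mapred.job.tracker makes Hadoop
     use the distributed JobTracker instead of the LocalJobRunner. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <!-- Allow several map and reduce tasks per TaskTracker,
       e.g. one per core on a quad-core box. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

The HDFS side (core-site.xml and hdfs-site.xml) also needs the usual
pseudo-distributed settings, which the Tom White book covers.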

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Using multi cores on local machines

Posted by Marek Bachmann <m....@uni-kassel.de>.
Thanks for your reply Raymond.

Just for my comprehension: you mean that a >single< reduce phase can't 
be parallelised? So I guess the problem in my case is that there is only 
one map and one reduce process on a local machine?
In other words: in order to process the work with parallel reduce 
processes, it would be necessary to run multiple map processes beforehand.

I think my problem with this topic is that I just don't know what 
exactly happens in the map / reduce phase.
Do you know a good link to get me informed? :)
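
As a rough intuition (a toy sketch in Python, not Nutch or Hadoop code): map
emits key/value pairs, the shuffle groups them by key and routes each key to
exactly one reducer, and since each reducer then works on a disjoint subset of
the keys, several reducers can in principle run at once.

```python
# Toy word count illustrating why the reduce phase CAN be parallelised:
# the shuffle partitions the keys, and each reducer gets a disjoint subset.
from collections import defaultdict

def map_phase(lines):
    # map: emit a (word, 1) pair for every word
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle(pairs, num_reducers):
    # shuffle: route each key to one reducer partition by hashing it
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[hash(key) % num_reducers][key].append(value)
    return partitions

def reduce_phase(partition):
    # reduce: sum the values for each key in this reducer's partition
    return {key: sum(values) for key, values in partition.items()}

lines = ["a b a", "b c"]
# The reduce_phase calls below are independent and could run on separate cores.
results = [reduce_phase(p) for p in shuffle(map_phase(lines), num_reducers=2)]
merged = {k: v for r in results for k, v in r.items()}
print(dict(sorted(merged.items())))  # {'a': 2, 'b': 2, 'c': 1}
```

With the LocalJobRunner there is effectively only one such partition, which is
why only one core gets used.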

Cheers,

Marek



Re: Using multi cores on local machines

Posted by MilleBii <mi...@gmail.com>.
Hadoop is using a map/reduce algorithm; the reduce phase is the phase which
collects the results from // execution.
It is inherently not possible to parallelize that phase.

-Raymond-




-- 
-MilleBii-