Posted to common-user@hadoop.apache.org by Saptarshi Guha <sa...@gmail.com> on 2009/04/14 19:25:20 UTC

Is combiner and map in same JVM?

Hello,
Suppose I have a Hadoop job and have set my combiner to the Reducer class.
Do the map function and the combiner function run in the same JVM in
different threads, or in different JVMs?
I ask because I have to load a native library, and if they are in the same
JVM then the native library is loaded once and I have to take precautions.

Thank you
Saptarshi Guha

Re: Is combiner and map in same JVM?

Posted by Owen O'Malley <om...@apache.org>.
On Apr 14, 2009, at 11:10 AM, Saptarshi Guha wrote:

> Thanks. I am using 0.19, and to confirm, the map and combiner (in  
> the map jvm) are run in *different* threads at the same time?

And the change was actually made in 0.18. So since then, the combiner  
is called 0, 1, or many times on each key in both the mapper and the  
reducer. It is called in a separate thread from the base application
in the map (in the reduce task, the combiner is only used during the
shuffle).
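
Because the combiner may run zero, one, or many times, the job's output should not depend on how often it runs. The following is a minimal sketch of that property using a plain sum, which is associative and commutative. It does not use the Hadoop API; the class name, method names, and chunking scheme are hypothetical and only simulate repeated combining.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombinerSafety {
    // A sum can act as both combiner and reducer because it is
    // associative and commutative.
    static long sum(List<Long> values) {
        long s = 0;
        for (long v : values) {
            s += v;
        }
        return s;
    }

    // Simulates running the combiner `passes` times over chunks of
    // `chunkSize` values (chunkSize >= 1), then running the final reduce.
    static long reduceAfterCombining(List<Long> values, int passes, int chunkSize) {
        List<Long> current = values;
        for (int p = 0; p < passes; p++) {
            List<Long> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += chunkSize) {
                int end = Math.min(i + chunkSize, current.size());
                next.add(sum(current.subList(i, end)));
            }
            current = next;
        }
        return sum(current);
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(3L, 1L, 4L, 1L, 5L, 9L, 2L, 6L);
        // The reduced value should be identical for every number of passes.
        for (int passes = 0; passes <= 3; passes++) {
            System.out.println(passes + " combiner passes -> "
                    + reduceAfterCombining(values, passes, 3));
        }
    }
}
```

An operation such as a plain average would not pass this check unless the combiner carried the sum and the count separately.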

> My native library is not thread safe, so I would have to implement  
> locks. Aaron's email gave me hope (since the map and combiner would  
> then be running sequentially), but this appears to make things  
> complicated.

Yes, you'll probably need locks around your code that isn't thread safe.

-- Owen
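
One way to put locks around non-thread-safe native code is a single static lock that both the map code and the combiner code go through. The sketch below rests on some assumptions: the class and method names are hypothetical, the native call is replaced by a plain Java stand-in so the example runs on its own, and the commented-out library name is made up. It also assumes the map and combiner classes are loaded by the same class loader, since a static field is shared only within one class loader.

```java
import java.util.concurrent.locks.ReentrantLock;

public class GuardedNative {
    // Static, so every thread in this JVM (map thread, combiner thread)
    // contends on the same lock.
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static boolean loaded = false;
    private static int loadCount = 0;
    private static long total = 0; // stand-in for state inside the native library

    // Stand-in for the real, non-thread-safe native method.
    private static long unsafeAdd(long x) {
        total += x;
        return total;
    }

    // All callers go through this wrapper; the library is loaded at most once.
    public static long add(long x) {
        LOCK.lock();
        try {
            if (!loaded) {
                // System.loadLibrary("mynative"); // hypothetical library name
                loadCount++;
                loaded = true;
            }
            return unsafeAdd(x);
        } finally {
            LOCK.unlock();
        }
    }

    public static int loadCount() {
        LOCK.lock();
        try {
            return loadCount;
        } finally {
            LOCK.unlock();
        }
    }

    // Simulates a map thread and a combiner thread calling the library
    // concurrently; returns how much the shared total grew.
    public static long runDemo(int callsPerThread) {
        long before = add(0);
        Runnable work = () -> {
            for (int i = 0; i < callsPerThread; i++) {
                add(1);
            }
        };
        Thread mapThread = new Thread(work, "map");
        Thread combinerThread = new Thread(work, "combiner");
        mapThread.start();
        combinerThread.start();
        try {
            mapThread.join();
            combinerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return add(0) - before;
    }

    public static void main(String[] args) {
        System.out.println("total increase: " + runDemo(100000));
        System.out.println("library loads: " + loadCount());
    }
}
```

Holding the lock for the whole native call serializes the map and combiner threads at that point, which costs some concurrency but keeps the wrapper simple.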



Re: Is combiner and map in same JVM?

Posted by Saptarshi Guha <sa...@gmail.com>.
Thanks. I am using 0.19, and to confirm, the map and combiner (in the map
jvm) are run in *different* threads at the same time?
My native library is not thread safe, so I would have to implement locks.
Aaron's email gave me hope (since the map and combiner would then be running
sequentially), but this appears to make things complicated.


Saptarshi Guha


On Tue, Apr 14, 2009 at 2:01 PM, Owen O'Malley <om...@apache.org> wrote:

> On Apr 14, 2009, at 10:52 AM, Aaron Kimball wrote:
>
>> They're in the same JVM, and I believe in the same thread.
>
> They are the same JVM. They *used* to be the same thread. In either 0.19 or
> 0.20, combiners are also called in the reduce JVM if spills are required.
>
> -- Owen

Re: Is combiner and map in same JVM?

Posted by Owen O'Malley <om...@apache.org>.
On Apr 14, 2009, at 10:52 AM, Aaron Kimball wrote:

> They're in the same JVM, and I believe in the same thread.

They are the same JVM. They *used* to be the same thread. In either  
0.19 or 0.20, combiners are also called in the reduce JVM if spills  
are required.

-- Owen

Re: Is combiner and map in same JVM?

Posted by Aaron Kimball <aa...@cloudera.com>.
They're in the same JVM, and I believe in the same thread.
- Aaron

On Tue, Apr 14, 2009 at 10:25 AM, Saptarshi Guha
<sa...@gmail.com> wrote:

> Hello,
> Suppose I have a Hadoop job and have set my combiner to the Reducer class.
> Do the map function and the combiner function run in the same JVM in
> different threads, or in different JVMs?
> I ask because I have to load a native library, and if they are in the same
> JVM then the native library is loaded once and I have to take precautions.
>
> Thank you
> Saptarshi Guha
>