Posted to mapreduce-user@hadoop.apache.org by Francesco De Luca <f....@gmail.com> on 2011/05/27 16:16:35 UTC

How are the Map and Reduce classes sent to remote nodes by Hadoop?

Does anyone know the mechanism that Hadoop uses to load the Map and Reduce
classes on the remote nodes where the JobTracker submits the tasks?

In particular, how does Hadoop retrieve the .class files?

Thanks

Re: How are the Map and Reduce classes sent to remote nodes by Hadoop?

Posted by Francesco De Luca <f....@gmail.com>.
Thank you Robert,

In fact I didn't specify any job.jar, so I think Hadoop must have some
mechanism to guess the jar.
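
For what it's worth, the usual way to make that guess explicit is to call
setJarByClass() in the driver: the client then searches the classpath for the
jar containing the given class and submits it as job.jar.  A minimal sketch
against the 0.20-era mapreduce API (IdentityDriver is a made-up name; the
identity Mapper/Reducer are used only to keep it self-contained):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IdentityDriver {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "identity");
    // The client searches the classpath for the jar that contains this
    // class and ships that jar to the HDFS staging area as job.jar.
    job.setJarByClass(IdentityDriver.class);
    job.setMapperClass(Mapper.class);    // identity map: pass (k, v) through
    job.setReducerClass(Reducer.class);  // identity reduce
    job.setOutputKeyClass(LongWritable.class);  // TextInputFormat key type
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Without that call (or an explicit JobConf.setJar()), nothing gets shipped,
which typically shows up as a ClassNotFoundException for the Mapper in the
task logs on the remote node.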

2011/5/27 Robert Evans <ev...@yahoo-inc.com>

>  Francesco,
>
> The MapReduce client will create a jar called job.jar and place it in HDFS
> in a staging directory.  This is the jar that you specified in your job
> conf, or I believe that it tries to guess the jar based on the Mapper class
> and the Reducer class, but I am not sure of that.  Once the JobTracker has
> told a TaskTracker to run a given job, the TaskTracker will download the
> jar and then fork off a new JVM to execute the Mapper or Reducer.  If your
> jar has dependencies, then these usually have to be shipped with it as part
> of the cache archive interface.
>
> --Bobby Evans
>
>
> On 5/27/11 9:16 AM, "Francesco De Luca" <f....@gmail.com> wrote:
>
> Does anyone know the mechanism that Hadoop uses to load the Map and Reduce
> classes on the remote nodes
> where the JobTracker submits the tasks?
>
> In particular, how does Hadoop retrieve the .class files?
>
> Thanks
>
>

Re: How are the Map and Reduce classes sent to remote nodes by Hadoop?

Posted by Robert Evans <ev...@yahoo-inc.com>.
Francesco,

The MapReduce client will create a jar called job.jar and place it in HDFS in a staging directory.  This is the jar that you specified in your job conf, or I believe that it tries to guess the jar based on the Mapper class and the Reducer class, but I am not sure of that.  Once the JobTracker has told a TaskTracker to run a given job, the TaskTracker will download the jar and then fork off a new JVM to execute the Mapper or Reducer.  If your jar has dependencies, then these usually have to be shipped with it as part of the cache archive interface.
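
For the dependency side, here is a minimal sketch of wiring extra jars
through the distributed cache before submission (the /libs paths are made up,
and the files are assumed to already sit in HDFS):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class CacheSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Configure the cache before creating the Job, since Job copies the conf.
    // Each TaskTracker downloads the jar to local disk and puts it on the
    // classpath of the forked task JVM.
    DistributedCache.addFileToClassPath(new Path("/libs/mylib.jar"), conf);
    // Archives (zip/tar/tgz) are additionally unpacked on local disk.
    DistributedCache.addCacheArchive(new URI("/libs/extra.zip"), conf);
    Job job = new Job(conf, "job with dependencies");
    // ... the rest of the job setup and submission is unchanged ...
  }
}

The -libjars option of the hadoop jar command, handled by
GenericOptionsParser, ships extra jars in essentially the same way.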

--Bobby Evans

On 5/27/11 9:16 AM, "Francesco De Luca" <f....@gmail.com> wrote:

Does anyone know the mechanism that Hadoop uses to load the Map and Reduce classes on the remote nodes
where the JobTracker submits the tasks?

In particular, how does Hadoop retrieve the .class files?

Thanks