Posted to hdfs-user@hadoop.apache.org by "Y. Dong" <tq...@gmail.com> on 2013/08/20 12:06:55 UTC
MapReduce code location
Hi All,
I'm a MapReduce newbie. What I want to know is this: say I have a mapper class:
public class Map implements Mapper {
    public List A;          // instance field: one per Map object
    public static List B;   // static field: one per JVM
    public Map() { // class constructor
        System.out.println("I'm initializing");
    }
    @Override
    protected void map(………) {
        System.out.println("I'm inside a mapper");
        …….
    }
}
When I run this mapper on a multi-machine Hadoop configuration, will Hadoop instantiate
multiple instances of this class and then transmit them to every remote machine? On a remote
machine, will the map(…) method be able to access List A and List B locally from its own memory?
If yes, and I call System.out.println in the map method, will the printed message be shown only on
the remote machine and not on the machine where I start the whole MapReduce job?
Thanks.
Eason
Re: MapReduce code location
Posted by Kun Ling <lk...@gmail.com>.
Hi Y. Dong,
Here are answers to your questions:
1. Will Hadoop instantiate multiple instances of this class and then transmit
them to every remote machine?
ANSWER: Each TaskTracker in the Hadoop cluster creates its own instance of
your Map class locally; the instances themselves are not transmitted between
machines. Each TaskTracker starts a JVM for the task, and that JVM creates an
object of your Map class and feeds the key-value pairs of your input data to
your map method. Moving the data is handled by other parts of the framework;
for example, the shuffle phase passes the map output to the reduce method.
2. On a remote machine, will the map(…) method be able to access List A and
List B locally from its own memory?
ANSWER: Yes. Because each TaskTracker node has its own Map object, List A and
List B exist only in that node's local memory and are not shared with other nodes.
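
The point about per-object and per-JVM state can be tried out without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: TaskSimulation, SketchMapper, and newTaskInstance are hypothetical names made up for illustration, and the lists a and b stand in for List A and List B. It assumes that the framework ships the class and creates each object locally by reflection, which is what the sketch imitates.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskSimulation {

    /** Hypothetical stand-in for the Map class in the question; uses no Hadoop types. */
    public static class SketchMapper {
        public final List<String> a = new ArrayList<>();        // one list per object, like List A
        public static final List<String> b = new ArrayList<>(); // one list per JVM, like static List B

        public SketchMapper() {
            // Printed by whichever JVM runs this constructor.
            System.out.println("I'm initializing");
        }

        public void map(String record) {
            a.add(record);
            b.add(record);
        }
    }

    /** Builds a fresh object from a class name; no existing object is copied or sent. */
    public static SketchMapper newTaskInstance(String className) {
        try {
            return (SketchMapper) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not create " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = SketchMapper.class.getName();
        SketchMapper first = newTaskInstance(name);   // imagine: task 1
        SketchMapper second = newTaskInstance(name);  // imagine: task 2
        first.map("x");
        second.map("y");
        System.out.println("first.a  = " + first.a);
        System.out.println("second.a = " + second.a);
        System.out.println("static b = " + SketchMapper.b);
    }
}
```

Run in a single JVM, each object's a holds only its own record, while the static b holds both, because static state is shared by all objects in that one JVM. When the two objects live in separate JVMs on separate machines, as in the cluster case discussed above, each JVM has its own copy of b as well, so nothing is shared.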
I hope the above answers help.
yours,
Kun Ling
--
http://www.lingcc.com