Posted to mapreduce-user@hadoop.apache.org by Pedro Costa <ps...@gmail.com> on 2011/02/04 16:06:53 UTC

location awareness on RT tasks?

Hi,

When Hadoop runs on a cluster, the output of the Reducers is saved in
HDFS. Is MapReduce also location-aware about where that output data is
stored?

For example, suppose TT1 is running on Machine1 and TT2 is running on
Machine2, the HDFS replication factor is 3, and the Reduce Task RT1 is
running on TT1. When the reducer saves its output to HDFS, do 2
replicas of the output go to TT1 and the third one to TT2? Is this
what happens?

Thanks,

-- 
Pedro

Re: location awareness on RT tasks?

Posted by Mahadev Konar <ma...@apache.org>.
Hi Pedro,
  You can read about the HDFS placement policy at:

http://hadoop.apache.org/common/docs/r0.20.2/hdfs_design.html
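
For the two-machine example in the question, a rough toy model may help. It is
only a sketch, not Hadoop's actual placement code: the class PlacementSketch and
the method chooseTargets are hypothetical names, and rack selection is ignored
entirely. It models just two properties of HDFS: the first replica of a block
normally goes to the writer's own node when that node also runs a DataNode, and
a single DataNode never holds more than one replica of the same block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Hadoop's real placement implementation.
// All names here are hypothetical and rack awareness is left out.
public class PlacementSketch {

    // Picks the nodes that receive one replica each of a single block.
    static List<String> chooseTargets(String writer, List<String> dataNodes, int replication) {
        List<String> targets = new ArrayList<>();
        // First replica: the writer's own node, if it also runs a DataNode.
        if (replication > 0 && dataNodes.contains(writer)) {
            targets.add(writer);
        }
        // Remaining replicas: other nodes, never the same node twice.
        for (String node : dataNodes) {
            if (targets.size() >= replication) {
                break;
            }
            if (!targets.contains(node)) {
                targets.add(node);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // RT1 writes from Machine1; the cluster has two DataNodes; replication is 3.
        List<String> targets = chooseTargets("Machine1", Arrays.asList("Machine1", "Machine2"), 3);
        System.out.println(targets); // prints [Machine1, Machine2]
    }
}
```

Under these assumptions, Machine1 would not receive two replicas. With only two
DataNodes the block ends up with two replicas, one per machine, and stays
under-replicated until a third DataNode is available.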

thanks
mahadev
