Posted to mapreduce-user@hadoop.apache.org by Pedro Costa <ps...@gmail.com> on 2011/06/15 21:21:42 UTC

define a specific data node from a reduce task.

Hi,

I'm running an MR application whose output is saved in HDFS. My
cluster has 5 slave nodes (and therefore 5 data nodes), and the HDFS
file replication factor is 1. From my application, or from the Hadoop
MR source code, I would like to choose which data node my result is
stored on. For example, with datanode1 through datanode5, I would like
to tell the name node to save the reduce output on a specific data node.

Is it possible?


Thanks,

Re: define a specific data node from a reduce task.

Posted by Denny Ye <de...@gmail.com>.

Each reduce task acts as a DFS client and writes to the specified output
folder. Where the reduce output is placed is decided by the NameNode's
block placement policy. One thing I can confirm: when the reduce task
runs on a cluster slave node, which is also a DataNode, the first replica
of each block it writes can be placed on that local DataNode.
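
To make the placement rule above concrete, here is a minimal sketch in
plain Java. It is a toy model, not Hadoop code: the class
PlacementSketch, the method chooseTarget and the host name "gateway" are
hypothetical names made up for illustration, and the real policy also
weighs rack topology, free space and node load. It only assumes that a
writer running on a DataNode gets its first replica locally, and that any
other writer gets a randomly chosen DataNode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Toy model only: this is NOT Hadoop code and calls no Hadoop API.
class PlacementSketch {

    // Picks the DataNode for the first replica of a block.
    // Assumption: if the writer's host is itself a DataNode, that host
    // is chosen; otherwise a random DataNode is chosen.
    static String chooseTarget(String writerHost, List<String> dataNodes, Random rng) {
        if (dataNodes.isEmpty()) {
            throw new IllegalArgumentException("no DataNodes available");
        }
        if (dataNodes.contains(writerHost)) {
            return writerHost; // writer is co-located with a DataNode
        }
        return dataNodes.get(rng.nextInt(dataNodes.size()));
    }

    public static void main(String[] args) {
        List<String> dataNodes = Arrays.asList(
                "datanode1", "datanode2", "datanode3", "datanode4", "datanode5");
        Random rng = new Random(42);

        // A reduce task running on datanode3 writes its block locally.
        System.out.println(chooseTarget("datanode3", dataNodes, rng));

        // A client outside the cluster gets some DataNode picked for it.
        System.out.println(chooseTarget("gateway", dataNodes, rng));
    }
}
```

Under these assumptions, with a replication factor of 1 the only copy of
each output block would end up on whichever node ran the reduce task, so
the placement follows from where the task is scheduled rather than from a
request made to the NameNode.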

On Thu, Jun 16, 2011 at 3:21 AM, Pedro Costa <ps...@gmail.com> wrote:

> Hi,
>
> I'm running an MR application that produces an output that is saved in
> HDFS. My application has 5 slave nodes (so it has also 5 data nodes).
> The hdfs file replication factor is 1. I want from my application, or
> from the Hadoop MR source code tell which data node my result should
> be. For example, if I've datanode1-5, I would like to say to the name
> node to save the reduce output result data in a specific datanode.
>
> Is it possible?
>
>
> Thanks,
>