Posted to hdfs-user@hadoop.apache.org by Alieh Saeedi <al...@yahoo.com> on 2012/01/31 13:19:53 UTC

replication

As I read in the Hadoop tutorial, Hadoop replicates file blocks by a factor (default 3); in other words, it replicates each block 3 times. Does Hadoop do this for all files? I mean, are files written by reducers replicated too?

Re: replication

Posted by Harsh J <ha...@cloudera.com>.
The replication factor is per-file (and not HDFS-wide), and can be
controlled by setting "dfs.replication" in your job config to the
desired value; that setting then applies to all files the job writes.
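
For example, a minimal driver sketch (using the newer
org.apache.hadoop.mapreduce API; the job name and replication value
here are just placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Set a per-job replication factor before submitting the job;
    // output files written by this job's tasks will get 2 replicas.
    Configuration conf = new Configuration();
    conf.set("dfs.replication", "2");
    Job job = Job.getInstance(conf, "example-job");
    // ... configure mapper, reducer, input and output paths as usual,
    // then submit with job.waitForCompletion(true);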

If you want to propagate this via the config files themselves, place
your chosen "dfs.replication" default value inside conf/hdfs-site.xml.

On Tue, Jan 31, 2012 at 5:49 PM, Alieh Saeedi <al...@yahoo.com> wrote:
> As I read in the Hadoop tutorial, Hadoop replicates file blocks by a factor
> (default 3); in other words, it replicates each block 3 times. Does
> Hadoop do this for all files? I mean, are files written by reducers replicated
> too?

Yes, all files written to HDFS will be replicated, but you can control
the number of replicas as described at the top of this post. Setting the
replication factor to 1 means only a single copy is kept, i.e. no extra
replicas.
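
If a file has already been written with the default factor, its
replication can also be changed afterwards from the HDFS shell, e.g.
(path is just a placeholder):

    hadoop fs -setrep 1 /user/alieh/output/part-r-00000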

-- 
Harsh J
Customer Ops. Engineer, Cloudera