Posted to common-user@hadoop.apache.org by Mike Kendall <mk...@justin.tv> on 2009/11/19 18:58:42 UTC

how do i force replication?

everything online says that replication will be taken care of automatically,
but i've had a file (that i uploaded through the put command on one node)
sitting with a replication of 1 for three days.

Re: how do i force replication?

Posted by Mike Kendall <mk...@justin.tv>.
and setrep is a good tool to add to my arsenal.  thanks.

On Thu, Nov 19, 2009 at 10:28 AM, Michael Thomas <th...@hep.caltech.edu> wrote:

> On 11/19/2009 10:25 AM, Brian Bockelman wrote:
>
>> Hey Mike,
>>
>> 1) What was the initial replication factor requested?  It will always stay
>> at that level until you request a new one.
>> 2) I think to manually change a file's replication it is "hadoop dfsadmin
>> -setrep" or something like that.  Don't trust what I wrote, trust the help
>> output.
>>
>
> To change the replication to 5:
>
> hadoop fs -setrep 5 $filename
>
> Or to change the replication for an entire directory recursively:
>
> hadoop fs -setrep -R 5 $dirname
>
> --Mike
>
>
>> 3) If a file is stuck at 1 replica, it usually means that HDFS is trying
>> to replicate the block, but for some reason, the datanode can't/won't send
>> it to another datanode.  I've found things like network partition,
>> disk-level corruption, or truncation can cause this.
>>
>> Grep the NN logs for the block ID -- you'll quickly be able to determine
>> whether the NN is repeatedly trying to replicate and failing for some
>> reason.  Then, discover what datanode holds the block (or one of the
>> attempted destination nodes) and grep its log for errors.
>>
>> Good luck.
>>
>> Brian
>>
>> On Nov 19, 2009, at 11:58 AM, Mike Kendall wrote:
>>
>>> everything online says that replication will be taken care of automatically,
>>> but i've had a file (that i uploaded through the put command on one node)
>>> sitting with a replication of 1 for three days.
>>>
>>
>>
>
>

Re: how do i force replication?

Posted by Michael Thomas <th...@hep.caltech.edu>.
On 11/19/2009 10:25 AM, Brian Bockelman wrote:
> Hey Mike,
>
> 1) What was the initial replication factor requested?  It will always stay at that level until you request a new one.
> 2) I think to manually change a file's replication it is "hadoop dfsadmin -setrep" or something like that.  Don't trust what I wrote, trust the help output.

To change the replication to 5:

hadoop fs -setrep 5 $filename

Or to change the replication for an entire directory recursively:

hadoop fs -setrep -R 5 $dirname
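
To verify the change took effect, the second column of a plain
listing shows the file's current replication factor:

hadoop fs -ls $filename

I believe 'hadoop fsck $filename' will also report whether the new
replicas have actually been created yet, not just requested.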

--Mike

> 3) If a file is stuck at 1 replica, it usually means that HDFS is trying to replicate the block, but for some reason, the datanode can't/won't send it to another datanode.  I've found things like network partition, disk-level corruption, or truncation can cause this.
>
> Grep the NN logs for the block ID -- you'll quickly be able to determine whether the NN is repeatedly trying to replicate and failing for some reason.  Then, discover what datanode holds the block (or one of the attempted destination nodes) and grep its log for errors.
>
> Good luck.
>
> Brian
>
> On Nov 19, 2009, at 11:58 AM, Mike Kendall wrote:
>
>> everything online says that replication will be taken care of automatically,
>> but i've had a file (that i uploaded through the put command on one node)
>> sitting with a replication of 1 for three days.
>



Re: how do i force replication?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Mike,

1) What was the initial replication factor requested?  It will always stay at that level until you request a new one.
2) I think to manually change a file's replication it is "hadoop dfsadmin -setrep" or something like that.  Don't trust what I wrote, trust the help output.
3) If a file is stuck at 1 replica, it usually means that HDFS is trying to replicate the block, but for some reason, the datanode can't/won't send it to another datanode.  I've found things like network partition, disk-level corruption, or truncation can cause this.

Grep the NN logs for the block ID -- you'll quickly be able to determine whether the NN is repeatedly trying to replicate and failing for some reason.  Then, discover what datanode holds the block (or one of the attempted destination nodes) and grep its log for errors.
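
Something like the following should do it -- fsck prints the block
IDs (blk_...) and which datanodes hold them, and then you can grep
the namenode log for one of those IDs (the path, block ID, and log
file name below are illustrative; they depend on your setup):

hadoop fsck /user/mike/myfile -files -blocks -locations
grep blk_1234567890 $HADOOP_HOME/logs/hadoop-*-namenode-*.log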

Good luck.

Brian

On Nov 19, 2009, at 11:58 AM, Mike Kendall wrote:

> everything online says that replication will be taken care of automatically,
> but i've had a file (that i uploaded through the put command on one node)
> sitting with a replication of 1 for three days.


Re: how do i force replication?

Posted by Stephen Watt <sw...@us.ibm.com>.
If you have upped your replication factor after creating the file, I
believe you can also rebalance the cluster to ensure the replication
factor is consistent across the board by running:

bin/hadoop balancer

You might try running this and see whether it fixes your issue.
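
If I remember right, the balancer also accepts an optional threshold
(a percentage of disk capacity), e.g.:

bin/hadoop balancer -threshold 5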

Kind regards
Steve Watt



From: Boris Shkolnik <bo...@yahoo-inc.com>
To: <co...@hadoop.apache.org>
Date: 11/19/2009 12:19 PM
Subject: Re: how do i force replication?



What is your configured replication level?
(<name>dfs.replication</name> in hdfs-site.xml or hdfs-default.xml)

One can specify replication when creating a file, but if you used put, it
should've taken the one from the configuration.

If it is higher than 1, it should happen automatically.
If it doesn't, look for errors in the logs.

Boris.


On 11/19/09 9:58 AM, "Mike Kendall" <mk...@justin.tv> wrote:

> everything online says that replication will be taken care of automatically,
> but i've had a file (that i uploaded through the put command on one node)
> sitting with a replication of 1 for three days.




Re: how do i force replication?

Posted by Mike Kendall <mk...@justin.tv>.
ohhhhh, i bet the node's replication level overrode the master's...  yeah.

thanks.
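
for anyone who hits this later: iirc you can check what replication a
file actually ended up with like so (the path is just an example):

hadoop fs -stat %r /user/mike/myfile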

On Thu, Nov 19, 2009 at 10:17 AM, Boris Shkolnik <bo...@yahoo-inc.com> wrote:

> What is your configured replication level?
> (<name>dfs.replication</name> in hdfs-site.xml or hdfs-default.xml)
>
> One can specify replication when creating a file, but if you used put, it
> should've taken the one from the configuration.
>
> If it is higher than 1, it should happen automatically.
> If it doesn't, look for errors in the logs.
>
> Boris.
>
>
> On 11/19/09 9:58 AM, "Mike Kendall" <mk...@justin.tv> wrote:
>
> > everything online says that replication will be taken care of automatically,
> > but i've had a file (that i uploaded through the put command on one node)
> > sitting with a replication of 1 for three days.
>
>

Re: how do i force replication?

Posted by Boris Shkolnik <bo...@yahoo-inc.com>.
What is your configured replication level?
(<name>dfs.replication</name> in hdfs-site.xml or hdfs-default.xml)
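
For reference, the entry in hdfs-site.xml looks like this (the
shipped default is 3):

<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>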

One can specify replication when creating a file, but if you used put, it
should've taken the one from the configuration.
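
For example, something like this should override the configured value
at put time (the file names here are just placeholders):

hadoop fs -D dfs.replication=2 -put localfile /user/mike/remotefile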

If it is higher than 1, it should happen automatically.
If it doesn't, look for errors in the logs.

Boris.


On 11/19/09 9:58 AM, "Mike Kendall" <mk...@justin.tv> wrote:

> everything online says that replication will be taken care of automatically,
> but i've had a file (that i uploaded through the put command on one node)
> sitting with a replication of 1 for three days.