Posted to hdfs-user@hadoop.apache.org by Ajit Ratnaparkhi <aj...@gmail.com> on 2012/08/22 09:28:55 UTC

What about over-replicated blocks in HDFS?

Hi,

This is about a case where HDFS has some data blocks which are
over-replicated.

The scenario is as follows:
If one of the datanodes goes down, the Namenode will see some blocks as
under-replicated and will start replicating them to bring their
replication level back to the expected value. If the datanode that was
down then comes back up without any data loss, there will be blocks with
a higher replication level than expected. Does the namenode itself take
care of removing the extra replicas, or do we need to schedule the
balancer for that?

-Ajit

Re: What about over-replicated blocks in HDFS?

Posted by Harsh J <ha...@cloudera.com>.
Ajit,

The NameNode takes care of over-replication situations itself; you
needn't worry about over-replicated blocks or do anything manually.
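
If you want to verify this yourself, fsck reports over- and
under-replicated block counts, so you can watch the over-replicated
count drop back to zero after the NameNode prunes the excess replicas.
A minimal sketch (requires a running cluster; the file path is just a
placeholder):

```shell
# Summary for the whole namespace; the report includes both an
# "Over-replicated blocks" and an "Under-replicated blocks" line.
hdfs fsck / | grep -i 'replicated blocks'

# Per-file view, listing each block and its replica count,
# for an illustrative path.
hdfs fsck /user/data/somefile -files -blocks
```

On older releases the same check is available as `hadoop fsck /`.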

On Wed, Aug 22, 2012 at 12:58 PM, Ajit Ratnaparkhi
<aj...@gmail.com> wrote:
> Hi,
>
> This is about a case where HDFS has some data blocks which are
> over-replicated.
>
> The scenario is as follows:
> If one of the datanodes goes down, the Namenode will see some blocks as
> under-replicated and will start replicating them to bring their
> replication level back to the expected value. If the datanode that was
> down then comes back up without any data loss, there will be blocks with
> a higher replication level than expected. Does the namenode itself take
> care of removing the extra replicas, or do we need to schedule the
> balancer for that?
>
> -Ajit



-- 
Harsh J
