Posted to hdfs-user@hadoop.apache.org by David Ginzburg <gi...@hotmail.com> on 2012/05/02 13:49:41 UTC

Rack decommission

Hi,

I have a cluster with 4 racks.
I am scheduled to replace one of them, including its slave servers.
The cluster is configured with rack awareness and the replication factor is 3.
It is my understanding that on rack-aware clusters, the block placement policy ensures that each block has replicas on at least two racks.
I am concerned about decommissioning a whole rack.

I considered decommissioning each node in the rack separately and letting the cluster rebalance automatically, but I am afraid it will take a lot of time.

If a DN is decommissioned and returns, does the NM update its block metadata? If it does, how does it decide when to update the metadata map?

Is there a suggested way to decommission a whole rack that I may have missed?

RE: Rack decommission

Posted by David Ginzburg <gi...@hotmail.com>.
Thank you,
It has.
I did not expect to see this number > 0, since my cluster is configured with rack awareness.
Should I be concerned about the reliability of the rack awareness? How can I make sure my rack awareness is configured correctly, besides checking and rechecking it?
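
One way to double-check rack awareness, beyond rereading the configuration, is to look at where the replicas of each block actually ended up. The following is only a minimal sketch in Java. It assumes the replica locations have already been collected as rack paths per block (for example from the output of "hadoop fsck / -files -blocks -racks"; the parsing is not shown). The class name, method name, block IDs and rack names are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: flags blocks whose replicas
// span fewer racks than expected.
final class RackSpreadCheck {

    // Returns the IDs of blocks whose replicas sit on fewer than minRacks distinct racks.
    static List<String> blocksBelowRackSpread(Map<String, List<String>> racksByBlock, int minRacks) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : racksByBlock.entrySet()) {
            Set<String> distinctRacks = new HashSet<>(entry.getValue());
            if (distinctRacks.size() < minRacks) {
                offenders.add(entry.getKey());
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Made-up sample data: block ID -> rack of each replica.
        Map<String, List<String>> racksByBlock = new LinkedHashMap<>();
        racksByBlock.put("blk_1", Arrays.asList("/rack1", "/rack2", "/rack2"));
        racksByBlock.put("blk_2", Arrays.asList("/rack3", "/rack3", "/rack3"));
        System.out.println(blocksBelowRackSpread(racksByBlock, 2)); // prints [blk_2]
    }
}
```

If a check like this reports many blocks on a single rack, the topology mapping would be worth a closer look. An empty result does not by itself prove that the configuration is correct.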
  

Date: Fri, 4 May 2012 09:49:41 -0500
Subject: Re: Rack decommission
From: ravihadoop@gmail.com
To: hdfs-user@hadoop.apache.org

Hi Dave,

I'm not entirely certain, but briefly looking at the code:
http://search-hadoop.com/c/Hadoop:/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java%7C%7C%252Bebs+%252Bhdfs
Line 2679

            if ((curReplicas == 0) && (num.decommissionedReplicas() > 0)) {
              decommissionOnlyReplicas++;
            }

If the block replica is available ONLY on the node which is being decommissioned, then it is labeled as "Blocks with No Live Replicas". I would think that this number should go down after the NN replicates these blocks. Is it?


HTH
Ravi.


Re: Rack decommission

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Dave,

I'm not entirely certain, but briefly looking at the code:
http://search-hadoop.com/c/Hadoop:/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java%7C%7C%252Bebs+%252Bhdfs
Line 2679

            if ((curReplicas == 0) && (num.decommissionedReplicas() > 0)) {
              decommissionOnlyReplicas++;
            }

If the block replica is available ONLY on the node which is being
decommissioned, then it is labeled as "Blocks with No Live Replicas". I
would think that this number should go down after the NN replicates these
blocks. Is it?
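
To make the condition concrete, here is a minimal, self-contained sketch of the same check in Java. It is not the actual BlockManager code: the NoLiveReplicaSketch and ReplicaCounts classes and the countDecommissionOnly method are hypothetical stand-ins for Hadoop's internal types, and only the if-condition is taken from the snippet above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the real BlockManager.
final class NoLiveReplicaSketch {

    // Stand-in for the per-block replica tally (called "num" in the snippet above).
    static final class ReplicaCounts {
        private final int live;
        private final int decommissioned;

        ReplicaCounts(int live, int decommissioned) {
            this.live = live;
            this.decommissioned = decommissioned;
        }

        int liveReplicas() { return live; }
        int decommissionedReplicas() { return decommissioned; }
    }

    // Counts blocks that have no live replica but still have at least one
    // replica on a node that is being decommissioned.
    static int countDecommissionOnly(List<ReplicaCounts> blocks) {
        int decommissionOnlyReplicas = 0;
        for (ReplicaCounts num : blocks) {
            int curReplicas = num.liveReplicas();
            if ((curReplicas == 0) && (num.decommissionedReplicas() > 0)) {
                decommissionOnlyReplicas++;
            }
        }
        return decommissionOnlyReplicas;
    }

    public static void main(String[] args) {
        List<ReplicaCounts> blocks = Arrays.asList(
            new ReplicaCounts(2, 1),   // still has live replicas elsewhere
            new ReplicaCounts(0, 1),   // only copy is on a decommissioning node
            new ReplicaCounts(0, 3));  // all copies are on decommissioning nodes
        System.out.println(countDecommissionOnly(blocks)); // prints 2
    }
}
```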

HTH
Ravi.

On Fri, May 4, 2012 at 5:30 AM, David Ginzburg <gi...@hotmail.com> wrote:

>  Thank you,
>
> I am a bit confused about the meaning of the state "Blocks With No Live
> Replicas".
> I could not find any documentation.
> Can you clarify?

RE: Rack decommission

Posted by David Ginzburg <gi...@hotmail.com>.
No, HDFS was 50% free.
In the end, about a third of the 250 TB was decommissioned successfully.

From: Tong.Zhu@rms.com
To: hdfs-user@hadoop.apache.org
Date: Fri, 4 May 2012 09:42:00 -0700
Subject: RE: Rack decommission

Is it possibly because there are no empty blocks available in the cluster to be used for replication?


RE: Rack decommission

Posted by Tong Zhu <To...@rms.com>.
Is it possibly because there are no empty blocks available in the cluster to be used for replication?

From: David Ginzburg [mailto:ginzman@hotmail.com]
Sent: Friday, May 04, 2012 3:30 AM
To: HDFS USER mail list
Subject: RE: Rack decommission

Thank you,

I am a bit confused about the meaning of the state "Blocks With No Live Replicas".
I could not find any documentation.
Can you clarify?


________________________________
This message and any attachments contain information that may be RMS Inc. confidential and/or privileged. If you are not the intended recipient (or authorized to receive for the intended recipient), and have received this message in error, any use, disclosure or distribution is strictly prohibited. If you have received this message in error, please notify the sender immediately by replying to the e-mail and permanently deleting the message from your computer and/or storage system.

RE: Rack decommission

Posted by David Ginzburg <gi...@hotmail.com>.
Thank you,

I am a bit confused about the meaning of the state "Blocks With No Live Replicas".
I could not find any documentation.
Can you clarify?



Date: Thu, 3 May 2012 10:44:07 -0500
Subject: Re: Rack decommission
From: ravihadoop@gmail.com
To: hdfs-user@hadoop.apache.org

Hi David,

> If a DN is decommissioned and returns, does the NM update its block metadata? If it does, how does it decide when to update the metadata map?

I presume you meant "does the NN update its block metadata". The answer is yes. As soon as the NN decides that a node is dead or decommissioned and that blocks have fallen, or will fall, below their replication factor, it will order more replications. This is what is going to happen when you decommission the nodes.

When the DNs come back and report their blocks to the NN, the NN will realise that the blocks are over-replicated and will order deletions.

> When I view the "dfsnodelist.jsp?whatNodes=DECOMMISSIONING" page, there are many slaves with "Blocks With No Live Replicas"

That obviously depends on how loaded your HDFS was.


Re: Rack decommission

Posted by Ravi Prakash <ra...@gmail.com>.
Hi David,

> If a DN is decommissioned and returns, does the NM update its block
> metadata? If it does, how does it decide when to update the metadata map?

I presume you meant "does the NN update its block metadata". The answer is
yes. As soon as the NN decides that a node is dead or decommissioned and that
blocks have fallen, or will fall, below their replication factor, it will
order more replications. This is what is going to happen when you
decommission the nodes.

When the DNs come back and report their blocks to the NN, the NN will
realise that the blocks are over-replicated and will order deletions.
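
As a rough illustration of that decision, here is a minimal sketch in Java. It is not Hadoop code: the ReplicationDecision class, its Action enum and both method names are hypothetical, and it looks only at the number of live replicas versus the configured replication factor.

```java
// Hypothetical sketch of the decision described above; not Hadoop code.
final class ReplicationDecision {

    enum Action { REPLICATE, DELETE_EXCESS, NONE }

    // Compares the live replica count with the target replication factor.
    static Action actionFor(int liveReplicas, int replicationFactor) {
        if (liveReplicas < replicationFactor) {
            return Action.REPLICATE;      // under-replicated: order more copies
        }
        if (liveReplicas > replicationFactor) {
            return Action.DELETE_EXCESS;  // over-replicated: order deletions
        }
        return Action.NONE;
    }

    // Number of replicas to add or remove to get back to the target.
    static int replicaDelta(int liveReplicas, int replicationFactor) {
        return Math.abs(replicationFactor - liveReplicas);
    }

    public static void main(String[] args) {
        int replicationFactor = 3;
        // One of three replicas sat on a decommissioned node: 2 live replicas remain.
        System.out.println(actionFor(2, replicationFactor) + " " + replicaDelta(2, replicationFactor)); // prints REPLICATE 1
        // The node returns after re-replication finished: 4 replicas are reported.
        System.out.println(actionFor(4, replicationFactor) + " " + replicaDelta(4, replicationFactor)); // prints DELETE_EXCESS 1
    }
}
```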

> When I view the "dfsnodelist.jsp?whatNodes=DECOMMISSIONING" page, there
> are many slaves with "Blocks With No Live Replicas"

That obviously depends on how loaded your HDFS was.

On Thu, May 3, 2012 at 10:04 AM, David Ginzburg <gi...@hotmail.com> wrote:

>  Hi,
> I have begun decommissioning the servers in the rack.
> When I view the "dfsnodelist.jsp?whatNodes=DECOMMISSIONING" page, there
> are many slaves with "Blocks With No Live Replicas".
> Is that possible, even though I have rack awareness configured?

RE: Rack decommission

Posted by David Ginzburg <gi...@hotmail.com>.
Hi,
I have begun decommissioning the servers in the rack.
When I view the "dfsnodelist.jsp?whatNodes=DECOMMISSIONING" page, there are many slaves with "Blocks With No Live Replicas".
Is that possible, even though I have rack awareness configured?



From: ginzman@hotmail.com
To: hdfs-user@hadoop.apache.org; common-user@hadoop.apache.org
Subject: Rack decommission
Date: Wed, 2 May 2012 11:49:41 +0000

Hi,

I have a cluster with 4 racks.
I am scheduled to replace one of them, including its slave servers.
The cluster is configured with rack awareness and the replication factor is 3.
It is my understanding that on rack-aware clusters, the block placement policy ensures that each block has replicas on at least two racks.
I am concerned about decommissioning a whole rack.

I considered decommissioning each node in the rack separately and letting the cluster rebalance automatically, but I am afraid it will take a lot of time.

If a DN is decommissioned and returns, does the NM update its block metadata? If it does, how does it decide when to update the metadata map?

Is there a suggested way to decommission a whole rack that I may have missed?