
Posted to common-user@hadoop.apache.org by He Chen <ai...@gmail.com> on 2010/08/06 17:35:59 UTC

Re: Best way to reduce a 8-node cluster in half and get hdfs to come out of safe mode

Way #3:

1) bring up all 8 datanodes and the namenode
2) retire one of the 4 nodes you want to remove:
           kill the datanode process
           hadoop dfsadmin -refreshNodes  (run this on the namenode)
3) repeat step 2 three more times

On Fri, Aug 6, 2010 at 1:21 AM, Allen Wittenauer <aw...@linkedin.com> wrote:

>
> On Aug 5, 2010, at 10:42 PM, Steve Kuo wrote:
>
> > As part of our experimentation, the plan is to pull 4 slave nodes out of
> > an 8-slave/1-master cluster. With the replication factor set to 3, I thought
> > losing half of the cluster might be too much for hdfs to recover from.  Thus I
> > copied out all relevant data from hdfs to local disk and reconfigured the
> > cluster.
>
> It depends.  If you have configured Hadoop to have a topology such that the
> 8 nodes were in 2 logical racks, then it would have worked just fine.  If
> you didn't have any topology configured, then each node is considered its
> own rack.  So pulling half of the grid down means you are likely losing a
> good chunk of all your blocks.
>
> >
> > The 4 slave nodes started okay but hdfs never left safe mode.  The nn.log
> > has the following line.  What is the best way to deal with this?  Shall I
> > restart the cluster with all 8 nodes and then delete
> > /data/hadoop-hadoop/mapred/system?  Or shall I reformat hdfs?
>
> Two ways to go:
>
> Way #1:
>
> 1) configure dfs.hosts
> 2) bring up all 8 nodes
> 3) configure dfs.hosts.exclude to include the 4 you don't want
> 4) dfsadmin -refreshNodes to start decommissioning the 4 you don't want
>
> Way #2:
>
> 1) configure a topology
> 2) bring up all 8 nodes
> 3) setrep all files +1
> 4) wait for nn to finish replication
> 5) pull 4 nodes
> 6) bring down nn
> 7) remove topology
> 8) bring nn up
> 9) setrep -1
>


-- 
Best Wishes!
Best regards!

--
Chen He
(402)613-9298
PhD. student of CSE Dept.
Research Assistant of Holland Computing Center
University of Nebraska-Lincoln
Lincoln NE 68588

Re: Best way to reduce a 8-node cluster in half and get hdfs to come out of safe mode

Posted by Steve Kuo <ku...@gmail.com>.

Thanks Allen for your advice.

Re: Best way to reduce a 8-node cluster in half and get hdfs to come out of safe mode

Posted by Allen Wittenauer <aw...@linkedin.com>.

On Aug 6, 2010, at 8:35 AM, He Chen wrote:

> Way#3
> 
> 1) bring up all 8 dn and the nn
> 2) retire one of your 4 nodes:
>           kill the datanode process
>           hadoop dfsadmin -refreshNodes  (this should be done on nn)

No need to refresh nodes.  -refreshNodes only makes the namenode re-read the dfs.hosts.* files.
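
For comparison, here is a minimal sketch of the case where -refreshNodes does matter: decommissioning through the exclude file, as in Way #1. It assumes dfs.hosts and dfs.hosts.exclude were already set in the namenode's configuration when it started. The file path, the function name, and the host names are hypothetical.

```shell
#!/bin/sh
# Sketch only: run on the namenode. Assumes dfs.hosts.exclude in the
# namenode config already points at EXCLUDE_FILE (hypothetical path).
EXCLUDE_FILE="${EXCLUDE_FILE:-/etc/hadoop/conf/dfs.exclude}"

# Append each given host to the exclude file (one per line, skipping
# hosts already listed), then ask the namenode to re-read its
# dfs.hosts.* files so decommissioning of the new entries can start.
decommission_hosts() {
    for host in "$@"; do
        grep -qxF "$host" "$EXCLUDE_FILE" 2>/dev/null || echo "$host" >> "$EXCLUDE_FILE"
    done
    hadoop dfsadmin -refreshNodes
}

# Usage (host names are placeholders):
#   decommission_hosts slave5 slave6 slave7 slave8
```

Nodes listed this way keep serving their blocks while the namenode copies them elsewhere, which is the difference from simply killing the datanode process.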

> 3) do 2) extra three times

Depending upon what the bandwidth param is set to, this should theoretically take significantly longer, since the grid needs to get back to healthy before each kill.
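
A rough sketch of that one-at-a-time approach, with the wait made explicit. It rests on several assumptions that should be checked against your own cluster: that the fsck summary contains a line of the form "Under-replicated blocks: N (...)", that hadoop-daemon.sh is on the PATH of each slave, and that the namenode declares a silent datanode dead after roughly 10.5 minutes (the stock heartbeat settings, as far as I know). The function names, POLL_SECONDS, and DEAD_NODE_WAIT are hypothetical.

```shell
#!/bin/sh
# Sketch only; function names and defaults are hypothetical.

# Print the under-replicated block count found in fsck output on stdin.
# Assumes a summary line like "Under-replicated blocks: N (...)".
under_replicated() {
    sed -n 's/^[[:space:]]*Under-replicated blocks:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Poll fsck until it reports zero under-replicated blocks.
wait_until_replicated() {
    while :; do
        count=$(hadoop fsck / 2>/dev/null | under_replicated)
        [ "$count" = "0" ] && return 0
        sleep "${POLL_SECONDS:-60}"
    done
}

# Stop the datanode on each given host, one host at a time.
retire_one_by_one() {
    for host in "$@"; do
        # Assumes hadoop-daemon.sh is on the remote PATH.
        ssh "$host" 'hadoop-daemon.sh stop datanode'
        # Give the namenode time to declare the node dead; until then
        # fsck will not count its blocks as under-replicated.
        sleep "${DEAD_NODE_WAIT:-660}"
        wait_until_replicated
    done
}

# Usage (host names are placeholders):
#   retire_one_by_one slave5 slave6 slave7 slave8
```

The total time is dominated by the re-replication after each stop, which is why this route is slower than decommissioning all four nodes at once.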