Posted to common-user@hadoop.apache.org by "Zheng, Wenxing (NSN - CN/Beijing)" <we...@nsn.com> on 2009/09/11 08:46:39 UTC
unsubscribe

 

-----Original Message-----
From: ext David B. Ritch [mailto:david.ritch@gmail.com] 
Sent: Friday, September 11, 2009 11:07
To: common-user@hadoop.apache.org
Subject: Re: Decommissioning Individual Disks

Thank you both.  That's what we did today.  It seems fairly reasonable
when a node has a few disks, say 3-5.  However, at some sites, with
larger nodes, it seems more awkward.  When a node has a dozen or more
disks (as used in the larger terasort benchmarks), migrating the data
off all the disks is likely to be more of an issue.  I hope that there
is a better solution to this before my client moves to much larger
nodes!  ;-)

dbr

On 9/10/2009 10:07 PM, Amandeep Khurana wrote:
> I think decommissioning the node and replacing the disk is a cleaner 
> approach. That's what I'd recommend doing as well..
>
> On 9/10/09, Alex Loddengaard <al...@cloudera.com> wrote:
>   
>> Hi David,
>> Unfortunately there's really no way to do what you're hoping to do in
>> an automatic way.  You can move the block files (including their
>> .meta files) from one disk to another.  Do this when the datanode
>> daemon is stopped.
>> Then, when you start the datanode daemon, it will scan dfs.data.dir
>> and be totally happy if blocks have moved hard drives.  I've never 
>> tried to do this myself, but others on the list have suggested this 
>> technique for "balancing disks."
>>
>> You could also change your process around a little.  It's not too 
>> crazy to decommission an entire node, replace one of its disks, then 
>> bring it back into the cluster.  Seems to me that this is a much 
>> saner approach: your ops team will tell you which disk needs 
>> replacing.  You decommission the node, they replace the disk, you add
>> the node back to the pool.  Your call I guess, though.
>>
>> Hope this was helpful.
>>
>> Alex
>>
>> On Thu, Sep 10, 2009 at 6:30 PM, David B. Ritch
>> <da...@gmail.com> wrote:
>>
>>     
>>> What do you do with the data on a failing disk when you replace it?
>>>
>>> Our support person comes in occasionally, and often replaces several
>>> disks when he does.  These are disks that have not yet failed, but 
>>> firmware indicates that failure is imminent.  We need to be able to 
>>> migrate our data off these disks before replacing them.  If we were 
>>> replacing entire servers, we would decommission them - but we have 3
>>> data disks per server.  If we were replacing one disk at a time, we 
>>> wouldn't worry about it (because of redundancy).  We can 
>>> decommission the servers, but moving all the data off of all their
>>> disks is a waste.
>>>
>>> What's the best way to handle this?
>>>
>>> Thanks!
>>>
>>> David
>>>
>>>       
>>     
>
>
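
The manual move Alex describes (stop the datanode, move the block files
together with their .meta files to another disk, start the datanode again)
could be sketched roughly as below.  This is a minimal Python sketch under
stated assumptions, not a tested procedure: the function name
move_block_files and the example paths are hypothetical, and it assumes
that block and .meta files are the files whose names start with "blk_"
somewhere under a dfs.data.dir directory.  It must only be run while the
datanode daemon is stopped, as the thread says.

```python
import os
import shutil


def move_block_files(src_root, dst_root):
    """Move every blk_* file (block data and .meta) from src_root to dst_root.

    The path relative to src_root is preserved, so a block and its .meta
    file end up in the same directory.  Existing files are never
    overwritten; a name clash raises FileExistsError instead.
    """
    moved = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        for name in sorted(filenames):
            if not name.startswith("blk_"):
                continue  # assumption: anything else is not block data
            dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
            dst = os.path.join(dst_dir, name)
            if os.path.exists(dst):
                raise FileExistsError(dst)
            os.makedirs(dst_dir, exist_ok=True)
            shutil.move(os.path.join(dirpath, name), dst)
            moved.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(moved)


if __name__ == "__main__":
    # Hypothetical paths: the failing disk and a healthy disk on the same node.
    for path in move_block_files("/data/1/dfs/data", "/data/2/dfs/data"):
        print("moved", path)
```

The sketch stops at the first name clash rather than overwriting, which can
leave a partial move; since every moved file is still intact on the target
disk, rerunning after resolving the clash should be safe.  Whether the
datanode accepts the result is as described in the thread and has not been
verified here.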