Posted to common-user@hadoop.apache.org by "Zheng, Wenxing (NSN - CN/Beijing)" <we...@nsn.com> on 2009/09/11 08:46:39 UTC
unsubscribe
-----Original Message-----
From: ext David B. Ritch [mailto:david.ritch@gmail.com]
Sent: Friday, September 11, 2009 11:07
To: common-user@hadoop.apache.org
Subject: Re: Decommissioning Individual Disks
Thank you both. That's what we did today. It seems fairly reasonable
when a node has a few disks, say 3-5. However, at some sites, with
larger nodes, it seems more awkward. When a node has a dozen or more
disks (as used in the larger terasort benchmarks), migrating the data
off all the disks is likely to be more of an issue. I hope that there
is a better solution to this before my client moves to much larger
nodes! ;-)
dbr
On 9/10/2009 10:07 PM, Amandeep Khurana wrote:
> I think decommissioning the node and replacing the disk is a cleaner
> approach. That's what I'd recommend doing as well.
>
> On 9/10/09, Alex Loddengaard <al...@cloudera.com> wrote:
>
>> Hi David,
>> Unfortunately there's really no way to do what you're hoping to do in
>> an automatic way. You can move the block files (including their
>> .meta files) from one disk to another. Do this when the datanode
>> daemon is stopped.
>> Then, when you start the datanode daemon, it will scan dfs.data.dir
>> and be totally happy if blocks have moved hard drives. I've never
>> tried to do this myself, but others on the list have suggested this
>> technique for "balancing disks."
>>
>> You could also change your process around a little. It's not too
>> crazy to decommission an entire node, replace one of its disks, then
>> bring it back into the cluster. Seems to me that this is a much
>> saner approach: your ops team will tell you which disk needs
>> replacing. You decommission the node, they replace the disk, you add
>> the node back to the pool. Your call I guess, though.
>>
>> Hope this was helpful.
>>
>> Alex
>>
>> On Thu, Sep 10, 2009 at 6:30 PM, David B. Ritch
>> <da...@gmail.com> wrote:
>>
>>
>>> What do you do with the data on a failing disk when you replace it?
>>>
>>> Our support person comes in occasionally, and often replaces several
>>> disks when he does. These are disks that have not yet failed, but
>>> firmware indicates that failure is imminent. We need to be able to
>>> migrate our data off these disks before replacing them. If we were
>>> replacing entire servers, we would decommission them - but we have 3
>>> data disks per server. If we were replacing one disk at a time, we
>>> wouldn't worry about it (because of redundancy). We can
>>> decommission the servers, but moving all the data off of all their
>>> disks is a waste.
>>>
>>> What's the best way to handle this?
>>>
>>> Thanks!
>>>
>>> David
>>>
>>>
>>
>
>
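
The manual approach Alex describes above (moving block files together with
their .meta files from one disk to another while the datanode daemon is
stopped) could be sketched roughly as follows. This is only an illustration,
not a tested procedure: the function name move_blocks is hypothetical, and the
naming assumed here (block files called blk_<id>, each with one companion file
called blk_<id>_<stamp>.meta in the same directory) is an assumption that
should be checked against the actual contents of dfs.data.dir. It handles a
single directory and does not descend into subdirectories.

```python
import shutil
from pathlib import Path


def move_blocks(src_dir, dst_dir):
    """Move each block file in src_dir, plus its .meta file, into dst_dir.

    Intended to be run only while the datanode daemon is stopped.
    Returns the names of the block files that were moved.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for block in sorted(src.glob("blk_*")):
        # Skip directories and the .meta files themselves.
        if not block.is_file() or block.suffix == ".meta":
            continue
        # Assumed naming: blk_<id> pairs with exactly one blk_<id>_<stamp>.meta.
        metas = list(src.glob(block.name + "_*.meta"))
        if len(metas) != 1:
            continue  # no unambiguous .meta file; leave for manual inspection
        block_target = dst / block.name
        meta_target = dst / metas[0].name
        if block_target.exists() or meta_target.exists():
            continue  # never overwrite anything already on the target disk
        # shutil.move also works across filesystems (it copies, then deletes).
        shutil.move(str(metas[0]), str(meta_target))
        shutil.move(str(block), str(block_target))
        moved.append(block.name)
    return moved
```

A call might look like move_blocks("/data1/dfs/data/current",
"/data2/dfs/data/current"), where both paths are placeholders for two
directories listed in dfs.data.dir. Blocks without a matching .meta file are
deliberately left in place rather than moved on their own.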