Posted to common-user@hadoop.apache.org by Bill Au <bi...@gmail.com> on 2009/02/02 23:00:14 UTC
Re: decommissioned node showing up as dead node in web based
interface to namenode (dfshealth.jsp)
It looks like the behavior is the same with 0.18.2 and 0.19.0. Even though
I removed the decommissioned node from the exclude file and ran the
refreshNodes command, the decommissioned node still shows up as a dead node.
What I did notice is that if I leave the decommissioned node in the exclude
file and restart HDFS, the node will show up as a dead node after restart. But
then if I remove it from the exclude file and run the refreshNodes command,
it will disappear from the status page (dfshealth.jsp).
So it looks like I will have to stop and start the entire cluster in order
to get what I want.
Bill
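For anyone wanting to script the sequence discussed in this thread (remove the host from the exclude file, then run hadoop dfsadmin -refreshNodes), here is a minimal Python sketch. The helper names, the one-host-per-line file layout, and the assumption that `hadoop` is on the PATH are illustrative and not taken from Hadoop itself. Note that, as reported above, on 0.18.2 and 0.19.0 this sequence cleared the dead-node entry only when HDFS had been restarted first.

```python
import subprocess
from pathlib import Path

# Command quoted in the thread; assumes `hadoop` is on PATH.
REFRESH_CMD = ["hadoop", "dfsadmin", "-refreshNodes"]


def remove_from_exclude(exclude_path, host):
    """Drop `host` from the exclude file; return True if an entry was removed.

    Assumes one host name per line (an assumption of this sketch).
    """
    path = Path(exclude_path)
    lines = path.read_text().splitlines()
    kept = [line for line in lines if line.strip() != host]
    if len(kept) == len(lines):
        return False  # host was not listed, leave the file untouched
    path.write_text("".join(line + "\n" for line in kept))
    return True


def recommission(exclude_path, host, run=False):
    """Remove the host, then optionally ask the namenode to re-read the file."""
    changed = remove_from_exclude(exclude_path, host)
    if changed and run:
        subprocess.run(REFRESH_CMD, check=True)
    return changed
```

Calling `recommission("conf/exclude", "node2.example.com", run=True)` would edit the file and then invoke the refresh; with `run=False` it only edits the file, which is convenient for a dry run.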
On Thu, Jan 29, 2009 at 5:40 PM, Bill Au <bi...@gmail.com> wrote:
> Not sure why but this does not work for me. I am running 0.18.2. I ran
> hadoop dfsadmin -refreshNodes after removing the decommissioned node from
> the exclude file. It still shows up as a dead node. I also removed it from
> the slaves file and ran the refresh nodes command again. It still shows up
> as a dead node after that.
>
> I am going to upgrade to 0.19.0 to see if it makes any difference.
>
> Bill
>
>
> On Tue, Jan 27, 2009 at 7:01 PM, paul <pa...@gmail.com> wrote:
>
>> Once the nodes are listed as dead, if you still have the host names in your
>> conf/exclude file, remove the entries and then run hadoop dfsadmin
>> -refreshNodes.
>>
>>
>> This works for us on our cluster.
>>
>>
>>
>> -paul
>>
>>
>> On Tue, Jan 27, 2009 at 5:08 PM, Bill Au <bi...@gmail.com> wrote:
>>
>> > I was able to decommission a datanode successfully without having to stop
>> > my cluster. But I noticed that after a node has been decommissioned, it
>> > shows up as a dead node in the web-based interface to the namenode (i.e.
>> > dfshealth.jsp). My cluster is relatively small and losing a datanode will
>> > have a performance impact. So I need to monitor the health of my
>> > cluster and take steps to revive any dead datanode in a timely fashion. So
>> > is there any way to altogether "get rid of" any decommissioned datanode from
>> > the web interface of the namenode? Or is there a better way to monitor the
>> > health of the cluster?
>> >
>> > Bill
>> >
>>
>
>
Re: decommissioned node showing up as dead node in web based
interface to namenode (dfshealth.jsp)
Posted by Bill Au <bi...@gmail.com>.
I have been looking into this some more by looking at the output of dfsadmin
-report during the decommissioning process. After a node has been
decommissioned, dfsadmin -report shows that the node is in the
Decommissioned state. The web interface dfshealth.jsp shows it as a dead
node. After I removed the decommissioned node from the exclude file and ran
the refreshNodes command, the web interface continued to show it as a dead
node but dfsadmin -report showed the node to be in service. After I restarted
HDFS, dfsadmin -report showed the correct information again.
If I restart HDFS leaving the decommissioned node in the exclude file, the web
interface shows it as a dead node and dfsadmin -report shows it to be in
service. But after I remove it from the exclude file and run the
refreshNodes command, both the web interface and dfsadmin -report show the
correct information.
It looks to me like I should only remove the decommissioned node from the
exclude file after restarting HDFS.
I would still like to see the web interface report any decommissioned node
as decommissioned rather than dead, as is the case with dfsadmin -report.
I am willing to work on a patch for this. Before I start, does anyone know
if this is already in the works?
Bill
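Until the web interface distinguishes decommissioned nodes from dead ones, a monitoring script could filter the dead-node list against the exclude file itself. The following is only a sketch under stated assumptions: the function names are hypothetical, the dead-node list is supplied by the caller (however it is collected), the host is still listed in the exclude file, and host names appear in the same form in both places.

```python
def load_hosts(text):
    """Parse the body of an exclude file into a set of host names.

    Assumes whitespace-separated host names (an assumption of this sketch).
    """
    return {token for line in text.splitlines() for token in line.split()}


def unexpected_dead(dead_nodes, excluded_hosts):
    """Return dead nodes that were NOT deliberately decommissioned, sorted."""
    return sorted(set(dead_nodes) - set(excluded_hosts))


if __name__ == "__main__":
    # Hypothetical data: node3 was decommissioned on purpose, node7 was not.
    excluded = load_hosts("node3.example.com\n")
    dead = ["node3.example.com", "node7.example.com"]
    for host in unexpected_dead(dead, excluded):
        print("needs attention:", host)
```

With this approach a decommissioned node only stops raising alerts while it remains in the exclude file, so it does not help once the entry has been removed.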