Posted to common-user@hadoop.apache.org by Bill Au <bi...@gmail.com> on 2009/02/02 23:00:14 UTC

Re: decommissioned node showing up as dead node in web based interface to namenode (dfshealth.jsp)

It looks like the behavior is the same with 0.18.2 and 0.19.0.  Even though
I removed the decommissioned node from the exclude file and ran the
refreshNodes command, the decommissioned node still shows up as a dead node.
What I did notice is that if I leave the decommissioned node in the exclude
file and restart HDFS, the node will show up as a dead node after restart.  But
then if I remove it from the exclude file and run the refreshNodes command,
it will disappear from the status page (dfshealth.jsp).

So it looks like I will have to stop and start the entire cluster in order
to get what I want.

Bill

On Thu, Jan 29, 2009 at 5:40 PM, Bill Au <bi...@gmail.com> wrote:

> Not sure why but this does not work for me.  I am running 0.18.2.  I ran
> hadoop dfsadmin -refreshNodes after removing the decommissioned node from
> the exclude file.  It still shows up as a dead node.  I also removed it from
> the slaves file and ran the refresh nodes command again.  It still shows up
> as a dead node after that.
>
> I am going to upgrade to 0.19.0 to see if it makes any difference.
>
> Bill
>
>
> On Tue, Jan 27, 2009 at 7:01 PM, paul <pa...@gmail.com> wrote:
>
>> Once the nodes are listed as dead, if you still have the host names in your
>> conf/exclude file, remove the entries and then run hadoop dfsadmin
>> -refreshNodes.
>>
>>
>> This works for us on our cluster.
>>
>>
>>
>> -paul
>>
>>
>> On Tue, Jan 27, 2009 at 5:08 PM, Bill Au <bi...@gmail.com> wrote:
>>
>> > I was able to decommission a datanode successfully without having to stop
>> > my cluster.  But I noticed that after a node has been decommissioned, it
>> > shows up as a dead node in the web-based interface to the namenode (i.e.
>> > dfshealth.jsp).  My cluster is relatively small and losing a datanode will
>> > have a performance impact.  So I need to monitor the health of my
>> > cluster and take steps to revive any dead datanode in a timely fashion.  So
>> > is there any way to altogether "get rid of" any decommissioned datanode from
>> > the web interface of the namenode?  Or is there a better way to monitor the
>> > health of the cluster?
>> >
>> > Bill
>> >
>>
>
>

Re: decommissioned node showing up as dead node in web based interface to namenode (dfshealth.jsp)

Posted by Bill Au <bi...@gmail.com>.
I have been looking into this some more by looking at the output of dfsadmin
-report during the decommissioning process.  After a node has been
decommissioned, dfsadmin -report shows that the node is in the
Decommissioned state.  The web interface dfshealth.jsp shows it as a dead
node.  After I removed the decommissioned node from the exclude file and ran
the refreshNodes command, the web interface continued to show it as a dead
node but dfsadmin -report showed the node to be in service.  After I restart
HDFS, dfsadmin -report shows the correct information again.
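
For anyone who wants to script this check rather than read the report by eye, here is a minimal sketch that pulls the per-node state out of saved dfsadmin -report output.  The layout it expects (a "Name:" line per datanode followed by a "State" or "Decommission Status" line) is an assumption on my part and may differ between Hadoop versions, so compare it against your own report output first.  SAMPLE_REPORT and both function names are hypothetical, made up for illustration.

```python
# Sketch only: the report layout below is an ASSUMPTION, not taken from
# any particular Hadoop release. Adjust the keys to match your output.

# Hypothetical sample of what a saved report might look like.
SAMPLE_REPORT = """\
Name: 10.0.0.1:50010
State          : In Service

Name: 10.0.0.2:50010
State          : Decommissioned
"""


def parse_datanode_states(report_text):
    """Map each datanode name to the state string reported for it."""
    states = {}
    current = None
    for line in report_text.splitlines():
        # Split on the first colon only, so "host:port" values survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "Name":
            current = value
        elif key in ("State", "Decommission Status") and current is not None:
            states[current] = value
    return states


def decommissioned_nodes(report_text):
    """Return the sorted names of nodes whose state is 'Decommissioned'."""
    states = parse_datanode_states(report_text)
    return sorted(
        name for name, state in states.items()
        if state.lower() == "decommissioned"
    )
```

The idea would be to save the output of hadoop dfsadmin -report to a file, pass its contents to decommissioned_nodes(), and treat anything the web interface lists as dead but this function lists as decommissioned as expected rather than as a failure.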

If I restart HDFS leaving the decommissioned node in the exclude file, the web
interface shows it as a dead node and dfsadmin -report shows it to be in
service.  But after I remove it from the exclude file and run the
refreshNodes command, both the web interface and dfsadmin -report show the
correct information.

It looks to me like I should only remove the decommissioned node from the
exclude file after restarting HDFS.
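
As a rough sketch of that last step, the helper below removes one host from an exclude file and then runs the refresh command.  It assumes the exclude file is a plain list with one host name per line; the function names and the path you pass in are hypothetical.  It does not restart HDFS for you, so the restart would still have to happen first.

```python
import subprocess

# The refresh command as used earlier in this thread.
REFRESH_COMMAND = ["hadoop", "dfsadmin", "-refreshNodes"]


def remove_from_exclude(exclude_path, hostname):
    """Rewrite the exclude file without `hostname`.

    Assumes one host name per line. Returns True if an entry was
    removed, False if the host was not listed (file left untouched).
    """
    with open(exclude_path) as f:
        lines = f.read().splitlines()
    kept = [line for line in lines if line.strip() != hostname]
    if len(kept) == len(lines):
        return False
    with open(exclude_path, "w") as f:
        f.write("".join(line + "\n" for line in kept))
    return True


def refresh_nodes():
    """Ask the namenode to re-read its host lists; raises on failure."""
    subprocess.run(REFRESH_COMMAND, check=True)
```

Calling remove_from_exclude("conf/exclude", "some-host") and then refresh_nodes() only when the first call returns True avoids refreshing when nothing changed.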

I would still like to see the web interface report any decommissioned node
as decommissioned rather than dead, as is the case with dfsadmin -report.
I am willing to work on a patch for this.  Before I start, does anyone know
if this is already in the works?

Bill
