Posted to common-user@hadoop.apache.org by Sudhir Vallamkondu <Su...@icrossing.com> on 2010/12/08 16:41:09 UTC

Re: Help: 1) Hadoop processes are still running after we stopped Hadoop. 2) How to exclude a dead node?

Yes.

Reference: I couldn't find an Apache Hadoop page describing this, but see the
link below:
http://serverfault.com/questions/115148/hadoop-slaves-file-necessary


On 12/7/10 11:59 PM, "common-user-digest-help@hadoop.apache.org"
<co...@hadoop.apache.org> wrote:

> From: li ping <li...@gmail.com>
> Date: Wed, 8 Dec 2010 14:17:40 +0800
> To: <co...@hadoop.apache.org>
> Subject: Re: Help: 1) Hadoop processes are still running after we stopped
> Hadoop. 2) How to exclude a dead node?
> 
> I am not sure I have fully understood your post.
> You mean conf/slaves is only used by the stop/start scripts to start or stop
> the datanode/tasktracker?
> And conf/masters only contains the information about the secondary
> namenode?
> 
> Thanks
> 
> On Wed, Dec 8, 2010 at 1:44 PM, Sudhir Vallamkondu
> <Sudhir.Vallamkondu@icrossing.com> wrote:
> 
>> There is a proper decommissioning process to remove dead nodes. See the FAQ
>> link here:
>> 
>> http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F
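>> 
>> A rough sketch of that process (the file paths and hostnames below are
>> only examples, not taken from the FAQ):
>> 
>>   <!-- conf/hdfs-site.xml: point the namenode at an exclude file -->
>>   <property>
>>     <name>dfs.hosts.exclude</name>
>>     <value>/etc/hadoop/conf/excludes</value>
>>   </property>
>> 
>>   # list the hostname of each node to retire, one per line
>>   echo "deadnode.example.com" >> /etc/hadoop/conf/excludes
>> 
>>   # tell the namenode to re-read the exclude file
>>   bin/hadoop dfsadmin -refreshNodes
>> 
>> The nodes then show as "Decommission in progress" in "hadoop dfsadmin
>> -report" until their blocks are re-replicated, after which they can be
>> shut down safely. Note that dfs.hosts.exclude must already be set when
>> the namenode starts; -refreshNodes only re-reads the exclude file itself.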
>> 
>> To be clear, $HADOOP_HOME/conf/slaves is not used by the namenode to keep
>> track of datanodes/tasktrackers. It is only used by the Hadoop start/stop
>> scripts to know on which nodes to start the datanode / tasktracker
>> services. There is similar confusion about the $HADOOP_HOME/conf/masters
>> file: it lists the machine where the secondary namenode runs, not the
>> namenode or job tracker.
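>> 
>> For example, conf/slaves is just a plain list of hostnames, one per line
>> (made-up names here), that start-all.sh / stop-all.sh ssh into:
>> 
>>   node01.example.com
>>   node02.example.com
>>   node03.example.com
>> 
>> and conf/masters likewise only lists the host(s) where the secondary
>> namenode should run.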
>> 
>> As for not all java/hadoop processes getting killed: this usually happens
>> because Hadoop loses track of its pid files. By default the pid files are
>> created in the /tmp directory, and if they get deleted (for example by
>> tmp cleanup) the stop scripts can no longer detect the running Hadoop
>> processes. I suggest moving the pid files to a persistent location such
>> as /var/hadoop/. The $HADOOP_HOME/conf/hadoop-env.sh file has details on
>> configuring the pid location.
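>> 
>> For example (the directory here is just a suggestion), in
>> $HADOOP_HOME/conf/hadoop-env.sh:
>> 
>>   # keep pid files out of /tmp so cleanup jobs don't delete them
>>   export HADOOP_PID_DIR=/var/hadoop/pids
>> 
>> After changing this, clean up any stray daemons one last time by hand,
>> then stop/start the cluster so the new pid files are written there.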
>> 
>> - Sudhir
>> 
>> 
>> On 12/7/10 5:07 PM, "common-user-digest-help@hadoop.apache.org"
>> <co...@hadoop.apache.org> wrote:
>> 
>>> From: Tali K <nc...@hotmail.com>
>>> Date: Tue, 7 Dec 2010 10:40:16 -0800
>>> To: <co...@hadoop.apache.org>
>>> Subject: Help: 1) Hadoop processes are still running after we stopped
>>> Hadoop. 2) How to exclude a dead node?
>>> 
>>> 
>>> 1) When I stopped Hadoop, we checked all the nodes and found that 2 or 3
>>> java/hadoop processes were still running on each node. So we went to each
>>> node and did a 'killall java' - in some cases I had to do 'killall -9 java'.
>>> My question: why is this happening, and what are the recommendations for
>>> making sure that no Hadoop processes are left running after I stop Hadoop
>>> with stop-all.sh?
>>> 
>>> 2) Also, we have a dead node. We removed this node from
>>> $HADOOP_HOME/conf/slaves. This file is supposed to tell the namenode
>>> which machines are supposed to be datanodes/tasktrackers.
>>> We started Hadoop again and were surprised to still see the dead node in
>>> the report ("$HADOOP_HOME/bin/hadoop dfsadmin -report | less").
>>> Only after blocking the dead node and restarting Hadoop did it stop
>>> showing up in the report.
>>> Any recommendations on how to deal with dead nodes?
>> 

