
Posted to common-user@hadoop.apache.org by Mayuran Yogarajah <ma...@casalemedia.com> on 2009/04/29 01:00:33 UTC

Master crashed

The master in my cluster crashed, but the dfs/mapred Java processes are
still running on the slaves.  What should I do next? I brought the master
back up and ran stop-mapred.sh and stop-dfs.sh, and they reported this:

slave1.test.com: no tasktracker to stop
slave1.test.com: no datanode to stop

Not sure what happened here, please advise.

thanks,
M

Re: Master crashed

Posted by Scott Carey <sc...@richrelevance.com>.

On 4/30/09 10:18 AM, "Mayuran Yogarajah" <ma...@casalemedia.com>
wrote:

> Alex Loddengaard wrote:
>> I'm confused.  Why are you trying to stop things when you're bringing the
>> name node back up?  Try running start-all.sh instead.
>> 
>> Alex
>> 
>>  
> Won't that try to start the daemons on the slave nodes again? They're
> already running.
> 

That doesn't matter; start-all.sh detects already-running processes and does
not bring up duplicates. You could run it 100 times in a row without a stop
if you wanted:

    namenode running as process 12621. Stop it first.
    datanode running as process 28540. Stop it first.
    jobtracker running as process 12814. Stop it first.
    tasktracker running as process 28763. Stop it first.
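
For readers wondering how a start script can be safe to re-run, here is a
minimal sketch of the pid-file guard pattern that would produce messages like
the ones above. It is not Hadoop's actual script: PID_DIR, start_once and
stop_once are hypothetical names chosen for illustration, and the sketch
assumes each daemon's pid is recorded in a file when it is started.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a pid-file guard; not taken from the Hadoop scripts.

# Directory holding one <name>.pid file per daemon (assumed location).
PID_DIR="${PID_DIR:-/tmp/daemon-pids}"

# start_once NAME COMMAND [ARGS...]
# Launches COMMAND in the background unless the pid recorded for NAME
# still belongs to a live process.
start_once() {
    local name="$1"; shift
    local pid_file="$PID_DIR/$name.pid"
    mkdir -p "$PID_DIR"
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "$name running as process $(cat "$pid_file"). Stop it first."
        return 1
    fi
    "$@" >/dev/null 2>&1 &      # detach output so callers are not blocked
    echo $! > "$pid_file"       # remember the pid for later start/stop calls
    echo "starting $name"
}

# stop_once NAME
# Stops the process recorded for NAME, or reports that none is recorded.
stop_once() {
    local name="$1"
    local pid_file="$PID_DIR/$name.pid"
    if [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "stopping $name"
        kill "$(cat "$pid_file")"
        rm -f "$pid_file"
    else
        echo "no $name to stop"
    fi
}
```

With a guard like this, calling start_once twice for the same name launches
the command only once; the second call just prints the "Stop it first."
notice. Under the same assumption, stop_once prints "no ... to stop" whenever
it cannot find a live pid on record, which is a check against the pid file
rather than against the machine's full process list.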




Re: Master crashed

Posted by Mayuran Yogarajah <ma...@casalemedia.com>.

Alex Loddengaard wrote:
> I'm confused.  Why are you trying to stop things when you're bringing the
> name node back up?  Try running start-all.sh instead.
>
> Alex
>
>   
Won't that try to start the daemons on the slave nodes again? They're 
already running.

M

Re: Master crashed

Posted by Alex Loddengaard <al...@cloudera.com>.

I'm confused.  Why are you trying to stop things when you're bringing the
name node back up?  Try running start-all.sh instead.

Alex
