Posted to common-user@hadoop.apache.org by Yossi Ittach <yo...@gmail.com> on 2008/11/16 09:43:59 UTC

Recovering NN failure when the SNN data is on another server

Hi all

I apologize if the topic has already been answered - I couldn't find it.

I'm trying to restart a failed NN using "hadoop namenode -importCheckpoint",
and the SNN is configured on another server. However, the NN keeps
looking for the SNN data folder on the local server, not on the SNN
server.
Any ideas?

10X!

Vale et me ama
Yossi

Re: Recovering NN failure when the SNN data is on another server

Posted by Sagar Naik <sn...@attributor.com>.
Let me correct myself.
 - Back up dfs.data.dir and dfs.name.dir on the NN and the SNN.
 - If the secondary namenode is not running on the same machine as the
   namenode, copy the fs.checkpoint.dir over from the secondary onto the
   namenode.
 - If you want to override the NN image with the SNN's image, delete the
   dfs.name.dir (it has already been backed up in the first step).
 - Start only the namenode, with "-importCheckpoint".

For additional info:
https://issues.apache.org/jira/browse/HADOOP-2585?focusedCommentId=12558173#action_12558173


-Sagar
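
The steps above could be scripted roughly as follows. This is only a sketch
under stated assumptions: the directory paths and the SNN host name are
made-up placeholders (use the values of dfs.name.dir, dfs.data.dir and
fs.checkpoint.dir from your own configuration), and scp is just one way to
copy the checkpoint across. It defaults to a dry run that prints each command
instead of executing it.

```sh
#!/bin/sh
# Placeholder values, NOT taken from the thread; replace with your own config.
NAME_DIR="${NAME_DIR:-/data/dfs/name}"                         # dfs.name.dir on the NN
DATA_DIR="${DATA_DIR:-/data/dfs/data}"                         # dfs.data.dir
CHECKPOINT_DIR="${CHECKPOINT_DIR:-/data/dfs/namesecondary}"    # fs.checkpoint.dir
SNN_HOST="${SNN_HOST:-snn.example.com}"                        # hypothetical SNN host
DRY_RUN="${DRY_RUN:-1}"                                        # 1 = only print commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_namenode() {
    # 1. Back up the existing directories before touching anything.
    for dir in "$NAME_DIR" "$DATA_DIR"; do
        run cp -Rp "$dir" "$dir.bak"
    done
    # 2. The NN reads fs.checkpoint.dir from its local disk, so copy the
    #    SNN's checkpoint directory to the same path on the NN.
    run scp -r "$SNN_HOST:$CHECKPOINT_DIR" "$(dirname "$CHECKPOINT_DIR")/"
    # 3. Remove the old NN image so the import is not rejected.
    run rm -rf "$NAME_DIR"
    # 4. Start only the namenode and import the checkpoint.
    run hadoop namenode -importCheckpoint
}

recover_namenode
```

Run it as-is first and read the printed commands; set DRY_RUN=0 only once
the paths match your cluster and the backups have somewhere to go.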




Re: Recovering NN failure when the SNN data is on another server

Posted by Sagar Naik <sn...@attributor.com>.
Take a backup of your dfs.data.dir (on both the namenode and the secondary
namenode).
If the secondary namenode is not running on the same machine as the namenode,
copy the fs.checkpoint.dir over from the secondary onto the namenode.

Start only the namenode. The "importCheckpoint" fails if the NN already has
a valid image. If you want to override the NN image with the SNN's image,
delete the dfs.name.dir.

For additional info:
https://issues.apache.org/jira/browse/HADOOP-2585?focusedCommentId=12558173#action_12558173

Please note I am not an expert.
I just had a similar problem and this worked for me.

-Sagar
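
Since the import is rejected when the NN already holds a valid image, a small
pre-check before starting the namenode may save a failed attempt. The sketch
below assumes that an image directory keeps its image at
<dir>/current/fsimage; that layout and the function names are assumptions for
illustration, not something stated in this thread, so check it against your
own dfs.name.dir first.

```sh
#!/bin/sh
# Assumption: a directory "has an image" if <dir>/current/fsimage exists.
has_image() {
    [ -f "$1/current/fsimage" ]
}

# Usage: check_import_ready <dfs.name.dir> <fs.checkpoint.dir>
# Prints a verdict; returns 0 only when the import looks worth trying.
check_import_ready() {
    name_dir=$1
    checkpoint_dir=$2
    if ! has_image "$checkpoint_dir"; then
        echo "no checkpoint image under $checkpoint_dir; copy it from the SNN first"
        return 1
    fi
    if has_image "$name_dir"; then
        echo "$name_dir still holds an image; back it up and remove it first"
        return 1
    fi
    echo "ok to run: hadoop namenode -importCheckpoint"
}
```

For example, "check_import_ready /data/dfs/name /data/dfs/namesecondary"
(both paths are placeholders) would be run on the NN after copying the
checkpoint over from the SNN.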

