Posted to mapreduce-user@hadoop.apache.org by Brian Jeltema <br...@digitalenvoy.net> on 2015/03/25 13:10:43 UTC

namenode recovery

I have a question about a recovery scenario for Hadoop 2.4.

I have a small development cluster, no HA configured, that was taken down cleanly, 
that is, all services were stopped (via Ambari) and all the nodes were then rebooted.
However, the reboot of the namenode system failed; that system is completely dead.
The only HDFS service running on the system was the namenode; the secondary namenode
was running elsewhere and came back, as well as all of the datanodes. 

In this scenario, can I just start a namenode on one of the other nodes? Will it recover
the fsimage that was checkpointed by the secondary namenode? 

Thanks
Brian

Re: namenode recovery

Posted by Harsh J <ha...@cloudera.com>.
Not automatically. You'd need to copy the VERSION and fsimage* files
from your SecondaryNN's checkpoint directory to the new NameNode's
configured name directory to start it back up with the checkpointed data.
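A minimal sketch of that copy, assuming default-style paths (the directory
values below are placeholders; check dfs.namenode.checkpoint.dir and
dfs.namenode.name.dir in your hdfs-site.xml, and "newnn" is a hypothetical
hostname for the replacement NameNode):

```shell
# On the SecondaryNN host: the checkpoint directory
# (value of dfs.namenode.checkpoint.dir) -- placeholder path.
CHECKPOINT_DIR=/hadoop/hdfs/namesecondary

# On the new NameNode host: the configured name directory
# (value of dfs.namenode.name.dir) -- placeholder path.
NAME_DIR=/hadoop/hdfs/namenode

# Copy VERSION and the fsimage files (plus their .md5 checksums)
# from the checkpoint's current/ into the new name directory.
ssh newnn "mkdir -p $NAME_DIR/current"
scp "$CHECKPOINT_DIR"/current/VERSION \
    "$CHECKPOINT_DIR"/current/fsimage_* \
    newnn:"$NAME_DIR"/current/

# Then start the NameNode on the new host (Hadoop 2.x daemon script);
# it will load the checkpointed fsimage on startup.
ssh newnn "hadoop-daemon.sh start namenode"
```

Hadoop also ships an alternative for this situation: starting the new
NameNode with `hdfs namenode -importCheckpoint`, which loads the image
from the configured checkpoint directory directly instead of a manual copy.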

On Wed, Mar 25, 2015 at 5:40 PM, Brian Jeltema <
brian.jeltema@digitalenvoy.net> wrote:

> I have a question about a recovery scenario for Hadoop 2.4.
>
> I have a small development cluster, no HA configured, that was taken down
> cleanly,
> that is, all services were stopped (via Ambari) and all the nodes were
> then rebooted.
> However, the reboot of the namenode system failed; that system is
> completely dead.
> The only HDFS service running on the system was the namenode; the
> secondary namenode
> was running elsewhere and came back, as well as all of the datanodes.
>
> In this scenario, can I just start a namenode on one of the other nodes?
> Will it recover
> the fsimage that was checkpointed by the secondary namenode?
>
> Thanks
> Brian




-- 
Harsh J
