
Posted to user@hadoop.apache.org by Alexander Batyrshin <0x...@gmail.com> on 2019/08/28 00:00:34 UTC

NameNode recovering

Hello all,
We have an issue with free disk space that triggered https://issues.apache.org/jira/browse/HDFS-13269 (our version is 3.1.1).
Right now the Active NameNode is in SafeMode and can’t do a checkpoint.
After a restart, the second NameNode is trying to build a current fsimage from the JN edits, but this will take an extremely long time (about 2M tx/hour, and we need to replay about 173M).
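
For scale, the replay time implied by those two figures can be worked out as below. This is only a rough estimate: it assumes the replay rate stays constant at the observed value, which may not hold in practice. The function name is illustrative.

```python
def hours_to_replay(total_tx, tx_per_hour):
    """Estimate replay time in hours at a constant transaction rate."""
    return total_tx / tx_per_hour

# Figures quoted above: ~173M transactions at ~2M tx/hour.
hours = hours_to_replay(173_000_000, 2_000_000)
print(f"{hours:.1f} hours (~{hours / 24:.1f} days)")  # 86.5 hours (~3.6 days)
```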


Any ideas how to recover?

Re: NameNode recovering

Posted by HK <he...@gmail.com>.

If it is OK to put your active NameNode into safe mode, there is a way:

   - Put your active NameNode into safe mode.
   - Checkpoint the namespace.
   - Bootstrap the standby NameNode, then start it.
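
The steps above can be sketched as an ordered list of HDFS CLI commands. This is a minimal sketch, not a tested procedure: the mapping of each step to a command, the "active"/"standby" role labels, and the final daemon-start command (Hadoop 3 syntax) are assumptions, and the script only prints the commands rather than running them. Note that -saveNamespace needs the NameNode to be in safe mode and enough free disk space to write a new fsimage.

```python
# Commands for the three steps, tagged with the host they would run on.
# The role labels and the choice of commands are assumptions for illustration.
STEPS = [
    ("active",  "hdfs dfsadmin -safemode enter"),    # put the active NameNode into safe mode
    ("active",  "hdfs dfsadmin -saveNamespace"),     # checkpoint the namespace to a new fsimage
    ("standby", "hdfs namenode -bootstrapStandby"),  # copy the latest fsimage to the standby
    ("standby", "hdfs --daemon start namenode"),     # start the standby NameNode
]

def commands_for(role):
    """Return, in order, the commands to run on the given NameNode role."""
    return [cmd for host, cmd in STEPS if host == role]

# Dry run: print the plan without executing anything.
for host, cmd in STEPS:
    print(f"[{host}] {cmd}")
```

Run the commands as the HDFS superuser, each on the host given in brackets, and check the result of each one before moving on.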

I think Hadoop 3 allows you to use a checkpoint node; it would help you
avoid this kind of issue.
