Posted to common-user@hadoop.apache.org by jiang licht <li...@yahoo.com> on 2010/06/25 03:02:14 UTC

How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?

Is it configurable?

Thanks,

--Michael


      

Re: How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?

Posted by jiang licht <li...@yahoo.com>.
Thanks. I understand that the in-memory editlog is updated on every change, but I thought it was buffered before being serialized to disk. So when data is being written into hdfs heavily, the namenode is actually writing to disk frequently, but only a small chunk each time, and at the OS level this will be buffered ...
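
To make the buffering point concrete, here is a minimal Python sketch of an append-only log that is pushed to disk on every change. It is not Hadoop code: the names append_edit, read_edits and demo_edit_log, the record strings, and the one-sync-per-record policy are all assumptions made for illustration, and any batching the real namenode may do is not modeled.

```python
import os
import tempfile


def append_edit(log_path, record):
    """Append one edit record and force it to disk before returning."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(record + "\n")
        log.flush()             # push Python's user-space buffer to the OS
        os.fsync(log.fileno())  # ask the OS to push its cache to the device


def read_edits(log_path):
    """Return all records currently stored in the log."""
    with open(log_path, encoding="utf-8") as log:
        return [line.rstrip("\n") for line in log]


def demo_edit_log():
    # Hypothetical stand-in for a dfs.name.dir directory.
    with tempfile.TemporaryDirectory() as name_dir:
        log_path = os.path.join(name_dir, "edits")
        append_edit(log_path, "CREATE /user/michael/a.txt")
        append_edit(log_path, "SET_PERMISSION /user/michael/a.txt 644")
        return read_edits(log_path)
```

Without the fsync call, a record could sit in the OS page cache for a while, which is the OS-level buffering mentioned above.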

--Michael

--- On Fri, 6/25/10, Allen Wittenauer <aw...@linkedin.com> wrote:

From: Allen Wittenauer <aw...@linkedin.com>
Subject: Re: How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?
To: "<co...@hadoop.apache.org>" <co...@hadoop.apache.org>
Date: Friday, June 25, 2010, 12:17 PM


On Jun 25, 2010, at 12:24 AM, jiang licht wrote:
> Then, I think that namenode will need to write editlog to disk at a regular interval.

This is incorrect.

> Correct me if I misunderstand anything.

The edits log is updated on every change.  Create a file?  Updated.  Changed permissions?  Updated.  

> So, how often namenode write any meta data/change to dfs.name.dir?

Constantly.





      

Re: How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Jun 25, 2010, at 12:24 AM, jiang licht wrote:
> Then, I think that namenode will need to write editlog to disk at a regular interval.

This is incorrect.

> Correct me if I misunderstand anything.

The edits log is updated on every change.  Create a file?  Updated.  Changed permissions?  Updated.  

> So, how often namenode write any meta data/change to dfs.name.dir?

Constantly.



Re: How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?

Posted by jiang licht <li...@yahoo.com>.
I guess I need to rephrase my question. First, here's what I understand. Basically, there is an fsimage, which is stored as a file in the dfs.name.dir folder(s), and there is also an editlog, which tracks every update to hdfs and is also saved in the dfs.name.dir folder(s). According to the current hdfs documentation, when the namenode starts, it updates the in-memory fsimage with the editlog, truncates the old editlog, and saves the new fsimage to disk; from then on the editlog records only the changes made after this point (is there still just this one checkpoint implementation in the latest stable version?).

So, according to this process, the meta data (both editlog and fsimage) is read once when the namenode starts; the fsimage on disk (in dfs.name.dir) is written once, after the namenode starts and applies the editlog to the image it just read from disk; and the editlog on disk (in dfs.name.dir) is also overwritten (truncated) at that point.
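
The startup sequence described above can be sketched in a few lines of Python. This is only an illustration of the idea (load the image, replay the edits, save a merged image, truncate the edits). The JSON file formats, the operation names, and the functions apply_edit, startup_checkpoint and demo_startup are hypothetical and have nothing to do with the real on-disk formats.

```python
import json
import os
import tempfile


def apply_edit(namespace, edit):
    """Apply a single logged change to the in-memory namespace."""
    if edit["op"] == "create":
        namespace[edit["path"]] = {"perm": edit.get("perm", "644")}
    elif edit["op"] == "chmod":
        namespace[edit["path"]]["perm"] = edit["perm"]
    elif edit["op"] == "delete":
        namespace.pop(edit["path"], None)


def startup_checkpoint(name_dir):
    """Load fsimage, replay edits, save the merged image, truncate edits."""
    image_path = os.path.join(name_dir, "fsimage")
    edits_path = os.path.join(name_dir, "edits")

    namespace = {}
    if os.path.exists(image_path):
        with open(image_path, encoding="utf-8") as f:
            namespace = json.load(f)

    if os.path.exists(edits_path):
        with open(edits_path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    apply_edit(namespace, json.loads(line))

    # Write the merged image to a temporary file first, then rename it,
    # so a crash never leaves a half-written fsimage behind.
    tmp_path = image_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(namespace, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, image_path)

    # The edits are now part of the image, so the log can start empty.
    open(edits_path, "w", encoding="utf-8").close()
    return namespace


def demo_startup():
    with tempfile.TemporaryDirectory() as name_dir:
        with open(os.path.join(name_dir, "fsimage"), "w", encoding="utf-8") as f:
            json.dump({"/a": {"perm": "644"}}, f)
        with open(os.path.join(name_dir, "edits"), "w", encoding="utf-8") as f:
            f.write(json.dumps({"op": "create", "path": "/b"}) + "\n")
            f.write(json.dumps({"op": "chmod", "path": "/a", "perm": "600"}) + "\n")
        namespace = startup_checkpoint(name_dir)
        edits_size = os.path.getsize(os.path.join(name_dir, "edits"))
        return namespace, edits_size
```

Calling demo_startup() returns the merged namespace together with the size of the edits file after the checkpoint, which is zero because the log was truncated.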

Then, I think that namenode will need to write editlog to disk at a regular interval. Otherwise, if the namenode service is terminated unexpectedly, any change since its startup will be lost. By the way, for the purpose of recovering from a namenode failure, apparently the whole dfs.name.dir has to be backed up, not just the fsimage. However, any change made to hdfs after a checkpoint followed by a backup will still be lost if there is a failure. Considering this, using a list of folders for dfs.name.dir will be the safest way, compared to simply copying the folder over in some manner (e.g. rsync).
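
As a rough illustration of why a list of folders is safer than a periodic copy: dfs.name.dir accepts a comma-delimited list of directories, and the name table is replicated in all of them. The Python sketch below mimics that idea by writing every record to every directory. The directory names "local" and "nfs" and the functions append_to_all and demo_redundant_dirs are made up for the example.

```python
import os
import shutil
import tempfile


def append_to_all(name_dirs, record):
    """Write the same edit record to the log in every configured directory."""
    for name_dir in name_dirs:
        with open(os.path.join(name_dir, "edits"), "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())


def demo_redundant_dirs():
    root = tempfile.mkdtemp()
    try:
        # Hypothetical layout: one local disk and one remote mount.
        name_dirs = [os.path.join(root, "local"), os.path.join(root, "nfs")]
        for name_dir in name_dirs:
            os.makedirs(name_dir)

        append_to_all(name_dirs, "CREATE /a")
        append_to_all(name_dirs, "CREATE /b")

        # Simulate losing the first directory after the last write.
        shutil.rmtree(name_dirs[0])

        with open(os.path.join(name_dirs[1], "edits"), encoding="utf-8") as log:
            return log.read().splitlines()
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Because every record reaches both directories before the write returns, the surviving copy is complete up to the last change. An rsync-style copy would only be as fresh as its last run.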

Correct me if I misunderstand anything.

So, how often namenode write any meta data/change to dfs.name.dir?



Thanks,

-Michael

--- On Thu, 6/24/10, Allen Wittenauer <aw...@linkedin.com> wrote:

From: Allen Wittenauer <aw...@linkedin.com>
Subject: Re: How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?
To: "<co...@hadoop.apache.org>" <co...@hadoop.apache.org>
Date: Thursday, June 24, 2010, 8:04 PM


On Jun 24, 2010, at 6:02 PM, jiang licht wrote:

> Is it configurable?


It doesn't.

The only thing the namenode does is update the edits log, and that is continually rolling.

If you want it to dump a new fsimage, then you need to use the secondary namenode to perform the merge.


      

Re: How often namenode dumps in-memory hdfs meta data to disk (dfs.name.dir)?

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Jun 24, 2010, at 6:02 PM, jiang licht wrote:

> Is it configurable?


It doesn't.

The only thing the namenode does is update the edits log, and that is continually rolling.

If you want it to dump a new fsimage, then you need to use the secondary namenode to perform the merge.
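
On the "is it configurable" part: in Hadoop releases of this era, how often the secondary namenode performs the merge is governed by fs.checkpoint.period (1 hour by default) and fs.checkpoint.size (64 MB by default, the edits log size that forces a checkpoint early). The Python sketch below only illustrates that decision rule. The function should_checkpoint and its constants are illustrative and are not Hadoop code.

```python
# Defaults mirror the documented values of fs.checkpoint.period and
# fs.checkpoint.size; the names below are made up for this sketch.
CHECKPOINT_PERIOD_SECONDS = 3600
CHECKPOINT_SIZE_BYTES = 64 * 1024 * 1024


def should_checkpoint(seconds_since_last, edits_size_bytes,
                      period=CHECKPOINT_PERIOD_SECONDS,
                      size=CHECKPOINT_SIZE_BYTES):
    """Checkpoint when enough time has passed or the edits log is large."""
    return seconds_since_last >= period or edits_size_bytes >= size
```

For example, should_checkpoint(10, 1024) is False, while an edits log that has reached 64 MB triggers a checkpoint even if only ten seconds have passed.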