Posted to common-user@hadoop.apache.org by Jason Venner <ja...@attributor.com> on 2009/01/07 19:44:20 UTC

Question about the Namenode edit log and syncing the edit log to disk. 0.19.0

I have always assumed (which is clearly my error) that edit log writes 
were flushed to storage to ensure that the edit log was consistent 
during machine crash recovery.

I have been working through FSEditLog.java and I don't see any calls of 
force(true) on the file channel or sync on the file descriptor, and the 
edit log is not opened with an 's' or 'd', i.e., the open flags are "rw"
and not "rws" or "rwd".
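
For reference, the three mechanisms mentioned above look roughly like this in plain Java. This is only a sketch of the standard JDK calls, not code taken from FSEditLog.java; the class and method names (DurableWriteOptions, writeThenForce, writeThenSync, writeSynchronous) are made up for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper class; none of these names come from Hadoop.
public class DurableWriteOptions {

    // Option 1: open with "rw" and force the channel explicitly after writing.
    // metaData=true asks for content and metadata, false for content only.
    public static void writeThenForce(File f, byte[] data, boolean metaData) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            FileChannel ch = raf.getChannel();
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(metaData);
        }
    }

    // Option 2: sync through the file descriptor of an ordinary stream.
    public static void writeThenSync(File f, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(data);
            out.getFD().sync();
        }
    }

    // Option 3: let the open mode do it. "rws" writes content and metadata
    // synchronously on every update, "rwd" writes content only.
    public static void writeSynchronous(File f, byte[] data, String mode) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, mode)) {
            raf.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("edits", ".log");
        f.deleteOnExit();
        byte[] record = {1, 2, 3};
        writeThenForce(f, record, true);
        writeThenSync(f, record);
        writeSynchronous(f, record, "rwd");
        System.out.println("bytes on file: " + f.length());
    }
}
```

With options 1 and 2 the caller decides when the sync happens; with option 3 every single write pays for it.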

The only thing I see in the code is that the space in the file where 
the updates will be written is preallocated.

Have I missed the mechanism that the edit log data is flushed to the disk?

Is the edit log data not forcibly flushed to the disk, instead relying on 
the host operating system to perform the physical writes at a later date?

Thanks -- Jason

Re: Question about the Namenode edit log and syncing the edit log to disk. 0.19.0

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
From the Java documentation
http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html#force(boolean)
"Passing false for this parameter indicates that only updates to the
file's content need be written to storage; passing true indicates that
updates to both the file's content and metadata must be written, which
generally requires at least one more I/O operation."
See also a comment here
https://issues.apache.org/jira/browse/HADOOP-2330?focusedCommentId=12614526#action_12614526

We are trying to avoid the extra (metadata) I/O during sync.
This is why "s" is not appropriate here.
We do not use the "d" option because HDFS controls the syncs internally:
the modifications are batched and then synced together with force(false).
More info here:
https://issues.apache.org/jira/browse/HADOOP-1003
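
To illustrate the batching idea, here is a minimal sketch, assuming a file opened "rw" and a single force(false) per batch. It is not the HDFS implementation; BatchedEditLog, logEdit, sync and getSyncCount are hypothetical names used only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration of batched syncs; not taken from Hadoop.
public class BatchedEditLog implements Closeable {
    private final RandomAccessFile file;
    private final FileChannel channel;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private int syncCount = 0;

    public BatchedEditLog(File f) throws IOException {
        file = new RandomAccessFile(f, "rw");   // no "s" or "d" flag
        channel = file.getChannel();
        channel.position(channel.size());       // append after existing data
    }

    // Buffer an edit in memory; nothing touches the disk yet.
    public synchronized void logEdit(byte[] record) {
        pending.write(record, 0, record.length);
    }

    // Write everything buffered so far and pay for one content-only sync.
    public synchronized void sync() throws IOException {
        if (pending.size() == 0) {
            return;
        }
        ByteBuffer buf = ByteBuffer.wrap(pending.toByteArray());
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        channel.force(false);                   // content only, skip metadata
        pending.reset();
        syncCount++;
    }

    public synchronized int getSyncCount() {
        return syncCount;
    }

    @Override
    public synchronized void close() throws IOException {
        sync();
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("edits", ".log");
        f.deleteOnExit();
        try (BatchedEditLog log = new BatchedEditLog(f)) {
            for (int i = 0; i < 100; i++) {
                log.logEdit(new byte[] {(byte) i});
            }
            log.sync();
            System.out.println("edits: 100, syncs: " + log.getSyncCount());
        }
    }
}
```

Opening the same file with "rwd" would instead sync on each of the 100 writes.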

Thanks for the question.
--Konstantin

Re: Question about the Namenode edit log and syncing the edit log to disk. 0.19.0

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Did you look at FSEditLog.EditLogFileOutputStream.flushAndSync()?

This code was reorganized some time back, but the guarantees it provides 
should be exactly the same as before. Please let us know otherwise.
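
For readers without the source at hand, a flush-and-sync step on a buffered edit stream can be sketched roughly as below. This is an assumption-laden illustration, not the actual EditLogFileOutputStream code: only the method name flushAndSync comes from this thread, and the class name, the two-buffer layout and the other method names are hypothetical. All methods are simply synchronized, so the finer-grained locking a real implementation would need to let writers continue during the force is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical sketch; only the name flushAndSync is taken from the thread.
public class DoubleBufferedLog implements Closeable {
    private ByteArrayOutputStream current = new ByteArrayOutputStream(); // receives new edits
    private ByteArrayOutputStream ready = new ByteArrayOutputStream();   // batch waiting to be flushed
    private final RandomAccessFile file;
    private final FileChannel channel;

    public DoubleBufferedLog(File f) throws IOException {
        file = new RandomAccessFile(f, "rw");
        channel = file.getChannel();
    }

    // Append an edit to the in-memory buffer.
    public synchronized void write(byte[] record) {
        current.write(record, 0, record.length);
    }

    // Swap buffers so the accumulated edits become the next batch to flush.
    public synchronized void setReadyToFlush() {
        if (ready.size() != 0) {
            throw new IllegalStateException("previous batch not flushed yet");
        }
        ByteArrayOutputStream tmp = ready;
        ready = current;
        current = tmp;
    }

    // Write the ready batch to the file and force its content to storage.
    public synchronized void flushAndSync() throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(ready.toByteArray());
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        channel.force(false);
        ready.reset();
    }

    @Override
    public synchronized void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("edits", ".log");
        f.deleteOnExit();
        try (DoubleBufferedLog log = new DoubleBufferedLog(f)) {
            log.write(new byte[] {1, 2});
            log.setReadyToFlush();
            log.write(new byte[] {3, 4});   // lands in the other buffer
            log.flushAndSync();             // only the first batch reaches the file
            System.out.println("bytes on file: " + f.length());
        }
    }
}
```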

Raghu.
