Posted to user@hadoop.apache.org by Vishnu Viswanath <vi...@gmail.com> on 2014/01/07 13:26:45 UTC

Content of FSImage

Hi All,

I read that block information is stored in memory by Hadoop once it
receives block reports from the datanodes.

The EditLog logs the changes.

What exactly is stored in the FSImage file?
Does it store information about the files in HDFS, how many blocks there
are, etc.?

Thanks

Re: Content of FSImage

Posted by Hardik Pandya <sm...@gmail.com>.
Yes - The entire file system namespace, including the mapping of blocks to
files and file system properties, is stored in a file called the FsImage.
The FsImage is stored as a file in the NameNode’s local file system too.

When the NameNode starts up, it reads the FsImage and EditLog from disk,
applies all the transactions from the EditLog to the in-memory
representation of the FsImage, and flushes out this new version into a new
FsImage on disk. It can then truncate the old EditLog because its
transactions have been applied to the persistent FsImage. This process is
called a checkpoint. In the current implementation, a checkpoint only
occurs when the NameNode starts up. Work is in progress to support periodic
checkpointing in the near future.
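
A minimal sketch of the checkpoint described above, in Python. This is a toy model only: the FsImage class, the tuple layout of the log entries, and the operation names ("create", "add_block", "delete") are assumptions made for illustration and do not reflect Hadoop's actual on-disk format or API. It only shows the idea of replaying logged transactions onto the last image and then starting a fresh log.

```python
from dataclasses import dataclass, field


@dataclass
class FsImage:
    """Toy namespace snapshot: file path -> list of block IDs."""
    files: dict = field(default_factory=dict)
    last_txid: int = 0  # highest transaction already folded into this image


def apply_edit(image, txid, op, path, blocks=None):
    """Apply one logged transaction to the in-memory image."""
    if op == "create":
        image.files[path] = []
    elif op == "add_block":
        image.files[path].extend(blocks)
    elif op == "delete":
        image.files.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")
    image.last_txid = txid


def checkpoint(image, edit_log):
    """Replay the edit log onto a copy of the image.

    Returns the merged image and an empty log, mirroring the
    "write new FsImage, truncate old EditLog" step.
    """
    merged = FsImage(
        files={path: list(blocks) for path, blocks in image.files.items()},
        last_txid=image.last_txid,
    )
    for txid, op, path, blocks in edit_log:
        if txid <= merged.last_txid:
            continue  # already reflected in the image
        apply_edit(merged, txid, op, path, blocks)
    return merged, []
```

For example, starting from an image that holds /a with one block and a log that creates /b, adds two blocks to it, and deletes /a, checkpoint() returns an image containing only /b with its two blocks, plus an empty log. Note that the sketch tracks which blocks belong to which file, not which datanodes hold them.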

During Metadata Disk Failure

The FsImage and the EditLog are central data structures of HDFS. A
corruption of these files can cause the HDFS instance to be non-functional.
For this reason, the NameNode can be configured to support maintaining
multiple copies of the FsImage and EditLog. Any update to either the
FsImage or EditLog causes each of the FsImages and EditLogs to get updated
synchronously. This synchronous updating of multiple copies of the
FsImage and EditLog may degrade the rate of namespace transactions per
second that a NameNode can support. However, this degradation is
acceptable because even though HDFS applications are very data intensive
in nature, they are not metadata intensive. When a NameNode restarts, it
selects the latest consistent FsImage and EditLog to use.
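
The redundancy idea above can be sketched in Python as follows. This is an illustration under stated assumptions, not how the NameNode is implemented: the file name fsimage.json, the JSON payload, and the SHA-256 check are all invented for the example. In a real cluster the copies are configured by listing several directories, comma-separated, in dfs.namenode.name.dir (dfs.name.dir in Hadoop 1.x).

```python
import hashlib
import json
import os
import tempfile

IMAGE_NAME = "fsimage.json"  # hypothetical file name for this sketch


def write_all(dirs, txid, namespace):
    """Write the same image to every directory before returning."""
    payload = json.dumps({"txid": txid, "namespace": namespace}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    for d in dirs:
        with open(os.path.join(d, IMAGE_NAME), "w") as f:
            f.write(digest + "\n" + payload)
            f.flush()
            os.fsync(f.fileno())  # synchronous: wait until it is on disk


def load_latest_consistent(dirs):
    """Return the intact copy with the highest txid, or None."""
    best = None
    for d in dirs:
        try:
            with open(os.path.join(d, IMAGE_NAME)) as f:
                digest, payload = f.read().split("\n", 1)
        except (OSError, ValueError):
            continue  # missing or malformed copy
        if hashlib.sha256(payload.encode()).hexdigest() != digest:
            continue  # corrupted copy
        record = json.loads(payload)
        if best is None or record["txid"] > best["txid"]:
            best = record
    return best
```

Writing to every directory on each update is what costs throughput, and it is also what lets a restart survive a damaged copy: load_latest_consistent() skips any copy that fails its checksum and uses the newest one that passes.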

Reference: HDFS Design Documentation
<http://hadoop.apache.org/docs/stable1/hdfs_design.html>


