Posted to common-user@hadoop.apache.org by shared mailinglists <sh...@gmail.com> on 2011/04/14 12:12:57 UTC

Checking integrity of secondary namenode previous.checkpoint fsimage

Page 312 of Tom White's "Hadoop: The Definitive Guide" mentions that the
Offline Image Viewer supplied with 0.21.0 can be used to test the integrity
of any backups taken from the Secondary Namenode (previous.checkpoint)
directory.

How does this work in practice?

I've tested the tool on a valid fsimage file and on a dummy file, and I find
that the only difference is that on success the output is a file of non-zero
size, while on failure the output file is zero-size (and the message "Input
file ended unexpectedly.  Exiting" is printed).

The documentation (
http://hadoop.apache.org/hdfs/docs/current/hdfs_imageviewer.html) doesn't
explicitly say how to verify fsimage integrity, but I assume that the tool
completing without encountering an error is enough to show that the image
is valid.  It does say "If the tool is not able to process an image file, it
will exit cleanly" - but this is not much use for automated backups.

Since this is going to be a common use case for the Offline Image Viewer, I
suggest that the documentation be updated to state specifically how to check
for a valid image (e.g. during an automated backup process).

So, can anyone confirm how the Offline Image Viewer should be used to verify
the integrity of an fsimage file?

thanks

Re: Checking integrity of secondary namenode previous.checkpoint fsimage

Posted by shared mailinglists <sh...@gmail.com>.
The script below is what I am now using for checking image integrity.

#!/bin/bash
# check_previous_checkpoint_integrity.sh
#
# This simple script tests the latest image in the secondary namenode's
# 'previous.checkpoint' directory.
# It provides an early warning if an HDFS image has been corrupted.

export HADOOP_HOME=/usr/local/hadoop-install/hadoop-0.21.0

INPUT_FILE=/home/hadoop/hdfs/namesecondary/previous.checkpoint/fsimage
OUTPUT_FILE=/tmp/fsimage.txt

# If the image is valid, oiv writes a non-empty text file to $OUTPUT_FILE.
# If the image is invalid, the output file is created but left empty.
"$HADOOP_HOME/bin/hdfs" oiv -i "$INPUT_FILE" -o "$OUTPUT_FILE"

if [ -s "$OUTPUT_FILE" ]
then
        echo "OK (image last modified at $(stat -c %y "$INPUT_FILE"))"
else
        echo "FAIL"
        exit 1   # non-zero exit status so automated jobs can detect the failure
fi
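
For automated checking you could run this from cron, e.g. (the paths here
are just examples):

# check the latest checkpoint image once an hour
0 * * * * /usr/local/bin/check_previous_checkpoint_integrity.sh >> /var/log/fsimage_check.log 2>&1

The script exits non-zero on failure, so it can also be hooked into any
monitoring that watches exit codes.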

