Posted to user@nutch.apache.org by Shailendra Mudgal <mu...@gmail.com> on 2007/01/17 11:28:51 UTC

How to recover data from filesystem

Hi,
I have some data in a filesystem. Now, because of some problem, my Hadoop is
not starting. Is there a way to get the contents of that filesystem?

Thanks and regards,
Shailendra

Re: How to recover data from filesystem

Posted by Andrzej Bialecki <ab...@getopt.org>.
Shailendra Mudgal wrote:
> Hi,
> I have some data in a filesystem. Now, because of some problem, my
> Hadoop is
> not starting. Is there a way to get the contents of that filesystem?

Your previous post indicates that your FS got corrupted - specifically, 
the list of namespace edits is inconsistent: it records an edit op for a 
location whose parent dir was never recorded as created. These edit 
operations are stored in an "edits" file, which is a sort of diff against 
the last recorded snapshot of the filesystem (itself stored in the 
"fsimage" file). The edits are merged into fsimage only during the 
namenode startup sequence - apparently there is some corruption there, 
and the namenode cannot merge this edits file into fsimage.

There are ways to fix this "edits" file, but it's quite hard - it 
involves using a binary editor, and finding record boundaries by hand, 
and then removing offending edits, or adding an explicit edit to create 
the parent dir.
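
To make the failure and the two manual repairs above concrete, here is a 
minimal sketch in Python. It is a toy model only: the real "fsimage" and 
"edits" files are binary and their record format is not reproduced here. 
The names replay, parent_of and EditLogError, and the example paths, are 
made up for illustration.

```python
class EditLogError(Exception):
    """Raised when an edit refers to a location whose parent dir is unknown."""


def parent_of(path):
    # "/a/b" -> "/a", "/a" -> "/"
    head = path.rstrip("/").rsplit("/", 1)[0]
    return head or "/"


def replay(snapshot, edits):
    """Apply edits, in order, to a copy of the snapshot (a set of dir paths)."""
    dirs = set(snapshot)
    for index, (op, path) in enumerate(edits):
        if parent_of(path) not in dirs:
            raise EditLogError(
                f"edit #{index} ({op} {path}): parent dir was never created"
            )
        if op == "mkdir":
            dirs.add(path)
        else:
            raise ValueError(f"unknown op in toy model: {op}")
    return dirs


# Toy "fsimage": only the root exists in the last snapshot.
snapshot = {"/"}

# Toy "edits": the second op has no record of /crawl/segments being created.
edits = [("mkdir", "/crawl"), ("mkdir", "/crawl/segments/2007")]

try:
    replay(snapshot, edits)
except EditLogError as err:
    print("merge failed:", err)

# Repair 1: remove the offending edit (whatever it described is lost).
dropped = edits[:1]
print(sorted(replay(snapshot, dropped)))

# Repair 2: add an explicit edit that creates the missing parent dir.
patched = [edits[0], ("mkdir", "/crawl/segments"), edits[1]]
print(sorted(replay(snapshot, patched)))
```

In the toy model both repairs let the replay finish; the first loses the 
offending entry, the second keeps it. Doing the same on the real file 
still means finding the record boundaries by hand.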

There is another way too, if you are prepared to lose some data - that 
is, all data created or modified since the cluster was last started, 
because that startup is when the last checkpoint (merge) into fsimage 
occurred. The edit log is stored on the namenode in a file called "edits". 
If you shut down the cluster, remove this file, and restart the cluster, 
the FS should come back healthy - but all data created or modified since 
that last startup will be lost (datanodes will start to physically remove 
all blocks that are not accounted for in the current fsimage).
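
Because removing the file cannot be undone, a more cautious variant of the 
"remove this file" step is to move it aside while the cluster is shut 
down. The Python sketch below is only an illustration of that idea: the 
function name set_aside_edits and the directory names in the demo are 
hypothetical, and you have to supply the real location of your "edits" 
file yourself. Note that a saved copy does not protect the blocks that 
datanodes delete after the restart.

```python
import shutil
import tempfile
import time
from pathlib import Path


def set_aside_edits(edits_path, backup_dir):
    """Move the edits file into backup_dir instead of deleting it.

    Returns the path of the saved copy, or None if there was no edits file.
    Run this only while the cluster is shut down.
    """
    edits = Path(edits_path)
    if not edits.is_file():
        return None
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # A timestamp suffix keeps earlier saved copies distinguishable.
    target = backup_dir / f"edits.{time.strftime('%Y%m%d-%H%M%S')}"
    shutil.move(str(edits), str(target))
    return target


if __name__ == "__main__":
    # Demo on a throwaway directory; these paths are placeholders,
    # not a real namenode layout.
    with tempfile.TemporaryDirectory() as tmp:
        fake_edits = Path(tmp) / "name" / "edits"
        fake_edits.parent.mkdir()
        fake_edits.write_bytes(b"placeholder bytes, not a real edit log")
        saved = set_aside_edits(fake_edits, Path(tmp) / "edits-backup")
        print("saved copy:", saved)
        print("original still present:", fake_edits.exists())
```

Keeping the saved copy leaves the option of attempting the binary-editor 
repair on it later.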

If it's any consolation - this problem is recognized, and people are 
actively working on fixing it.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com