Posted to common-user@hadoop.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2008/05/09 05:35:01 UTC
Corrupt HDFS and salvaging data
Hi,
I have a case of a corrupt HDFS (according to bin/hadoop fsck) and I'm trying not to lose the precious data in it. I accidentally ran bin/hadoop namenode -format on a *new DN* that I had just added to the cluster. Is it possible for that to corrupt HDFS? I also had to explicitly kill the DN daemons before that, because bin/stop-all.sh didn't stop them for some reason (it always had before).
Is there any way to salvage the data? I have a 4-node cluster with a replication factor of 3, though fsck reports lots of under-replicated blocks:
********************************
CORRUPT FILES: 3355
MISSING BLOCKS: 3462
MISSING SIZE: 17708821225 B
********************************
Minimally replicated blocks: 28802 (89.269775 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 17025 (52.76779 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.7750744
Missing replicas: 17025 (29.727087 %)
Number of data-nodes: 4
Number of racks: 1
The filesystem under path '/' is CORRUPT
What can one do at this point to save the data? If I run bin/hadoop fsck with -move or -delete, will I lose some of the data? Or will I simply end up with fewer block replicas and thus have to force re-replication to get back to a "safe" number of replicas?
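For reference, the invocations I'm considering (my understanding of what each does, from the fsck docs, so please correct me if I have it wrong):

```
# Report only -- makes no changes, just lists corrupt/under-replicated files
bin/hadoop fsck /

# Move files with missing blocks to /lost+found (the healthy blocks are kept)
bin/hadoop fsck / -move

# Permanently delete corrupted files -- destructive, data in them is gone
bin/hadoop fsck / -delete
```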
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch