Posted to common-user@hadoop.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2008/05/09 05:35:01 UTC

Corrupt HDFS and salvaging data

Hi,

I have a case of a corrupt HDFS (according to bin/hadoop fsck) and I'm trying not to lose the precious data in it.  I accidentally ran bin/hadoop namenode -format on a *new DN* that I had just added to the cluster.  Could that have corrupted HDFS?  I also had to kill the DN daemons explicitly before that, because bin/stop-all.sh didn't stop them for some reason (it always had before).
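For what it's worth, here is roughly how I've been comparing namespaceIDs on the NameNode and DataNodes to see whether the accidental format did any damage (the dfs.name.dir / dfs.data.dir paths below are just placeholders for my actual settings):

  # on the NameNode host: the namespaceID the cluster was formatted with
  cat /path/to/dfs.name.dir/current/VERSION

  # on each DataNode host: as I understand it, this should show the same
  # namespaceID; a mismatch would explain DataNodes refusing to start
  cat /path/to/dfs.data.dir/current/VERSION

  # list which Hadoop daemons are actually running on a given node
  jps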

Is there any way to salvage the data?  I have a 4-node cluster with a replication factor of 3, yet fsck reports lots of missing and under-replicated blocks:

  ********************************
  CORRUPT FILES:        3355
  MISSING BLOCKS:       3462
  MISSING SIZE:         17708821225 B
  ********************************
 Minimally replicated blocks:   28802 (89.269775 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       17025 (52.76779 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     1.7750744
 Missing replicas:              17025 (29.727087 %)
 Number of data-nodes:          4
 Number of racks:               1


The filesystem under path '/' is CORRUPT
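In case it helps, this is how I've been pulling the per-file detail out of fsck to see exactly which files are affected (output file path is just an example):

  # dump per-file block and replica-location detail for inspection
  bin/hadoop fsck / -files -blocks -locations > /tmp/fsck-detail.txt

  # pull out just the entries fsck flags as corrupt or missing
  grep -iE 'corrupt|missing' /tmp/fsck-detail.txt | less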


What can one do at this point to save the data?  If I run bin/hadoop fsck -move or -delete, will I lose some of the data?  Or will I simply end up with fewer block replicas and thus have to force re-balancing in order to get back to a "safe" number of replicas?
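Concretely, these are the commands I'm contemplating, in this order (my understanding of -move and -delete is in the comments; please correct me if any of this would be more destructive than I think):

  # first, check how many DataNodes the NameNode currently sees and their capacity
  bin/hadoop dfsadmin -report

  # option 1: move files with missing blocks to /lost+found,
  # keeping whatever blocks still exist
  bin/hadoop fsck / -move

  # option 2: delete the corrupted files outright
  # (clearly loses whatever data is in them)
  # bin/hadoop fsck / -delete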

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch