Posted to user@hadoop.apache.org by Tadas Makčinskas <Ta...@bdc.lt> on 2012/12/20 15:42:23 UTC

steps to fix data block corruption after server failure

We have a situation here. Some of our servers went offline for a while.
When we reattached them to the cluster, we ended up with
multiple Missing/Corrupt blocks and some Mis-replicated blocks.

We can't figure out how to restore the system to a
normal working state: we have found neither a clean way to remove the
corrupted files nor a way to restore them. All of these files are in the following folders:
   /user/<user>/.Trash
   /user/<user>/.staging

What steps would you advise to solve our issue?

Thanks, Tadas


RE: steps to fix data block corruption after server failure

Posted by "Kartashov, Andy" <An...@mpac.ca>.
Tadas,

I remember once disconnecting a bunch of DataNodes from my dev cluster instead of using the required, more elegant "exclude" mechanism.
The next thing I learned was that my FS was corrupted. I did not care about my data (I could re-import it), but my NameNode (NN) metadata was messed up, so what worked for me was to delete the corrupted files using the "hadoop fsck" command with the -delete option.

Nonetheless, I would like to join you in this question. If neither the -move nor the -delete option works, is re-formatting the NN the only way to resolve corrupted metadata inside the Hadoop FS?
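
For reference, the fsck-based cleanup mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name fsck_step, the RUN switch and the HDFS_USER variable are made up for this example (HDFS_USER stands in for the <user> part of the paths in the original question), and only the long-standing fsck options -files, -blocks, -locations and -delete are used. By default the script just prints the commands, since -delete permanently removes the corrupted files.

```shell
#!/bin/sh
# Sketch only: inspect, then delete, corrupted files under two HDFS folders.
# fsck_step, RUN and HDFS_USER are hypothetical names used for illustration.

fsck_step() {
    # All arguments are passed to "hadoop fsck": first the HDFS path,
    # then the fsck options.
    if [ "${RUN:-0}" = "1" ]; then
        hadoop fsck "$@"          # really run the command
    else
        echo "hadoop fsck $*"     # dry run: only show what would be run
    fi
}

# Placeholder for the <user> part of the paths; set it before running.
HDFS_USER="${HDFS_USER:-CHANGE_ME}"

for dir in "/user/$HDFS_USER/.Trash" "/user/$HDFS_USER/.staging"; do
    # Step 1: report files, their blocks and the block locations.
    fsck_step "$dir" -files -blocks -locations
    # Step 2: delete the corrupted files (use -move instead to move
    # them to /lost+found rather than deleting them).
    fsck_step "$dir" -delete
done
```

Run it once as-is to review the printed commands, then with RUN=1 and HDFS_USER set once the list looks right. Restricting fsck to the two folders, rather than running it on /, keeps -delete away from files elsewhere that might still be recoverable.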

NOTICE: This e-mail message and any attachments are confidential, subject to copyright and may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient, please delete and contact the sender immediately. Please consider the environment before printing this e-mail. AVIS : le présent courriel et toute piece jointe qui l'accompagne sont confidentiels, protégés par le droit d'auteur et peuvent etre couverts par le secret professionnel. Toute utilisation, copie ou divulgation non autorisée est interdite. Si vous n'etes pas le destinataire prévu de ce courriel, supprimez-le et contactez immédiatement l'expéditeur. Veuillez penser a l'environnement avant d'imprimer le présent courriel
