Posted to mapreduce-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2014/03/04 14:08:07 UTC

Need help: fsck FAILs, refuses to clean up corrupt fs

I have a file system with some missing/corrupt blocks.  However, running hdfs fsck -delete also fails with errors.  How do I get around this?
Thanks
John

[hdfs@metallica yarn]$ hdfs fsck -delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
Connecting to namenode via http://anthrax.office.datalever.com:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.57.110 for path /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld at Tue Mar 04 06:05:40 MST 2014
.
/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200714

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200741

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200778

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: MISSING 3 blocks of total size 299116266 B.Status: CORRUPT
Total size:    299116266 B
Total dirs:    0
Total files:   1
Total symlinks:                0
Total blocks (validated):      3 (avg. block size 99705422 B)
  ********************************
  CORRUPT FILES:        1
  MISSING BLOCKS:       3
  MISSING SIZE:         299116266 B
  CORRUPT BLOCKS:       3
  ********************************
Minimally replicated blocks:   0 (0.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       0 (0.0 %)
Mis-replicated blocks:         0 (0.0 %)
Default replication factor:    3
Average block replication:     0.0
Corrupt blocks:                3
Missing replicas:              0
Number of data-nodes:          8
Number of racks:               1
FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
fsck encountered internal errors!


Fsck on path '/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld' FAILED
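[Editor's note] For readers triaging the same kind of report: the paths of corrupt files can be pulled out of a saved fsck report with standard text tools. A minimal sketch, where $report stands in for a real capture (e.g. "hdfs fsck / > fsck.out") and the two sample lines reuse the format shown in the output above:

```shell
# Pull the unique paths of corrupt files out of a saved fsck report.
# The sample lines below reuse the report format shown above; on a real
# cluster, feed in the file captured with: hdfs fsck / > fsck.out
report='/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200714
/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200778'

# Keep only CORRUPT lines, take the path before the first colon, dedupe.
printf '%s\n' "$report" | grep ': CORRUPT' | cut -d: -f1 | sort -u
```

This works because the path itself contains no colon, so `cut -d: -f1` yields the file name; `sort -u` collapses the one-line-per-block repetition into one line per file.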

RE: Need help: fsck FAILs, refuses to clean up corrupt fs

Posted by John Lilley <jo...@redpoint.net>.
Ah... found the answer.  I had to manually leave safe mode to delete the corrupt files.
john
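[Editor's note] The fix described above, as a dry-run sketch: `hdfs` is stubbed with a shell function so the sequence can be read and stepped through anywhere; on a real cluster, delete the stub and run the same three commands as the HDFS superuser. The path is the one from this thread.

```shell
# Stub so the sequence below prints instead of hitting a cluster;
# remove this function to run it for real (as the HDFS superuser).
hdfs() { echo "WOULD RUN: hdfs $*"; }

hdfs dfsadmin -safemode get    # confirm the NameNode reports "Safe mode is ON"
hdfs dfsadmin -safemode leave  # force it out of safe mode despite missing blocks
hdfs fsck /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld -delete
```

Once the NameNode is out of safe mode, the delete RPC that fsck -delete issues is no longer rejected, which breaks the deadlock described later in this thread.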

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Tuesday, March 04, 2014 9:33 AM
To: user@hadoop.apache.org
Subject: RE: Need help: fsck FAILs, refuses to clean up corrupt fs

More information from the NameNode log.  I don't understand... it is saying that I cannot delete the corrupted file until the NameNode leaves safe mode, but it won't leave safe mode until the file system is no longer corrupt.  How do I get there from here?
Thanks
john

2014-03-04 06:02:51,584 ERROR namenode.NameNode (NamenodeFsck.java:deleteCorruptedFile(446)) - Fsck: error deleting corrupted file /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld. Name node is in safe mode.
The reported blocks 169302 needs additional 36 blocks to reach the threshold 1.0000 of total blocks 169337.
Safe mode will be turned off automatically
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1063)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:3141)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:3101)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3085)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:697)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.deleteCorruptedFile(NamenodeFsck.java:443)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:426)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.fsck(NamenodeFsck.java:206)
        at org.apache.hadoop.hdfs.server.namenode.FsckServlet$1.run(FsckServlet.java:67)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.hdfs.server.namenode.FsckServlet.doGet(FsckServlet.java:58)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)



RE: Need help: fsck FAILs, refuses to clean up corrupt fs

Posted by divye sheth <di...@gmail.com>.
You can force the NameNode to leave safe mode.

hadoop dfsadmin -safemode leave

Then re-run the hadoop fsck.

Thanks
Divye Sheth
On Mar 4, 2014 10:03 PM, "John Lilley" <jo...@redpoint.net> wrote:

>  More information from the NameNode log.  I don't understand... it is
> saying that I cannot delete the corrupted file until the NameNode leaves
> safe mode, but it won't leave safe mode until the file system is no longer
> corrupt.  How do I get there from here?
>
> Thanks
>
> john
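[Editor's note] Background for readers stuck in the same loop: whether the NameNode ever leaves safe mode on its own is governed by the fraction of blocks that must be reported, set in hdfs-site.xml. The key below is the Hadoop 2.x name (an assumption about the reader's version; Hadoop 1.x called it dfs.safemode.threshold.pct). This thread's cluster reports a threshold of 1.0000, so even a single missing block keeps it in safe mode indefinitely.

```xml
<!-- hdfs-site.xml: fraction of total blocks that DataNodes must report
     before the NameNode exits safe mode automatically.  A value at or
     below the surviving fraction lets startup complete despite lost
     blocks; forcing safe mode off (as above) is the one-off equivalent. -->
<property>
  <name>dfs.namenode.safemode.threshold-pct</name>
  <value>0.999</value>
</property>
```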

>
>         at
> org.apache.hadoop.hdfs.server.namenode.FsckServlet.doGet(FsckServlet.java:58)
>
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>
>         at
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>
>         at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>
>         at
> org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
>
>         at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>
>         at
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>
>         at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>
>         at
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>
>         at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>
>         at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>
>         at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>
>         at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>
>         at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>
>         at
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>
>         at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>
>         at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>
>         at org.mortbay.jetty.Server.handle(Server.java:326)
>
>         at
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>
>         at
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>
>         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>
>         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>
>         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>
>         at
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>
>         at
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
>
>
>
> *From:* John Lilley [mailto:john.lilley@redpoint.net]
> *Sent:* Tuesday, March 04, 2014 6:08 AM
> *To:* user@hadoop.apache.org
> *Subject:* Need help: fsck FAILs, refuses to clean up corrupt fs
>
>
>
> I have a file system with some missing/corrupt blocks.  However, running
> hdfs fsck -delete also fails with errors.  How do I get around this?
>
> Thanks
>
> John
>
>
>
> [hdfs@metallica yarn]$ hdfs fsck -delete
> /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
>
> Connecting to namenode via http://anthrax.office.datalever.com:50070
>
> FSCK started by hdfs (auth:SIMPLE) from /192.168.57.110 for path
> /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld at Tue Mar
> 04 06:05:40 MST 2014
>
> .
>
> /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT
> blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200714
>
>
>
> /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT
> blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200741
>
>
>
> /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT
> blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200778
>
>
>
> /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: MISSING 3
> blocks of total size 299116266 B.Status: CORRUPT
>
> Total size:    299116266 B
>
> Total dirs:    0
>
> Total files:   1
>
> Total symlinks:                0
>
> Total blocks (validated):      3 (avg. block size 99705422 B)
>
>   ********************************
>
>   CORRUPT FILES:        1
>
>   MISSING BLOCKS:       3
>
>   MISSING SIZE:         299116266 B
>
>   CORRUPT BLOCKS:       3
>
>   ********************************
>
> Minimally replicated blocks:   0 (0.0 %)
>
> Over-replicated blocks:        0 (0.0 %)
>
> Under-replicated blocks:       0 (0.0 %)
>
> Mis-replicated blocks:         0 (0.0 %)
>
> Default replication factor:    3
>
> Average block replication:     0.0
>
> Corrupt blocks:                3
>
> Missing replicas:              0
>
> Number of data-nodes:          8
>
> Number of racks:               1
>
> FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
>
> FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
>
> fsck encountered internal errors!
>
>
>
>
>
> Fsck on path
> '/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld' FAILED
>

RE: Need help: fsck FAILs, refuses to clean up corrupt fs

Posted by John Lilley <jo...@redpoint.net>.
Ah... found the answer.  I had to manually take the NameNode out of safe mode before fsck -delete would remove the corrupt files.
john
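
For anyone who lands on this thread later, the sequence that resolved it can be sketched as below. This is pieced together from the thread, not an official procedure: it assumes the hdfs CLI is run as the HDFS superuser, and the path is the one from this thread.

```shell
# Sketch of the recovery sequence from this thread (run as the HDFS superuser).
# WARNING: -safemode leave overrides the block-report safety check, and
# fsck -delete permanently removes the affected files.
hdfs dfsadmin -safemode get        # confirm the NameNode is stuck in safe mode
hdfs dfsadmin -safemode leave      # force it out so deletes are allowed
hdfs fsck /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld -delete
hdfs fsck / -list-corruptfileblocks   # verify nothing else is still corrupt
```

On older releases the same commands are spelled hadoop dfsadmin / hadoop fsck, as in Divye's reply.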

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Tuesday, March 04, 2014 9:33 AM
To: user@hadoop.apache.org
Subject: RE: Need help: fsck FAILs, refuses to clean up corrupt fs

More information from the NameNode log.  I don't understand... it is saying that I cannot delete the corrupted file until the NameNode leaves safe mode, but it won't leave safe mode until the file system is no longer corrupt.  How do I get there from here?
Thanks
john

2014-03-04 06:02:51,584 ERROR namenode.NameNode (NamenodeFsck.java:deleteCorruptedFile(446)) - Fsck: error deleting corrupted file /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld. Name node is in safe mode.
The reported blocks 169302 needs additional 36 blocks to reach the threshold 1.0000 of total blocks 169337.
Safe mode will be turned off automatically
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1063)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:3141)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:3101)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3085)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:697)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.deleteCorruptedFile(NamenodeFsck.java:443)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:426)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.fsck(NamenodeFsck.java:206)
        at org.apache.hadoop.hdfs.server.namenode.FsckServlet$1.run(FsckServlet.java:67)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.hdfs.server.namenode.FsckServlet.doGet(FsckServlet.java:58)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Tuesday, March 04, 2014 6:08 AM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Need help: fsck FAILs, refuses to clean up corrupt fs

I have a file system with some missing/corrupt blocks.  However, running hdfs fsck -delete also fails with errors.  How do I get around this?
Thanks
John

[hdfs@metallica yarn]$ hdfs fsck -delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
Connecting to namenode via http://anthrax.office.datalever.com:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.57.110 for path /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld at Tue Mar 04 06:05:40 MST 2014
.
/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200714

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200741

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200778

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: MISSING 3 blocks of total size 299116266 B.Status: CORRUPT
Total size:    299116266 B
Total dirs:    0
Total files:   1
Total symlinks:                0
Total blocks (validated):      3 (avg. block size 99705422 B)
  ********************************
  CORRUPT FILES:        1
  MISSING BLOCKS:       3
  MISSING SIZE:         299116266 B
  CORRUPT BLOCKS:       3
  ********************************
Minimally replicated blocks:   0 (0.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       0 (0.0 %)
Mis-replicated blocks:         0 (0.0 %)
Default replication factor:    3
Average block replication:     0.0
Corrupt blocks:                3
Missing replicas:              0
Number of data-nodes:          8
Number of racks:               1
FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
fsck encountered internal errors!


Fsck on path '/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld' FAILED

RE: Need help: fsck FAILs, refuses to clean up corrupt fs

Posted by John Lilley <jo...@redpoint.net>.
More information from the NameNode log.  I don't understand... it is saying that I cannot delete the corrupted file until the NameNode leaves safe mode, but it won't leave safe mode until the file system is no longer corrupt.  How do I get there from here?
Thanks
john

2014-03-04 06:02:51,584 ERROR namenode.NameNode (NamenodeFsck.java:deleteCorruptedFile(446)) - Fsck: error deleting corrupted file /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld. Name node is in safe mode.
The reported blocks 169302 needs additional 36 blocks to reach the threshold 1.0000 of total blocks 169337.
Safe mode will be turned off automatically
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1063)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:3141)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:3101)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3085)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:697)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.deleteCorruptedFile(NamenodeFsck.java:443)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:426)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:289)
        at org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.fsck(NamenodeFsck.java:206)
        at org.apache.hadoop.hdfs.server.namenode.FsckServlet$1.run(FsckServlet.java:67)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.hdfs.server.namenode.FsckServlet.doGet(FsckServlet.java:58)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Tuesday, March 04, 2014 6:08 AM
To: user@hadoop.apache.org
Subject: Need help: fsck FAILs, refuses to clean up corrupt fs

I have a file system with some missing/corrupt blocks.  However, running hdfs fsck -delete also fails with errors.  How do I get around this?
Thanks
John

[hdfs@metallica yarn]$ hdfs fsck -delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld
Connecting to namenode via http://anthrax.office.datalever.com:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.57.110 for path /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld at Tue Mar 04 06:05:40 MST 2014
.
/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200714

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200741

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: CORRUPT blockpool BP-1827033441-192.168.57.112-1384284857542 block blk_1074200778

/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld: MISSING 3 blocks of total size 299116266 B.
Status: CORRUPT
Total size:    299116266 B
Total dirs:    0
Total files:   1
Total symlinks:                0
Total blocks (validated):      3 (avg. block size 99705422 B)
  ********************************
  CORRUPT FILES:        1
  MISSING BLOCKS:       3
  MISSING SIZE:         299116266 B
  CORRUPT BLOCKS:       3
  ********************************
Minimally replicated blocks:   0 (0.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       0 (0.0 %)
Mis-replicated blocks:         0 (0.0 %)
Default replication factor:    3
Average block replication:     0.0
Corrupt blocks:                3
Missing replicas:              0
Number of data-nodes:          8
Number of racks:               1
FSCK ended at Tue Mar 04 06:05:40 MST 2014 in 1 milliseconds
fsck encountered internal errors!


Fsck on path '/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld' FAILED
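[Editor's note: the deadlock here is that fsck -delete cannot remove the corrupt file while the NameNode is in safe mode, and safe mode will not end on its own while blocks are missing. The usual way out is to force the NameNode out of safe mode manually and then re-run fsck. A hedged sketch, assuming the commands are run as the HDFS superuser (hdfs) and using the path from the log above:]

```shell
# Force the NameNode out of safe mode (requires HDFS superuser privileges).
hdfs dfsadmin -safemode leave

# With safe mode off, -delete can now remove the corrupt file.
hdfs fsck -delete /rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_000000.dld

# Verify the filesystem reports HEALTHY again.
hdfs fsck /
```

[If the missing blocks might still be recoverable (e.g. DataNodes that have not yet reported in), waiting for all DataNodes to register before leaving safe mode is safer than deleting; `hdfs fsck -move` is an alternative to `-delete` that salvages readable blocks into /lost+found.]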
