Posted to common-dev@hadoop.apache.org by "Brian Bockelman (JIRA)" <ji...@apache.org> on 2008/10/06 18:47:44 UTC

[jira] Commented: (HADOOP-4351) ArrayIndexOutOfBoundsException during fsck

    [ https://issues.apache.org/jira/browse/HADOOP-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637136#action_12637136 ] 

Brian Bockelman commented on HADOOP-4351:
-----------------------------------------

I would note that this blocks the "hadoop fsck /" operation from completing, which seriously freaked out our admins.

After I added some debugging statements, it appears that the error happens when corrupt entries are present in the corruptReplicas object but missing from the blocksMap.

Here's my debug statement output:

2008-10-06 10:57:17,987 ERROR org.apache.hadoop.fs.FSNamesystem: Internal error in calculating corrupt block locations for blk_-8711032927274230751_1834
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem: There should be 2 corrupt replicas; there are 3 known replicas.  There are 1 known corrupt replicas.
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem: We believe the value of blockCorrupt is false.  Current counter value: 2
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem:  * Corrupt entry: 172.16.1.16:50010
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem:  * machineSet entry: 172.16.1.10:50010
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem:  * machineSet entry: 172.16.1.145:50010
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem:  * block replica entry: 172.16.1.10:50010
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem:  * block replica entry: 172.16.1.145:50010
2008-10-06 10:57:17,988 ERROR org.apache.hadoop.fs.FSNamesystem:  * block replica entry: 172.16.1.117:50010

So, nodes 10, 145, and 117 hold good replicas; 16 held a bad one.  Node 16 is not among the 3 known replicas, so none of them gets skipped as corrupt.  However, because the allocated array size is (# of replicas) - (# of bad replicas) = 3 - 1 = 2, machineSet is allocated a size of 2, but we try to write 3 elements into it.
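
To make the sizing problem concrete, here is a minimal standalone sketch.  It is not the actual FSNamesystem code: the class and method names (CorruptReplicaSketch, buggyLocations, safeLocations) are hypothetical, and nodes are modeled as plain strings.  It only assumes the sizing rule described above, plus one possible defensive alternative that counts good replicas while iterating instead of trusting the subtraction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Hadoop.
class CorruptReplicaSketch {

    // Sizes the result as (# of replicas) - (# of corrupt replicas), which is
    // only valid if every corrupt entry is also one of the known replicas.
    static String[] buggyLocations(List<String> replicas, Set<String> corrupt) {
        String[] machineSet = new String[replicas.size() - corrupt.size()];
        int i = 0;
        for (String node : replicas) {
            if (!corrupt.contains(node)) {
                // Throws ArrayIndexOutOfBoundsException once i == machineSet.length.
                machineSet[i++] = node;
            }
        }
        return machineSet;
    }

    // Collects the non-corrupt replicas into a growable list, so a corrupt
    // entry that matches no known replica cannot shrink the result.
    static String[] safeLocations(List<String> replicas, Set<String> corrupt) {
        List<String> good = new ArrayList<>();
        for (String node : replicas) {
            if (!corrupt.contains(node)) {
                good.add(node);
            }
        }
        return good.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Same addresses as in the debug output above.
        List<String> replicas = Arrays.asList(
            "172.16.1.10:50010", "172.16.1.145:50010", "172.16.1.117:50010");
        Set<String> corrupt = new HashSet<>(Arrays.asList("172.16.1.16:50010"));

        try {
            buggyLocations(replicas, corrupt);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("subtraction-based sizing overflowed the array");
        }
        System.out.println(Arrays.toString(safeLocations(replicas, corrupt)));
    }
}
```

Whether the right fix is to size the array defensively like this, or to keep corruptReplicas and the blocksMap from diverging in the first place, is a separate question that this sketch does not try to answer.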


> ArrayIndexOutOfBoundsException during fsck
> ------------------------------------------
>
>                 Key: HADOOP-4351
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4351
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.1
>            Reporter: Brian Bockelman
>
> After observing a lot of corrupted blocks, I suddenly started to get a lot of ArrayIndexOutOfBoundsException.
> It appears to be an issue very similar to HADOOP-3649, which is supposed to be fixed in 0.18.1.
> 2008-10-06 08:48:43,241 WARN /: /fsck?path=%2F:
> java.lang.ArrayIndexOutOfBoundsException: 2
>    at org.apache.hadoop.dfs.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:789)
>    at org.apache.hadoop.dfs.FSNamesystem.getBlockLocations(FSNamesystem.java:727)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:167)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.fsck(NamenodeFsck.java:128)
>    at org.apache.hadoop.dfs.FsckServlet.doGet(FsckServlet.java:48)
>    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
>    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>    at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>    at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>    at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>    at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>    at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>    at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>    at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>    at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>    at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>    at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
