Posted to general@hadoop.apache.org by Wojciech Langiewicz <wl...@gmail.com> on 2011/03/24 09:43:06 UTC

Problem with NameNode Storage: IMAGE_AND_EDITS Failed

Hello,
Right now I'm having this issue on my cluster.
The table at the bottom of the NameNode web interface shows:
NameNode Storage
Storage Directory	Type	State
/srv/dfs/name	IMAGE_AND_EDITS	Failed

I have had this issue before; I restarted Hadoop and lost many files.
What can I do now so that I don't lose the files written since the last
backup, and what does this warning actually mean?
Any help would be greatly appreciated.
--
Wojciech Langiewicz

Re: Problem with NameNode Storage: IMAGE_AND_EDITS Failed

Posted by Wojciech Langiewicz <wl...@gmail.com>.
Before that there's another exception in the log, which shows that someone
must have changed the open-file limit setting from 64k back to 1024.

So I would like to know whether there is a command to manually trigger
saving the metadata from the NN, so that everything is OK after a
restart.

java.io.IOException: Too many open files
         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
         at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75)
         at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:498)
         at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:185)
         at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
         at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
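
One way to confirm whether the limit really dropped back to 1024 is to compare the number of descriptors the NameNode holds open with the limit that applies to it. The following is a minimal Python sketch, assuming a Linux host with /proc and enough privileges to read the NameNode's process entries. The function names are illustrative and not part of Hadoop.

```python
import os


def parse_max_open_files(limits_text):
    """Extract the (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Values are kept as strings because either may be "unlimited".
            return fields[3], fields[4]
    raise ValueError("no 'Max open files' line found")


def process_fd_usage(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    # Each entry in /proc/<pid>/fd is one open descriptor of that process.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft, hard
```

Calling process_fd_usage() with the NameNode's PID on the NameNode host returns the open descriptor count next to the soft and hard limits. A soft limit of 1024 with a count close to it would be consistent with the "Too many open files" exception above.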



Re: Problem with NameNode Storage: IMAGE_AND_EDITS Failed

Posted by Wojciech Langiewicz <wl...@gmail.com>.
Hello,
Yes, it's the only one, it has free space, and is not corrupted locally. 
I have found this exception in namenode logs:
2011-03-24 09:46:51,531 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.lang.NullPointerException
         at org.apache.hadoop.hdfs.server.namenode.FSImage.getImageFile(FSImage.java:219)
         at org.apache.hadoop.hdfs.server.namenode.FSImage.getFsImageName(FSImage.java:1584)
         at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:75)
         at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:70)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:396)
         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
         at org.apache.hadoop.hdfs.server.namenode.GetImageServlet.doGet(GetImageServlet.java:70)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124)
         at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:826)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
         at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
         at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
         at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
         at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
         at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
         at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
         at org.mortbay.jetty.Server.handle(Server.java:324)
         at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
         at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
         at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

It appears there every 5 minutes.
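
To check the five-minute spacing rather than estimate it by eye, a small script can pull the timestamps of the /getimage warnings out of the NameNode log. This is a minimal Python sketch, assuming the timestamp layout seen in the log line above. The function names are hypothetical.

```python
from datetime import datetime

# Text that appears on the timestamped line of each failed /getimage request.
MARKER = "org.mortbay.log: /getimage:"


def getimage_failure_times(lines):
    """Return the timestamps of log lines that report a /getimage warning."""
    times = []
    for line in lines:
        if MARKER not in line:
            continue
        try:
            # The first 23 characters look like "2011-03-24 09:46:51,531".
            times.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f"))
        except ValueError:
            # Skip lines that contain the marker but do not start with a timestamp.
            continue
    return times


def gaps_in_seconds(times):
    """Return the time difference in seconds between consecutive timestamps."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

Passing the open log file to getimage_failure_times() and the result to gaps_in_seconds() gives the spacing between failures. Values close to 300 would confirm the five-minute pattern.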



Re: Problem with NameNode Storage: IMAGE_AND_EDITS Failed

Posted by Harsh J <qw...@gmail.com>.
Hello,

Can you verify and confirm if that location (is it the only one?) is a
valid one (as in, has free space left, is not corrupt, etc.)?

Take a backup of whatever exists at your dfs.name.dir before you
proceed doing anything.
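
As a rough illustration of that backup step, here is a minimal Python sketch that archives the name directory into a timestamped tarball. It assumes the directory is /srv/dfs/name as reported above and that the destination sits on a different disk. The function name and the destination path are hypothetical.

```python
import tarfile
import time
from pathlib import Path


def backup_name_dir(name_dir, dest_dir):
    """Archive name_dir into dest_dir as a timestamped .tar.gz and return its path."""
    name_dir = Path(name_dir)
    dest_dir = Path(dest_dir)
    # dest_dir should live outside name_dir so the archive does not include itself.
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{name_dir.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps the paths inside the archive relative, e.g. "name/current/...".
        tar.add(str(name_dir), arcname=name_dir.name)
    return archive
```

For example, backup_name_dir("/srv/dfs/name", "/some/other/disk/backups") would copy whatever currently exists in the directory. It does not check whether the image or edits files in it are consistent.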




-- 
Harsh J
http://harshj.com