Posted to mapreduce-user@hadoop.apache.org by Markus Jelsma <ma...@openindex.io> on 2011/12/20 08:16:44 UTC

Shuffle stuck at 0.22.0

Hi,

On 0.22.0 we sometimes see the shuffle phase get stuck, yet the framework does 
not kill the task for lack of progress. The reducer's tasktracker log keeps 
filling up with the same two exceptions all night long:


2011-12-20 06:25:03,711 WARN org.mortbay.log: Committed before 410 getMapOutputs(attempt_201112191334_0039_m_000270_0,attempt_201112191334_0039_m_000264_0,attempt_201112191334_0039_m_000233_0,attempt_201112191334_0039_m_000266_0,attempt_201112191334_0039_m_000231_0,attempt_201112191334_0039_m_000228_0,attempt_201112191334_0039_m_000234_0,attempt_201112191334_0039_m_000309_0,attempt_201112191334_0039_m_000265_0,attempt_201112191334_0039_m_000271_0,attempt_201112191334_0039_m_000268_0,6) failed
2011-12-20 06:25:03,711 ERROR org.mortbay.log: /mapOutput
java.lang.IllegalStateException: Committed
        at org.mortbay.jetty.Response.resetBuffer(Response.java:1023)
        at org.mortbay.jetty.Response.sendError(Response.java:240)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3683)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:874)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
2011-12-20 06:25:03,711 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutputs(attempt_201112191334_0039_m_000264_0,attempt_201112191334_0039_m_000270_0,attempt_201112191334_0039_m_000233_0,attempt_201112191334_0039_m_000266_0,attempt_201112191334_0039_m_000234_0,attempt_201112191334_0039_m_000228_0,attempt_201112191334_0039_m_000231_0,attempt_201112191334_0039_m_000309_0,attempt_201112191334_0039_m_000271_0,attempt_201112191334_0039_m_000265_0,attempt_201112191334_0039_m_000268_0,6) failed
java.io.IOException: error on sending map attempt_201112191334_0039_m_000264_0 to reduce 6
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.sendMapFile(TaskTracker.java:3815)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3675)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:874)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.mortbay.jetty.EofException
        at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
        at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:569)
        at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
        at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
        at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.sendMapFile(TaskTracker.java:3785)
        ... 22 more
Caused by: java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
        at sun.nio.ch.IOUtil.write(IOUtil.java:43)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:170)
        at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
        at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:725)
        ... 28 more


Any thoughts? Each node is configured with a limit of 16k open files.
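
For anyone who wants to check whether the same map outputs keep failing, here is a rough sketch that tallies the "error on sending map ... to reduce N" messages in a tasktracker log. The function name and the sample text are only illustrative assumptions; the pattern is taken from the message shown in the log above and tolerates the line wrapping introduced by mail clients.

```python
import re
from collections import Counter

# Pattern taken from the IOException message in the log above.
# \s+ lets it match even when the message was wrapped across lines.
SEND_ERROR = re.compile(r"error on sending map (attempt_\w+)\s+to reduce (\d+)")


def tally_send_errors(log_text):
    """Count send failures per (map attempt id, reduce number) pair.

    Hypothetical helper for illustration; not part of Hadoop.
    """
    return Counter(
        (attempt, int(reduce_id))
        for attempt, reduce_id in SEND_ERROR.findall(log_text)
    )


if __name__ == "__main__":
    # Sample text copied from the log excerpt, including the wrapped line.
    sample = (
        "java.io.IOException: error on sending map "
        "attempt_201112191334_0039_m_000264_0 \nto reduce 6\n"
    )
    for (attempt, reduce_id), count in tally_send_errors(sample).most_common():
        print(attempt, reduce_id, count)
```

Reading the real log file into a string and passing it to `tally_send_errors` should show whether the failures concentrate on a few map attempts or are spread across all of them.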