Posted to mapreduce-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2012/06/08 03:43:23 UTC

[jira] [Commented] (MAPREDUCE-4298) NodeManager crashed after running out of file descriptors

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291492#comment-13291492 ] 

Jason Lowe commented on MAPREDUCE-4298:
---------------------------------------

This occurred again on one of our clusters.  Turns out I was mistaken earlier: the file descriptor ulimit for our nodemanager daemons is set to 32768, not 8192.  Fortunately this time we were able to examine some nodemanagers that had leaked numerous file descriptors but had not fallen over yet.

Almost all of the file descriptors were referencing map outputs for the shuffle, often with hundreds of file descriptors open to the same file.  Interestingly, almost all of the map files corresponded to just one job.  Examining the NM log around the time that job ran, I found numerous exceptions showing things had not gone smoothly during the shuffle for that job.  For example:

{noformat}
 [New I/O server worker #1-5]java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
        at sun.nio.ch.IOUtil.write(IOUtil.java:56)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$PooledSendBuffer.transferTo(SocketSendBufferPool.java:239)
        at org.jboss.netty.channel.socket.nio.NioWorker.write0(NioWorker.java:470)
        at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:388)
        at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
        at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
        at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:68)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:253)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:123)
        at org.jboss.netty.channel.Channels.write(Channels.java:611)
        at org.jboss.netty.channel.Channels.write(Channels.java:578)
        at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendMapOutput(ShuffleHandler.java:477)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:397)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:144)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:523)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:507)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:444)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:350)
        at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
        at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
{noformat}

Looking closer at the job, I could see that it had run with 15000 maps and 2000 reduces.  Hundreds of the reducers had failed after running out of heap space during the shuffle phase, which led to broken pipe and connection reset errors on the nodemanagers trying to serve shuffle data to those reducers when they died.

I was able to reproduce the broken pipe issue and step through the code with a debugger.  Normally the file descriptor is closed via a listener added to the ChannelFuture returned when the map data is written; that future's operationComplete() callback closes the file.  However, when there is an I/O error sending the shuffle header, Netty closes down the channel automatically (and we also explicitly close it in a channel exception handler).  By the time we try to write the map file data to the channel, the channel is already closed.  Stepping through, I could see that writing to a closed channel means the ChannelFuture's operationComplete() callback is never invoked.  No operationComplete() callback means we leak the file descriptor for the map file.  If multiple map files are being sent to the reducer, we leak multiple file descriptors from a single failed connection.
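
To illustrate, here is a minimal sketch of that write-then-close-in-listener pattern (hypothetical class and variable names, not the actual ShuffleHandler code) and where it leaks once the channel has already been shut down:

{noformat}
import java.io.File;
import java.io.RandomAccessFile;

import org.jboss.netty.channel.Channel;
import org.jboss.netty.channel.ChannelFuture;
import org.jboss.netty.channel.ChannelFutureListener;
import org.jboss.netty.channel.DefaultFileRegion;

// Sketch only: mirrors the shape of the shuffle send path, not the real code.
public class MapOutputSender {
  void sendMapOutput(Channel ch, File mapOutput, long offset, long length)
      throws Exception {
    final RandomAccessFile spill = new RandomAccessFile(mapOutput, "r");
    DefaultFileRegion region =
        new DefaultFileRegion(spill.getChannel(), offset, length);

    // If the channel is already closed (reducer died, shuffle header write
    // failed), Netty 3.2.3 drops this write without ever notifying the
    // future, so the listener below never runs and 'spill' is never closed.
    ChannelFuture writeFuture = ch.write(region);
    writeFuture.addListener(new ChannelFutureListener() {
      public void operationComplete(ChannelFuture future) throws Exception {
        spill.close();  // only reached if Netty completes or fails the future
      }
    });
  }
}
{noformat}

With that pattern, every map output queued for a dead reducer is another descriptor that never gets released, which matches the hundreds of descriptors open to the same file we saw above.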

I searched around and discovered this is a known issue in Netty 3.2.3.Final, the version we're currently using.  See https://issues.jboss.org/browse/NETTY-374.  It's fixed in version 3.2.4.Final.
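
Until the Netty upgrade lands, one possible mitigation (again just a sketch against the hypothetical sender above, not a tested patch) is to check the channel before handing the file region to Netty and close the file ourselves when the write cannot be issued:

{noformat}
if (ch.isOpen()) {
  ChannelFuture writeFuture = ch.write(region);
  writeFuture.addListener(new ChannelFutureListener() {
    public void operationComplete(ChannelFuture future) throws Exception {
      spill.close();  // normal path: close once Netty is done with the region
    }
  });
} else {
  // Channel already torn down (e.g. the reducer died): Netty 3.2.3 would
  // never fire the listener for this write, so close the spill file here.
  spill.close();
}
{noformat}

There is still a small window where the channel can close between the isOpen() check and the write, so this only narrows the leak; picking up 3.2.4.Final is the real fix.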
                
> NodeManager crashed after running out of file descriptors
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4298
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4298
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.3
>            Reporter: Jason Lowe
>
> A node on one of our clusters fell over because it ran out of open file descriptors.  Log details with stack traceback to follow.
