Posted to common-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2014/06/18 20:02:24 UTC
[jira] [Commented] (HADOOP-10718) "IOException: An existing connection was forcibly closed by the remote host" frequently happens on Windows
[ https://issues.apache.org/jira/browse/HADOOP-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036051#comment-14036051 ]
Daryn Sharp commented on HADOOP-10718:
--------------------------------------
I've seen another JIRA with similarly odd Windows TCP connection issues. The "forcibly closed" error means that the normal graceful TCP shutdown (FIN) did not occur and the connection was instead aborted hard (RESET). I tried to read up on the Windows TCP stack and found that close() immediately frees all resources. If there is data remaining to be sent, it is discarded and a RESET is sent. The shutdown() call is supposed to initiate the graceful FIN shutdown.
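To make the FIN versus RESET distinction concrete, here is a minimal loopback sketch of what the reading side observes in each case. It is not Hadoop code: the class and method names are hypothetical, and the abort is forced with SO_LINGER set to 0 (an assumption chosen so that the reset is reproducible on any platform), not by the Windows close() behavior described above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class FinVersusReset {
    /**
     * Closes the accepted side of a loopback connection either gracefully or
     * abortively and reports what a blocking read on the other side sees.
     */
    public static String observe(boolean abortive) throws Exception {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket server = listener.accept()) {
            if (abortive) {
                // Assumption for illustration: linger 0 makes close() abort with a RESET.
                server.setSoLinger(true, 0);
            } else {
                // Half-close: queues a FIN after any data already written.
                server.shutdownOutput();
            }
            server.close();

            InputStream in = client.getInputStream();
            try {
                // A graceful shutdown surfaces as end-of-stream (-1).
                return in.read() == -1 ? "EOF" : "DATA";
            } catch (IOException e) {
                // A reset surfaces as an exception; the message text is platform specific.
                return "RESET";
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("graceful close seen as: " + observe(false));
        System.out.println("abortive close seen as: " + observe(true));
    }
}
```

On Windows the exception raised in the abortive case is where a message like "An existing connection was forcibly closed by the remote host" would show up; other platforms typically report "Connection reset".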
The IPC layer is doing shutdown + close, but apparently Windows isn't behaving correctly. I suspect the errors are non-deterministic because they only occur when the server thread is not context switched between the write ... close; in that case the client thread never gets a chance to read the response.
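One common way to take the scheduling race out of a write ... close sequence is to half-close and then wait, with a bound, for the peer to finish before releasing the socket. The sketch below illustrates that pattern on loopback. It is an assumption on my part, not what org.apache.hadoop.ipc.Server does and not a verified fix for this issue; closeGracefully, roundTrip and the 2000 ms timeout are hypothetical names and values.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class LingeringClose {
    /** Sends FIN, then waits (bounded) for the peer's FIN before freeing the socket. */
    public static void closeGracefully(Socket s, int drainTimeoutMillis) throws IOException {
        try {
            s.shutdownOutput();                 // FIN follows all data already written
            s.setSoTimeout(drainTimeoutMillis); // bound each blocking read below
            InputStream in = s.getInputStream();
            byte[] scratch = new byte[512];
            while (in.read(scratch) != -1) {
                // Discard anything the peer still sends until it closes its side.
            }
        } catch (SocketTimeoutException e) {
            // Peer did not close in time; give up waiting and close anyway.
        } finally {
            s.close();
        }
    }

    /** Server writes one int and closes gracefully; returns what the client read. */
    public static int roundTrip(int payload) throws Exception {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket server = listener.accept()) {
            Thread responder = new Thread(() -> {
                try {
                    new DataOutputStream(server.getOutputStream()).writeInt(payload);
                    closeGracefully(server, 2000);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            responder.start();
            int received = new DataInputStream(client.getInputStream()).readInt();
            client.close(); // the client's FIN lets the server's drain loop finish
            responder.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("client received: " + roundTrip(42));
    }
}
```

The trade-off is that the closing thread stays blocked until the peer closes or the timeout expires, so a real server would need to keep that wait short or move it off its handler threads.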
I'd be curious to know whether Windows sent both the FIN and the RESET. Someone with a Windows machine should get a packet trace.
> "IOException: An existing connection was forcibly closed by the remote host" frequently happens on Windows
> ----------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-10718
> URL: https://issues.apache.org/jira/browse/HADOOP-10718
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Reporter: Zhijie Shen
>
> After HADOOP-317, we still observe a number of "IOException: An existing connection was forcibly closed by the remote host" errors on the Windows platform when running an MR job. For example,
> {code}
> 2014-06-09 09:11:40,675 INFO [Socket Reader #3 for port 59622] org.apache.hadoop.ipc.Server: Socket Reader #3 for port 59622: readAndProcess from client 10.215.30.53 threw exception [java.io.IOException: An existing connection was forcibly closed by the remote host]
> java.io.IOException: An existing connection was forcibly closed by the remote host
> at sun.nio.ch.SocketDispatcher.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
> at sun.nio.ch.IOUtil.read(IOUtil.java:198)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
> at org.apache.hadoop.ipc.Server.channelRead(Server.java:2558)
> at org.apache.hadoop.ipc.Server.access$2800(Server.java:130)
> at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1459)
> at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:750)
> at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:624)
> at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:595)
> {code}
> {code}
> 2014-06-09 09:15:38,539 WARN [main] org.apache.hadoop.mapred.Task: Failure sending commit pending: java.io.IOException: Failed on local exception: java.io.IOException: An existing connection was forcibly closed by the remote host; Host Details : local host is: "sdevin-clster53/10.215.16.72"; destination host is: "sdevin-clster54":63415;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
> at com.sun.proxy.$Proxy9.commitPending(Unknown Source)
> at org.apache.hadoop.mapred.Task.done(Task.java:1006)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:397)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
> at sun.nio.ch.SocketDispatcher.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
> at sun.nio.ch.IOUtil.read(IOUtil.java:198)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
> at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
> at java.io.FilterInputStream.read(FilterInputStream.java:133)
> at java.io.FilterInputStream.read(FilterInputStream.java:133)
> at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:510)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1054)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:949)
> {code}
> The latter one results in the issue described in MAPREDUCE-5924.