Posted to mapreduce-issues@hadoop.apache.org by "Piotr Kołaczkowski (JIRA)" <ji...@apache.org> on 2012/08/02 11:31:02 UTC
[jira] [Created] (MAPREDUCE-4506) EofException / 'connection reset by peer' while copying map output
Piotr Kołaczkowski created MAPREDUCE-4506:
---------------------------------------------
Summary: EofException / 'connection reset by peer' while copying map output
Key: MAPREDUCE-4506
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4506
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.0.3
Environment: Ubuntu Linux 12.04 LTS, 64-bit, Java 6 update 33
Reporter: Piotr Kołaczkowski
Priority: Minor
When running complex MapReduce jobs with many mappers and reducers (e.g. 8 mappers and 8 reducers on an 8-core machine), the following exception sometimes appears in the logs during the shuffle phase:
{noformat}
WARN [570516323@qtp-2060060479-164] 2012-07-19 02:50:21,229 TaskTracker.java (line 3894) getMapOutput(attempt_201207161621_0217_m_000071_0,0) failed :
org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:787)
at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:568)
at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1005)
at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:648)
at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:579)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3872)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
at sun.nio.ch.IOUtil.write(IOUtil.java:43)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:169)
at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:721)
{noformat}
At first this looks like a network problem. However, it turns out that Hadoop's shuffleInMemory sometimes deliberately closes map-output-copy connections, only to reopen them a few milliseconds later, because free memory is temporarily unavailable. The sending side does not expect this, so it throws an exception. It also wastes resources on the sender side, which does more work than necessary serving the additional requests.
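The close-and-reopen behaviour described above can be illustrated with a small, self-contained sketch. It is not taken from the Hadoop source: `CloseAndReconnectSketch`, `ToyRamBudget`, `Sender` and `fetch` are hypothetical names, and the "connection" is just an `InputStream` supplied by the caller. It only shows why one map output can cost the sender more than one request when the receiver cannot reserve memory right away.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: none of these names come from the Hadoop code base.
class CloseAndReconnectSketch {

    /** Toy memory budget whose tryReserve() never blocks. */
    static class ToyRamBudget {
        private final long capacity;
        private long used;

        ToyRamBudget(long capacity) { this.capacity = capacity; }

        synchronized boolean tryReserve(long bytes) {
            if (used + bytes > capacity) {
                return false;
            }
            used += bytes;
            return true;
        }

        synchronized void release(long bytes) { used -= bytes; }
    }

    /** Stands in for one HTTP request to the map-output server. */
    interface Sender {
        InputStream open() throws IOException;
    }

    /** Fetches one map output and returns how many connections it needed. */
    static int fetch(Sender sender, ToyRamBudget ram, int size, Runnable waitForMemory)
            throws IOException {
        int connections = 1;
        InputStream in = sender.open();
        while (!ram.tryReserve(size)) {
            in.close();           // receiver hangs up; the sender sees a reset connection
            waitForMemory.run();  // stands in for waiting until memory is freed
            in = sender.open();   // second request for the same data
            connections++;
        }
        try {
            byte[] buf = new byte[size];
            int off = 0;
            while (off < buf.length) {
                int n = in.read(buf, off, buf.length - off);
                if (n < 0) {
                    break;
                }
                off += n;
            }
        } finally {
            in.close();
            ram.release(size);
        }
        return connections;
    }
}
```

In this toy model, a fetch that finds the budget full opens two connections for a single map output, and the first one is dropped while the sender may still be writing to it.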
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4506) EofException / 'connection reset by peer' while copying map output
Posted by "Piotr Kołaczkowski (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Piotr Kołaczkowski updated MAPREDUCE-4506:
------------------------------------------
Status: Patch Available (was: Open)
I attach a patch disabling the 'break connection' feature.
[jira] [Commented] (MAPREDUCE-4506) EofException / 'connection reset by peer' while copying map output
Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427310#comment-13427310 ]
Robert Joseph Evans commented on MAPREDUCE-4506:
------------------------------------------------
Piotr,
I agree that the exception is ugly to see, but I am not sure that your fix is the right one for 1.0. Making the memory manager block means that the Jetty thread on the TaskTracker will not be able to serve results to others, which could slow down processing for very large jobs. I think it would be much better to do something like this on trunk, where the problem also exists: there the NodeManager uses Netty instead of Jetty, so what is blocked is not a full thread but just some memory and a file descriptor. This assumes that we have a timeout on how long the thread waits for memory to become available.
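As a rough illustration of the "block, but only up to a timeout" idea in the comment above, here is a minimal sketch built on `java.util.concurrent.Semaphore`. It is neither the attached patch nor Hadoop's RamManager; the class name `TimedRamBudget` and the choice of one permit per kilobyte are assumptions made only for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the attached patch and not Hadoop's RamManager.
class TimedRamBudget {
    // One permit per kilobyte (an assumption, to keep counts within int range).
    private final Semaphore permits;

    TimedRamBudget(int capacityKb) {
        // Fair semaphore: waiting callers are served in arrival order.
        this.permits = new Semaphore(capacityKb, true);
    }

    /** Waits up to timeoutMs for memory; returns false if the wait timed out. */
    boolean reserve(int kb, long timeoutMs) throws InterruptedException {
        return permits.tryAcquire(kb, timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Returns previously reserved memory to the budget. */
    void release(int kb) {
        permits.release(kb);
    }
}
```

A caller that gets `false` back can decide what to do next (for example, give up on the in-memory copy) instead of holding the connection open indefinitely. A request larger than the total capacity can never succeed in this sketch and simply times out.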
[jira] [Updated] (MAPREDUCE-4506) EofException / 'connection reset by peer' while copying map output
Posted by "Piotr Kołaczkowski (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Piotr Kołaczkowski updated MAPREDUCE-4506:
------------------------------------------
Attachment: ReduceTask.patch
RamManager.patch
Patch fixing the EofExceptions in shuffleInMemory. Additionally, map output copy connections are now wrapped in a try-finally block to ensure they are closed properly.
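The try-finally pattern mentioned above looks roughly like the following. This is a generic sketch, not the contents of ReduceTask.patch; `CopyWithFinallySketch` and `copyAndClose` are hypothetical names, and a plain `InputStream` stands in for the map-output connection. The older try-finally form is used rather than try-with-resources because the report targets Java 6.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: not the contents of ReduceTask.patch.
class CopyWithFinallySketch {

    /** Reads the stream fully and closes it whether or not the read succeeds. */
    static byte[] copyAndClose(InputStream in) throws IOException {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            in.close(); // runs on normal return and when read() throws
        }
    }
}
```

If `read()` fails partway through, the exception still propagates to the caller, but the stream is closed first, so the connection is not leaked.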
[jira] [Commented] (MAPREDUCE-4506) EofException / 'connection reset by peer' while copying map output
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427221#comment-13427221 ]
Hadoop QA commented on MAPREDUCE-4506:
--------------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12538889/ReduceTask.patch
against trunk revision .
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2699//console
This message is automatically generated.