Posted to issues@tez.apache.org by "Ming Ma (JIRA)" <ji...@apache.org> on 2016/05/20 19:05:12 UTC

[jira] [Created] (TEZ-3263) Improved shuffle error handling across NM restarts

Ming Ma created TEZ-3263:
----------------------------

             Summary: Improved shuffle error handling across NM restarts
                 Key: TEZ-3263
                 URL: https://issues.apache.org/jira/browse/TEZ-3263
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Ming Ma


The fix could be something similar to MAPREDUCE-5891. Below is one exception observed during an NM (NodeManager) rolling restart.

{noformat}
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:579)
	at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:653)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1325)
	at org.apache.tez.http.HttpConnection.getInputStream(HttpConnection.java:247)
	at org.apache.tez.runtime.library.common.shuffle.Fetcher.setupConnection(Fetcher.java:464)
{noformat}
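
To make the idea concrete, here is a minimal sketch of one possible approach: retry the connection for a bounded time window when it is refused, on the assumption that the NM will come back shortly. This is not the Tez Fetcher code and not the MAPREDUCE-5891 patch. The class FetchRetrySketch, the method connectWithRetry, and the parameters retryIntervalMs and retryTimeoutMs are all hypothetical names chosen for illustration.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Tez: retries a connection attempt
// while the remote side (e.g. a restarting NM) refuses connections.
final class FetchRetrySketch {

    private FetchRetrySketch() {}

    /**
     * Runs the given connect action. If it fails with ConnectException,
     * sleeps retryIntervalMs and tries again, as long as the next attempt
     * would still start within retryTimeoutMs of the first attempt.
     * Any other exception is propagated immediately.
     */
    static <T> T connectWithRetry(Callable<T> connect,
                                  long retryIntervalMs,
                                  long retryTimeoutMs) throws Exception {
        final long startNanos = System.nanoTime();
        while (true) {
            try {
                return connect.call();
            } catch (ConnectException e) {
                long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000L;
                // Give up once waiting another interval would exceed the window.
                if (elapsedMs + retryIntervalMs > retryTimeoutMs) {
                    throw e;
                }
                Thread.sleep(retryIntervalMs);
            }
        }
    }
}
```

In this sketch the caller would wrap whatever opens the shuffle connection in the Callable, and would only report the fetch as failed once the retry window has passed. The interval and window values would presumably need to be configurable and tuned to how long an NM restart typically takes.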


