Posted to issues@tez.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2014/08/03 02:45:12 UTC

[jira] [Comment Edited] (TEZ-1343) Bypass the Fetcher and read directly from the local filesystem if source vertex ran on the same host

    [ https://issues.apache.org/jira/browse/TEZ-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083816#comment-14083816 ] 

Daniel Dai edited comment on TEZ-1343 at 8/3/14 12:44 AM:
----------------------------------------------------------

[~sseth], TEZ-1343.1.2.txt fails for me, but TEZ-1343.WIP.2.patch works.

Error message:
{code}
java.net.ConnectException, Can't assign requested address]
2014-08-02 17:43:41,950 [fetcher [scope_25] #2] WARN  org.apache.tez.runtime.library.common.shuffle.impl.Fetcher - Failed to connect to daijymacpro-2:0 with 1 inputs
java.net.ConnectException: Can't assign requested address
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431)
	at java.net.Socket.connect(Socket.java:527)
	at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:424)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:538)
	at sun.net.www.http.HttpClient.<init>(HttpClient.java:214)
	at sun.net.www.http.HttpClient.New(HttpClient.java:300)
	at sun.net.www.http.HttpClient.New(HttpClient.java:319)
	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:987)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:923)
	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:841)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
	at org.apache.tez.runtime.library.shuffle.common.HttpConnection.validate(HttpConnection.java:192)
	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.copyFromHost(Fetcher.java:227)
	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.run(Fetcher.java:137)

{code}


was (Author: daijy):
[~sseth], TEZ-1343.1.2.txt fails for me, but TEZ-1343.WIP.2.patch works.

Error message:
{code}
2014-08-02 17:32:34,610 [AsyncDispatcher event handler] INFO  org.apache.tez.dag.history.HistoryEventHandler - [HISTORY][DAG:dag_1407025952092_0001_1][Event:TASK_FINISHED]: vertexName=scope-26, taskId=task_1407025952092_0001_1_01_000000, startTime=1407025954444, finishTime=1407025954610, timeTaken=166, status=KILLED, successfulAttemptID=null, diagnostics=TaskAttempt 0 failed, info=[Error: exceptionThrown=org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$ShuffleError: error in shuffle in fetcher [scope_25] #1
	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:329)
	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:311)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:695)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
	at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:350)
	at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copyFailed(ShuffleScheduler.java:267)
	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.copyFromHost(Fetcher.java:250)
	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.run(Fetcher.java:137)
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$ShuffleError: error in shuffle in fetcher [scope_25] #1
	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:329)
	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:311)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:695)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
	at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:350)
	at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copyFailed(ShuffleScheduler.java:267)
	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.copyFromHost(Fetcher.java:250)
	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.run(Fetcher.java:137)
]
{code}

> Bypass the Fetcher and read directly from the local filesystem if source vertex ran on the same host
> ----------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1343
>                 URL: https://issues.apache.org/jira/browse/TEZ-1343
>             Project: Apache Tez
>          Issue Type: Task
>    Affects Versions: 0.4.1
>            Reporter: Prakash Ramachandran
>            Assignee: Prakash Ramachandran
>             Fix For: 0.5.0
>
>         Attachments: TEZ-1343.1.1.txt, TEZ-1343.1.2.txt, TEZ-1343.1.patch, TEZ-1343.WIP.1.patch, TEZ-1343.WIP.2.patch
>
>
> If the source vertex and the current vertex ran on the same host, bypass the Fetcher and read the source output directly from the local filesystem.
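
For illustration, the idea in the description above could be sketched roughly as follows. This is not the code from any of the attached patches. The class LocalFetchSketch, the methods isLocalHost and openInput, and the URL path "/hypothetical-shuffle-path" are all made-up names, and the sketch assumes the caller already knows where the source output sits on local disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch only; not taken from the Tez code base or the TEZ-1343 patches.
public class LocalFetchSketch {

    /** True when the host that produced the output is the host we are running on. */
    static boolean isLocalHost(String sourceHost, String localHost) {
        return sourceHost != null && sourceHost.equalsIgnoreCase(localHost);
    }

    /**
     * Opens the source output from local disk when the source ran on this host
     * and the file is readable; otherwise falls back to an HTTP fetch.
     */
    static InputStream openInput(String sourceHost, int port, String localHost,
                                 Path localOutput) throws IOException {
        if (isLocalHost(sourceHost, localHost) && Files.isReadable(localOutput)) {
            // Same host: skip the network round trip entirely.
            return Files.newInputStream(localOutput);
        }
        // Different host (or file missing): fetch over HTTP.
        // The path below is a placeholder, not the real shuffle URL layout.
        URL url = new URL("http", sourceHost, port, "/hypothetical-shuffle-path");
        return url.openStream();
    }
}
```

In such a sketch the local host name could come from InetAddress.getLocalHost().getHostName(), and keeping the HTTP fallback means a missing or unreadable local file does not fail the task outright.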


