Posted to issues@tez.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2021/12/03 13:32:00 UTC

[jira] [Updated] (TEZ-4357) Report full url to logs in case of fetcher connection failure

     [ https://issues.apache.org/jira/browse/TEZ-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated TEZ-4357:
------------------------------
    Description: 
Currently, when Fetcher and FetcherOrderedGrouped fail on getInputStream, like:
{code}
2021-12-03 08:32:04,634 [WARN] [Fetcher_O {expDataFile} #11] |orderedgrouped.FetcherOrderedGrouped|: Failed to verify reply after connecting from hwc7213-6.hwc7213.root.hwx.site to hwc7213-7.hwc7213.root.hwx.site:13562 with 1 inputs pending
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1593)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
        at org.apache.tez.http.HttpConnection.getInputStream(HttpConnection.java:260)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:362)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:265)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:184)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:196)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:59)
{code}

they don't report the full URL, which is important for understanding the failure. Looking at the INFO-level ShuffleHandler logs, I can see that the request is not a valid SSL request, which makes me think that the fetcher is simply running with invalid settings. If the full URL were logged, I could verify whether the fetcher was trying to connect securely (i.e. protocol: https).
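
As a rough illustration of what the improved warning could contain, here is a minimal sketch. It is not the actual Tez code: the class FetcherUrlLogging and its methods are hypothetical, and the URL path and query are made up for the example. The idea is simply to append the full URL to the existing message so that the protocol (http vs. https) is visible in the log.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, NOT part of Tez: shows one way a fetcher warning
// could carry the full URL so the protocol is visible in the logs.
final class FetcherUrlLogging {

    // Builds a warning similar to the one quoted above, plus the full URL.
    public static String failureMessage(String localHost, URL url, int pendingInputs) {
        return "Failed to verify reply after connecting from " + localHost
            + " to " + url.getHost() + ":" + url.getPort()
            + " with " + pendingInputs + " inputs pending, url=" + url;
    }

    // True only when the fetcher would connect over https.
    public static boolean isSecure(URL url) {
        return "https".equalsIgnoreCase(url.getProtocol());
    }

    public static void main(String[] args) throws MalformedURLException {
        // Illustrative URL only; the path and query are assumptions.
        URL url = URI.create(
            "http://hwc7213-7.hwc7213.root.hwx.site:13562/mapOutput?job=example").toURL();
        System.out.println(failureMessage("hwc7213-6.hwc7213.root.hwx.site", url, 1));
        System.out.println("secure=" + isSecure(url));
    }
}
```

With a message like this, a plain http URL in the warning would immediately confirm that the fetcher is not using SSL, without having to cross-check the ShuffleHandler logs.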

> Report full url to logs in case of fetcher connection failure
> -------------------------------------------------------------
>
>                 Key: TEZ-4357
>                 URL: https://issues.apache.org/jira/browse/TEZ-4357
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: László Bodor
>            Priority: Major


