You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/09/22 10:17:05 UTC

[jira] [Commented] (HADOOP-14531) Improve S3A error handling & reporting

    [ https://issues.apache.org/jira/browse/HADOOP-14531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176228#comment-16176228 ] 

Steve Loughran commented on HADOOP-14531:
-----------------------------------------

you can get a 443 sometimes; no response. Retryable on idempotent calls only
{code}
2017-02-07 10:00:44,200 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1486454881801_0009_m_000024_0 is allowed to commit now
2017-02-07 10:01:07,950 INFO [s3a-transfer-shared-pool1-t7] com.amazonaws.http.AmazonHttpClient: Unable to execute HTTP request: hwdev-rajesh-new2.s3.amazonaws.com:443 failed to respond
org.apache.http.NoHttpResponseException: hwdev-rajesh-new2.s3.amazonaws.com:443 failed to respond
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:259)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:209)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
        at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:66)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:686)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:488)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
        at com.amazonaws.services.s3.AmazonS3Client.copyPart(AmazonS3Client.java:1731)
        at com.amazonaws.services.s3.transfer.internal.CopyPartCallable.call(CopyPartCallable.java:41)
        at com.amazonaws.services.s3.transfer.internal.CopyPartCallable.call(CopyPartCallable.java:28)
        at org.apache.hadoop.fs.s3a.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:222)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

> Improve S3A error handling & reporting
> --------------------------------------
>
>                 Key: HADOOP-14531
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14531
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.8.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Blocker
>
> Improve S3a error handling and reporting
> this includes
> # looking at error codes and translating to more specific exceptions
> # better retry logic where present
> # adding retry logic where not present
> # more diagnostics in exceptions 
> # docs
> Overall goals
> * things that can be retried and will go away are retried for a bit
> * things that don't go away when retried failfast (302, no auth, unknown host, connection refused)
> * meaningful exceptions are built in translate exception
> * diagnostics are included, where possible
> * our troubleshooting docs are expanded with new failures we encounter
> AWS S3 error codes: http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org