Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/07/23 13:32:46 UTC

[GitHub] [hadoop] steveloughran commented on pull request #3222: HADOOP-17812. NPE in S3AInputStream read() after failure to reconnect to store

steveloughran commented on pull request #3222:
URL: https://github.com/apache/hadoop/pull/3222#issuecomment-885641933


   > when wrappedStream is null, the IOException is thrown, then the catch block will call onReadFailure to retry.
   
   Yes, but the exception raised is a generic IOE, *not the underlying cause*, so the retry logic won't examine the failure; it will simply give up.
   
   If you look at the [S3ARetryPolicy](https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ARetryPolicy.java#L176) you can see how no attempt is made to retry a generic IOE. Therefore there will be precisely one retry attempt (the exception handler), and if that doesn't fix it (e.g. server not yet recovered): Failure.
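
A rough sketch of why the exception type matters, assuming a much-simplified policy. `SketchRetryPolicy` and its `Decision` enum are hypothetical names invented for this illustration; this is not the real `S3ARetryPolicy`, it only mimics the idea that connectivity failures are retried while a generic `IOException` is not.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical, simplified stand-in for a retry policy.
// It is NOT the real S3ARetryPolicy; it only illustrates type-based decisions.
final class SketchRetryPolicy {
    enum Decision { RETRY, FAIL }

    private final int maxRetries;

    SketchRetryPolicy(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Decide from the exception type and the number of retries already made. */
    Decision shouldRetry(Exception e, int retriesSoFar) {
        if (retriesSoFar >= maxRetries) {
            return Decision.FAIL;
        }
        // Timeouts are assumed to be transient, so they are worth retrying.
        if (e instanceof SocketTimeoutException) {
            return Decision.RETRY;
        }
        // A generic IOException says nothing about the cause: give up.
        return Decision.FAIL;
    }
}
```

With this sketch, `shouldRetry(new IOException("null stream"), 0)` fails immediately, while a `SocketTimeoutException` is retried until the limit is reached.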
   
   >  for the code suggestion you gave is the same as the PR
   
   I am proposing that on entry to the method, the full attempt to reconnect is made if the stream is null.
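
Roughly what I mean, as a minimal sketch rather than the actual `S3AInputStream` code. `ReopeningStream`, `Opener`, and `openAttempts` are hypothetical names made up for this example, and the sketch assumes the caller wraps `read()` in whatever retry logic is configured.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; not the actual S3AInputStream implementation.
final class ReopeningStream {
    /** Hypothetical hook that opens (or reopens) the underlying stream. */
    interface Opener {
        InputStream open() throws IOException;
    }

    private final Opener opener;
    private InputStream wrapped;   // null until opened, and after a failed read
    int openAttempts = 0;          // exposed only so the sketch can be checked

    ReopeningStream(Opener opener) {
        this.opener = opener;
    }

    int read() throws IOException {
        if (wrapped == null) {
            // Reconnect on entry. Any exception from open() propagates with
            // its real type, so an outer retry policy can classify it.
            openAttempts++;
            wrapped = opener.open();
        }
        try {
            return wrapped.read();
        } catch (IOException e) {
            // Drop the broken stream; the next call reconnects on entry.
            wrapped = null;
            throw e;
        }
    }
}
```

The point of the design is that a null stream never turns into a synthetic exception: either the reconnect succeeds, or the caller sees the real connectivity failure.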
   
   In the test which this PR is *going to need*, the issue will become apparent if the simulated failure is a sequence of
   
   1. succeed, returning a stream
   2. throw SocketTimeoutException on the first read()
   3. throw ConnectTimeoutException three times
   4. then return a stream whose read() returns a character
   
   With ConnectTimeoutException being raised on the reconnect, the retry will try to connect with backoff, jitter, and a configurable limit. Throwing a simple IOE will fail on the first retry.
   
   (the test case should also set up a retry policy with a retry interval of 0ms so it doesn't trigger any delays)
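
The simulated failure sequence could be scripted along these lines. This is a standalone sketch, not a Hadoop test: `ReconnectSequenceSketch`, `Opener`, `scripted`, and `readWithRetries` are hypothetical names, the nested `ConnectTimeoutException` is a local stand-in declared only so the example compiles on its own, and the retry loop is a simplification that retries timeouts only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical standalone sketch of the fault sequence; not a Hadoop test.
final class ReconnectSequenceSketch {
    /** Local stand-in for a connect-timeout exception (assumption). */
    static final class ConnectTimeoutException extends SocketTimeoutException {
        ConnectTimeoutException(String message) {
            super(message);
        }
    }

    interface Opener {
        InputStream open() throws IOException;
    }

    /** Stream whose read() always throws SocketTimeoutException (step 2). */
    static InputStream failingStream() {
        return new InputStream() {
            @Override public int read() throws IOException {
                throw new SocketTimeoutException("simulated read timeout");
            }
        };
    }

    /** Stream whose read() returns the given character (step 4). */
    static InputStream goodStream(int b) {
        return new InputStream() {
            @Override public int read() {
                return b;
            }
        };
    }

    /** Scripted opener: one failing stream, three connect timeouts, one good stream. */
    static Opener scripted(int[] openCalls) {
        Deque<Opener> script = new ArrayDeque<>();
        script.add(() -> failingStream());
        for (int i = 0; i < 3; i++) {
            script.add(() -> { throw new ConnectTimeoutException("simulated connect timeout"); });
        }
        script.add(() -> goodStream('x'));
        return () -> {
            openCalls[0]++;
            return script.remove().open();
        };
    }

    /** Read one byte, reconnecting on entry when the stream is null; only timeouts are retried. */
    static int readWithRetries(Opener opener, int maxRetries, long intervalMs)
            throws IOException, InterruptedException {
        InputStream stream = null;
        for (int attempt = 0; ; attempt++) {
            try {
                if (stream == null) {
                    stream = opener.open();
                }
                return stream.read();
            } catch (SocketTimeoutException e) {
                stream = null;                 // force a reconnect next time round
                if (attempt >= maxRetries) {
                    throw e;                   // retry limit reached
                }
                Thread.sleep(intervalMs);      // 0 ms in tests, so no delay
            }
        }
    }
}
```

Calling `readWithRetries(scripted(calls), 5, 0L)` works through all five scripted opens before returning the character; a generic `IOException` thrown anywhere in the sequence would not be caught by the loop and would fail straight away.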
   
   +@majdyz


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



For additional commands, e-mail: common-issues-help@hadoop.apache.org