Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2019/05/08 15:39:30 UTC

[GitHub] [hadoop] ben-roling commented on issue #794: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

URL: https://github.com/apache/hadoop/pull/794#issuecomment-490537876
 
 
   I've pushed a commit that adds retries as discussed in https://github.com/apache/hadoop/pull/675#issuecomment-488614814
   
   The retries happen in S3AInputStream if the version doesn't match on initial open.  There are no retries if the version doesn't match on re-open (during seek() backwards).
   
   Retries also happen for rename() and select().
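
As a rough illustration of the behavior described above (not the code in the commit), the sketch below retries a version check on initial open and fails immediately on re-open. The class `VersionRetrySketch`, the exception `VersionMismatchException`, and the methods `openWithRetries` and `reopen` are hypothetical names, and a real implementation would presumably wait between attempts rather than loop immediately.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

final class VersionRetrySketch {

    /** Hypothetical exception for a version that does not match the expected one. */
    static class VersionMismatchException extends RuntimeException {
        VersionMismatchException(String message) {
            super(message);
        }
    }

    /**
     * Initial open: ask the store for the object's version until it matches
     * the expected one, giving up after maxAttempts. Returns the attempt
     * number on which the versions matched.
     */
    static int openWithRetries(String expectedVersion,
                               Supplier<String> fetchVersion,
                               int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String actual = fetchVersion.get();
            if (expectedVersion.equals(actual)) {
                return attempt;
            }
            // Assumption: a real client would back off here before retrying.
        }
        throw new VersionMismatchException(
            "version still not " + expectedVersion + " after " + maxAttempts + " attempts");
    }

    /** Re-open (for example after a backwards seek): one check, no retries. */
    static void reopen(String expectedVersion, Supplier<String> fetchVersion) {
        String actual = fetchVersion.get();
        if (!expectedVersion.equals(actual)) {
            throw new VersionMismatchException(
                "expected " + expectedVersion + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Two stale responses followed by the expected version.
        Iterator<String> responses = List.of("v1", "v1", "v2").iterator();
        int attempts = openWithRetries("v2", responses::next, 5);
        System.out.println("matched after " + attempts + " attempt(s)");
    }
}
```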
   
   Testing was added in ITestS3ARemoteFileChanged.  I used Mockito.spy() on the S3 client to stub in inconsistent responses until a retry threshold is reached.
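
The idea of returning inconsistent responses until a threshold is reached can be sketched without Mockito. The following is a dependency-free analogue, not the stubbing used in ITestS3ARemoteFileChanged; `InconsistentResponseStub` and its parameters are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical stub: returns a stale version for the first N calls,
 * then the current version on every call after that.
 */
final class InconsistentResponseStub implements Supplier<String> {
    private final String staleVersion;
    private final String currentVersion;
    private final int staleResponses;
    private final AtomicInteger calls = new AtomicInteger();

    InconsistentResponseStub(String staleVersion, String currentVersion, int staleResponses) {
        this.staleVersion = staleVersion;
        this.currentVersion = currentVersion;
        this.staleResponses = staleResponses;
    }

    @Override
    public String get() {
        int call = calls.incrementAndGet();
        // Calls 1..staleResponses see the old version; later calls see the new one.
        return call <= staleResponses ? staleVersion : currentVersion;
    }

    /** Number of times the stub has been asked for a version. */
    int callCount() {
        return calls.get();
    }

    public static void main(String[] args) {
        InconsistentResponseStub stub = new InconsistentResponseStub("v1", "v2", 2);
        for (int i = 0; i < 3; i++) {
            System.out.println(stub.get());
        }
    }
}
```

A test built on such a stub can set the number of stale responses just below or just above the retry limit to exercise both the success and failure paths.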
   
   I've run the full test suite (against a bucket with versioning enabled in us-west-2):
   
   ```
   mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=8 -Ds3guard -Ddynamo
   ```
   
   ```
   [ERROR] Tests run: 896, Failures: 0, Errors: 2, Skipped: 145
   ```
   
   The two errors were in ITestDirectoryCommitMRJob and ITestS3GuardConcurrentOps, both of which succeeded when run individually:
   
   ```
   mvn -T 1C verify -Dtest=skip -Dit.test=ITestDirectoryCommitMRJob -Ds3guard -Ddynamo
   mvn -T 1C verify -Dtest=skip -Dit.test=ITestS3GuardConcurrentOps -Ds3guard -Ddynamo
   ```
   
   https://github.com/apache/hadoop/pull/675#issuecomment-488614814 suggests possibly using different retry settings for these scenarios.  I haven't done that yet; perhaps it can be split off as a separate issue.  Similarly, I haven't implemented the HADOOP-13293 proposal.  I'm open to both, but would like to get the rest of this settled (merged) first if possible.
