Posted to common-issues@hadoop.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2016/02/25 07:05:18 UTC
[jira] [Updated] (HADOOP-12444) Consider implementing lazy seek in S3AInputStream
[ https://issues.apache.org/jira/browse/HADOOP-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajesh Balamohan updated HADOOP-12444:
--------------------------------------
Attachment: HADOOP-12444.3.patch
Thanks [~thodemoor]. Attaching the revised patch. I will upload the test report shortly.
The same tests fail both on master and with the patch applied (2 failures and 2 errors in each run).
AWS tests without patch (“mvn clean package” from hadoop/hadoop-tools/hadoop-aws):
======================================================
Results :
========
Failed tests:
TestS3Credentials.noSecretShouldThrow Expected exception: java.lang.IllegalArgumentException
TestS3Credentials.noAccessIdShouldThrow Expected exception: java.lang.IllegalArgumentException
Tests in error:
TestS3AContractRootDir>AbstractContractRootDirectoryTest.testListEmptyRootDirectory:134 » FileNotFound
TestS3AConfiguration.TestAutomaticProxyPortSelection:138 » AmazonS3 Forbidden ...
Tests run: 220, Failures: 2, Errors: 2, Skipped: 6
AWS tests with patch
================
Results :
========
Failed tests:
TestS3Credentials.noSecretShouldThrow Expected exception: java.lang.IllegalArgumentException
TestS3Credentials.noAccessIdShouldThrow Expected exception: java.lang.IllegalArgumentException
Tests in error:
TestS3AContractRootDir>AbstractContractRootDirectoryTest.testListEmptyRootDirectory:134 » FileNotFound
TestS3AConfiguration.TestAutomaticProxyPortSelection:138 » AmazonS3 Forbidden ...
Tests run: 220, Failures: 2, Errors: 2, Skipped: 6
{noformat}
Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.75 sec <<< FAILURE! - in org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
testListEmptyRootDirectory(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir) Time elapsed: 1.633 sec <<< ERROR!
java.io.FileNotFoundException: No such file or directory: /
at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1000)
at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:738)
at org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testListEmptyRootDirectory(AbstractContractRootDirectoryTest.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
TestAutomaticProxyPortSelection(org.apache.hadoop.fs.s3a.TestS3AConfiguration) Time elapsed: 620.356 sec <<< ERROR!
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: null)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3738)
at com.amazonaws.services.s3.AmazonS3Client.listMultipartUploads(AmazonS3Client.java:2796)
at com.amazonaws.services.s3.transfer.TransferManager.abortMultipartUploads(TransferManager.java:1217)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:313)
at org.apache.hadoop.fs.s3a.S3ATestUtils.createTestFileSystem(S3ATestUtils.java:51)
at org.apache.hadoop.fs.s3a.TestS3AConfiguration.TestAutomaticProxyPortSelection(TestS3AConfiguration.java:138)
{noformat}
> Consider implementing lazy seek in S3AInputStream
> -------------------------------------------------
>
> Key: HADOOP-12444
> URL: https://issues.apache.org/jira/browse/HADOOP-12444
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.7.1
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: HADOOP-12444.1.patch, HADOOP-12444.2.patch, HADOOP-12444.3.patch, HADOOP-12444.WIP.patch
>
>
> - Currently, "read(long position, byte[] buffer, int offset, int length)" is not implemented in S3AInputStream (unlike DFSInputStream). So, "readFully(long position, byte[] buffer, int offset, int length)" in S3AInputStream goes through the default implementation of seek(), read(), seek() in FSInputStream.
> - However, seek() in S3AInputStream re-opens the connection to S3 every time (https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L115).
> - It would be good to consider a lazy seek implementation to reduce connection overhead to S3 (e.g., Presto implements lazy seek: https://github.com/facebook/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/PrestoS3FileSystem.java#L623).
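
The lazy seek idea described above can be sketched roughly as follows. This is only an illustration, not the attached patch and not the S3AInputStream code: RangeOpener, LazySeekStream, LazySeekDemo and the opens counter are hypothetical names, and an in-memory byte array stands in for the S3 object. seek() only records the target offset; the underlying stream is reopened on the next read(), and only if it is not already positioned there.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for "open the object at this offset"; not a Hadoop or AWS API. */
interface RangeOpener {
    InputStream openAt(long offset);
}

/** Sketch of lazy seek: seek() records the target, read() reopens only when needed. */
class LazySeekStream {
    private final RangeOpener opener;
    private InputStream wrapped;   // currently open stream, or null before the first read
    private long streamPos;        // offset the open stream is actually positioned at
    private long nextReadPos;      // offset requested by the most recent seek()
    int opens;                     // number of (re)opens, kept only for illustration

    LazySeekStream(RangeOpener opener) {
        this.opener = opener;
    }

    /** Cheap: no connection is opened or closed here. */
    void seek(long pos) {
        if (pos < 0) {
            throw new IllegalArgumentException("negative seek: " + pos);
        }
        nextReadPos = pos;
    }

    long getPos() {
        return nextReadPos;
    }

    /** Reopens only if the open stream is not already at the requested offset. */
    int read() throws IOException {
        if (wrapped == null || streamPos != nextReadPos) {
            if (wrapped != null) {
                wrapped.close();
            }
            wrapped = opener.openAt(nextReadPos);
            streamPos = nextReadPos;
            opens++;
        }
        int b = wrapped.read();
        if (b >= 0) {
            streamPos++;
            nextReadPos++;
        }
        return b;
    }
}

class LazySeekDemo {
    /** Mirrors the seek(), read(), seek() pattern of a default positioned read. */
    static int readAt(LazySeekStream in, long position) throws IOException {
        long old = in.getPos();
        try {
            in.seek(position);
            return in.read();
        } finally {
            in.seek(old);   // with lazy seek this seek-back costs nothing by itself
        }
    }

    /** One sequential read followed by two positioned reads; returns the open count. */
    static int demoOpens() {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        LazySeekStream in = new LazySeekStream(
            off -> new ByteArrayInputStream(data, (int) off, data.length - (int) off));
        try {
            in.read();        // first open, at offset 0
            readAt(in, 5);    // one reopen for the read; the seek-back is deferred
            readAt(in, 8);    // one reopen again
            return in.opens;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the demo performs three opens. If every seek() reopened the stream, the same sequence would cost one open for the first read plus two per positioned read, five in total.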
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)