Posted to common-issues@hadoop.apache.org by "Ajith S (JIRA)" <ji...@apache.org> on 2015/09/04 10:42:46 UTC
[jira] [Commented] (HADOOP-12376) S3NInputStream.close() downloads the remaining bytes of the object from S3
[ https://issues.apache.org/jira/browse/HADOOP-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730525#comment-14730525 ]
Ajith S commented on HADOOP-12376:
----------------------------------
Hi,
After initial analysis, I found that in the jets3t jar, this particular change
https://bitbucket.org/jmurty/jets3t/diff/src/org/jets3t/service/impl/rest/httpclient/HttpMethodReleaseInputStream.java?diff2=3709f8458ba6&at=default
{code}
     if (!underlyingStreamConsumed) {
         // Underlying input stream has not been consumed, abort method
         // to force connection to be closed and cleaned-up.
-        httpMethod.abort();
+        httpResponse.getEntity().consumeContent(); // current version consumes the entity via a utility
     }
-    httpMethod.releaseConnection();
     alreadyReleased = true;
 }
{code}
is causing the issue: instead of aborting the request, the new code consumes the remaining stream content before closing, so closing a partially read stream downloads the rest of the object.
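To illustrate the behaviour the client needs, here is a minimal sketch of a close() that aborts rather than drains when bytes remain, modelled loosely on the approach taken for S3AInputStream in HADOOP-11570. The class name {{RemoteObjectStream}} and the method {{bytesRemaining()}} are hypothetical, not the actual jets3t or Hadoop API:

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper around an HTTP object stream; names are illustrative.
class RemoteObjectStream extends InputStream {
    private final InputStream wrapped;
    private final long contentLength;
    private long pos;
    private boolean closed;

    RemoteObjectStream(InputStream wrapped, long contentLength) {
        this.wrapped = wrapped;
        this.contentLength = contentLength;
    }

    @Override
    public int read() throws IOException {
        int b = wrapped.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    /** Bytes of the object not yet read from the connection. */
    long bytesRemaining() {
        return contentLength - pos;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        if (bytesRemaining() > 0) {
            // Abort: drop the connection instead of draining it, so a
            // partial read of a large object does not download the rest.
            abortConnection();
        } else {
            // Fully consumed: a clean close lets the connection be reused.
            wrapped.close();
        }
    }

    private void abortConnection() throws IOException {
        // In a real client this would call HttpRequestBase.abort() (or the
        // equivalent) on the underlying request; here we just close.
        wrapped.close();
    }
}
```

The trade-off is the usual one: aborting sacrifices connection reuse, but draining a multi-gigabyte object on close is far more expensive than re-establishing a connection.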
> S3NInputStream.close() downloads the remaining bytes of the object from S3
> --------------------------------------------------------------------------
>
> Key: HADOOP-12376
> URL: https://issues.apache.org/jira/browse/HADOOP-12376
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.6.0, 2.7.1
> Reporter: Steve Loughran
> Assignee: Ajith S
>
> This is the same as HADOOP-11570, possibly the swift code has the same problem.
> Apparently (as raised on ASF lists), when you close an s3n input stream, it
> reads through the remainder of the file. This kills performance on partial reads of large files.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)