Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2016/05/25 09:14:12 UTC

[jira] [Commented] (HADOOP-13203) S3a: Consider reducing the number of connection aborts by setting correct length in s3 request

    [ https://issues.apache.org/jira/browse/HADOOP-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299721#comment-15299721 ] 

Steve Loughran commented on HADOOP-13203:
-----------------------------------------

So you are proposing a shorter block size for reads, on the basis that it allows follow-on GETs to reuse the same SSL connection?

How do you know how much to ask for? Or: how do you handle the end of the connection and then start reading the next block? Presumably the cost of that will be lower (reused connection and all), but the stream-reading code will need to recognise premature EOFs and react.
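
To make the question concrete, here is a minimal Java sketch of block-at-a-time reading, assuming a hypothetical RangeSource interface that stands in for a ranged GET. None of these names come from S3AInputStream or the AWS SDK. The reader tracks its own position, so when a block's stream ends before the end of the object (whether at the block boundary or prematurely) it requests the next range starting from the last byte it actually received.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a ranged GET; not the S3A or AWS SDK API. */
interface RangeSource {
    /** Opens bytes [start, endExclusive) of the object. */
    InputStream open(long start, long endExclusive) throws IOException;
}

/** In-memory source that counts how many ranged requests were issued. */
class InMemoryRangeSource implements RangeSource {
    private final byte[] data;
    int opens = 0;

    InMemoryRangeSource(byte[] data) {
        this.data = data;
    }

    @Override
    public InputStream open(long start, long endExclusive) {
        opens++;
        return new ByteArrayInputStream(data, (int) start, (int) (endExclusive - start));
    }
}

/** Reads an object one block at a time, reopening at the current position on EOF. */
class BlockedReader extends InputStream {
    private final RangeSource source;
    private final long contentLength;
    private final long blockSize;
    private InputStream current;
    private long pos = 0;

    BlockedReader(RangeSource source, long contentLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.source = source;
        this.contentLength = contentLength;
        this.blockSize = blockSize;
    }

    @Override
    public int read() throws IOException {
        while (pos < contentLength) {
            boolean freshlyOpened = false;
            if (current == null) {
                // Ask only for the next block, never past the end of the object.
                current = source.open(pos, Math.min(pos + blockSize, contentLength));
                freshlyOpened = true;
            }
            int b = current.read();
            if (b >= 0) {
                pos++;
                return b;
            }
            // EOF before the end of the object: the stream is fully drained,
            // so a plain close() suffices; loop round to request the next range.
            current.close();
            current = null;
            if (freshlyOpened) {
                // A new range that yields nothing would otherwise loop forever.
                throw new EOFException("No data returned at offset " + pos);
            }
        }
        return -1;
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

With a 10-byte object and a block size of 4, this sketch issues three ranged requests. How to pick the block size is exactly the open question above; the sketch just takes it as a constructor argument.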

> S3a: Consider reducing the number of connection aborts by setting correct length in s3 request
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13203
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13203
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Rajesh Balamohan
>            Priority: Minor
>
> Currently the file's "contentLength" is set as the "requestedStreamLen" when invoking S3AInputStream::reopen(). As part of lazySeek(), the stream sometimes has to be closed and reopened. But in many cases the stream is closed with abort(), which leaves the underlying HTTP connection unusable. This incurs a high connection-establishment cost in some jobs. It would be good to set the correct value for the stream length to avoid connection aborts.
> I will post the patch once the AWS tests pass on my machine.
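
As an illustration of the idea in the description (not the actual patch), the following Java sketch shows one way a reopen could bound the requested range, and how a close could then choose between draining and aborting. The class name, method names, and the drain limit are all hypothetical. The assumption is that a request with only a few unread bytes left can be drained cheaply, which leaves the HTTP connection reusable, while a request with a lot of unread data is cheaper to abort.

```java
/** Hypothetical helper; the names and the drain limit are illustrative only. */
final class RequestLength {
    private RequestLength() {
    }

    /**
     * End offset (exclusive) to request when reopening at targetPos for a
     * read of readLength bytes, capped at the object's length.
     */
    static long requestEnd(long targetPos, long readLength, long contentLength) {
        if (targetPos < 0 || readLength < 0 || targetPos > contentLength) {
            throw new IllegalArgumentException("invalid position or length");
        }
        // Compare against the remaining bytes first to avoid long overflow.
        long remaining = contentLength - targetPos;
        return targetPos + Math.min(readLength, remaining);
    }

    /**
     * True if more bytes remain unread in the current request than we are
     * willing to drain, in which case the connection would be aborted.
     */
    static boolean mustAbort(long pos, long requestEnd, long drainLimit) {
        return requestEnd - pos > drainLimit;
    }
}
```

For example, with a drain limit of 64 bytes, a request that ends at offset 150 and is closed at offset 120 leaves 30 bytes to drain, so no abort is needed. The same close against a request that runs to offset 1000 leaves 880 bytes unread and would abort.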



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
