Posted to common-issues@hadoop.apache.org by "Brahma Reddy Battula (Jira)" <ji...@apache.org> on 2020/04/09 18:51:04 UTC

[jira] [Updated] (HADOOP-16241) S3AInputStream PositionReadable should perform ranged read on dedicated stream

     [ https://issues.apache.org/jira/browse/HADOOP-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brahma Reddy Battula updated HADOOP-16241:
------------------------------------------
    Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues; please move this one back if it is a blocker.

> S3AInputStream PositionReadable should perform ranged read on dedicated stream 
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-16241
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16241
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: Impala-TPCDS-scans.zip, impala-web_returns-scan-flamegraph.svg
>
>
> The current implementation of {{PositionedReadable}} in {{S3AInputStream}} is pretty close to the default implementation in {{FSInputStream}}.
> This JIRA proposes overriding the {{read(long position, byte[] buffer, int offset, int length)}} method and re-implementing the {{readFully(long position, byte[] buffer, int offset, int length)}} method in S3A.
> The new implementation would perform a "ranged read" on a dedicated object stream (rather than the shared one). Prototypes have shown this to bring a considerable performance improvement to readers who are only interested in reading a random chunk of the file at a time (e.g. Impala, although I would assume HBase would benefit from this as well).
> Setting {{fs.s3a.experimental.input.fadvise}} to {{RANDOM}} is helpful for clients that rely on pread, but has a few drawbacks:
>  * Unless the client explicitly sets fadvise to RANDOM, they will get at least one connection reset when the backwards seek is issued (after which fadvise automatically switches to RANDOM)
>  * Data is only read in 64 KB chunks, so for a large read, several GET requests must be issued to S3 to fetch the data; while the 64 KB chunk size is configurable, it is hard to set a reasonable value for variable-length preads
>  * If the readahead value is too big, closing the input stream can take considerable time because the stream has to be drained of data before it can be closed
> The new implementation of {{PositionedReadable}} would issue a {{GetObjectRequest}} with the range specified by {{position}} and the size of the given buffer. The data would be read from the resulting {{S3ObjectInputStream}}, which would then be closed at the end of the method. This stream would be independent of the {{wrappedStream}} currently maintained by S3A.
> This brings the following benefits:
>  * The {{PositionedReadable}} methods can be thread-safe without a {{synchronized}} block, which allows clients to concurrently call pread methods on the same {{S3AInputStream}} instance
>  * preads will request all the data at once rather than requesting it in chunks via the readahead logic
>  * Avoids performing potentially expensive seeks when performing preads
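
The proposal above can be illustrated with a minimal sketch. This is not the S3A patch: Store and InMemoryStore are hypothetical stand-ins for S3 so that the example runs without the AWS SDK or Hadoop on the classpath, and the class and method names are made up for illustration. It only shows the shape of the idea: each positioned read opens its own stream over an inclusive byte range, reads from it, and closes it, without touching any shared stream state.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: Store and InMemoryStore are hypothetical stand-ins
// for S3, not Hadoop or AWS SDK classes.
class RangedPreadSketch {

    /** Hypothetical object store that opens a new stream over an inclusive byte range. */
    interface Store {
        long contentLength();
        InputStream openRange(long start, long endInclusive) throws IOException;
    }

    /** In-memory store so the sketch runs without any network access. */
    static final class InMemoryStore implements Store {
        private final byte[] data;
        InMemoryStore(byte[] data) { this.data = data.clone(); }
        @Override public long contentLength() { return data.length; }
        @Override public InputStream openRange(long start, long endInclusive) {
            return new ByteArrayInputStream(data, (int) start, (int) (endInclusive - start + 1));
        }
    }

    /** HTTP Range header value for length > 0 bytes starting at position (end is inclusive). */
    static String rangeHeader(long position, int length) {
        return "bytes=" + position + "-" + (position + length - 1);
    }

    /** Positioned read on a dedicated stream; returns bytes read, or -1 at or past EOF. */
    static int read(Store store, long position, byte[] buffer, int offset, int length)
            throws IOException {
        if (position < 0) {
            throw new EOFException("negative position: " + position);
        }
        if (offset < 0 || length < 0 || length > buffer.length - offset) {
            throw new IndexOutOfBoundsException("bad offset/length");
        }
        if (length == 0) {
            return 0;
        }
        long size = store.contentLength();
        if (position >= size) {
            return -1;
        }
        // Clamp the range to the object size; the end offset is inclusive.
        long end = Math.min(position + length, size) - 1;
        int want = (int) (end - position + 1);
        // The stream is local to this call and is closed before returning.
        try (InputStream in = store.openRange(position, end)) {
            int total = 0;
            while (total < want) {
                int n = in.read(buffer, offset + total, want - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total;
        }
    }

    /** Like read(), but fails with EOFException if fewer than length bytes are available. */
    static void readFully(Store store, long position, byte[] buffer, int offset, int length)
            throws IOException {
        int n = read(store, position, buffer, offset, length);
        if (n < length) {
            throw new EOFException("wanted " + length + " bytes at " + position + ", got " + n);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] object = "0123456789abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.US_ASCII);
        Store store = new InMemoryStore(object);
        byte[] buf = new byte[6];
        readFully(store, 10, buf, 0, buf.length);
        System.out.println(new String(buf, StandardCharsets.US_ASCII)); // prints "abcdef"
        System.out.println(rangeHeader(10, buf.length));                // prints "bytes=10-15"
    }
}
```

Because read() keeps all of its state in local variables and opens a fresh stream per call, concurrent callers in this sketch need no synchronized block, which is the first benefit listed above. A real implementation would additionally have to handle retries, statistics, and stream abort versus drain on close, none of which is modelled here.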



--
This message was sent by Atlassian Jira
(v8.3.4#803005)