Posted to user@drill.apache.org by Charles Givre <cg...@gmail.com> on 2017/11/05 01:29:34 UTC

S3 Performance

Hello everyone, 
I’m experimenting with Drill on S3 and I’ve been pretty disappointed with the performance. I’m curious what kind of performance I can expect, and what can be done to improve performance on S3. My current setup is Drill running in embedded mode against a corporate S3 bucket.
Thanks,
— C

Re: S3 Performance

Posted by Padma Penumarthy <pp...@mapr.com>.
Hi Uwe,

This is a lot of good information. We should document it in a JIRA.

BTW, I just checked, and apparently Hadoop 2.8.2 was released recently; they claim it is the first GA release of the 2.8 line.
I think we can attempt the move to Hadoop 2.8.2 after Drill 1.12 is released.
Yes, some unit tests were failing the last time I tried 2.8.1, but I think we can fix them.

Thanks
Padma

Re: S3 Performance

Posted by "Uwe L. Korn" <uw...@xhochy.com>.
Hello Charles, 

I ran into the same performance issues some time ago and made a few
discoveries:

 * Drill is good at pulling only the byte ranges it needs out of the
 file system. Sadly, s3a in Hadoop 2.7 translates a request for the
 byte range (x, y) into an HTTP request to S3 for the byte range
 (x, end-of-file). For Parquet, this means that each read of a column
 chunk in a row group runs from the beginning of that column chunk to
 the end of the file. Overall, this amounted to traffic of 10-20x the
 size of the actual file for me.
 * Hadoop 2.8/3.0 introduces a new experimental S3 random-access mode
 that really improves performance, as it only sends requests for
 (x, y + readahead.range) to S3. You can activate it with
 fs.s3a.experimental.input.fadvise=random (see the sketch after this
 list).
 * I played a bit with fs.s3a.readahead.range, the optimistic extra
 range included in each request, but found that I could keep it at its
 default of 65536 bytes: Drill often requests all the bytes it needs
 at once, so reading ahead did not improve the situation.
 * This random-access mode plays well with Parquet files but sadly
 slowed down reading the metadata cache drastically, as only requests
 of size 65540 bytes were made to S3. Therefore I had to add
 is.setReadahead(filesize); after
 https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java#L593
 to ensure that the metadata cache is read from S3 in one go (also
 sketched below).
 * Also,
 https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java#L662
 seems to have always been true in my case, causing a refresh of the
 cache on every query. As I had quite a big dataset, this added a
 large constant cost to every query. This might simply be because S3
 does not have the concept of directories. I have not dug deeper into
 this, but as a dirty workaround I made it so that once the cache
 exists, it is never updated automatically (sketched below as well).
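
For concreteness, here is a minimal, hedged sketch of applying those
s3a settings through the Hadoop client API. The property names are the
real Hadoop ones; the bucket name is a placeholder, and in a Drill
deployment you would more likely put these properties into the
core-site.xml under Drill's conf directory rather than set them in code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import java.net.URI;

    public class S3aRandomAccess {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hadoop 2.8+: send bounded range requests (x, y + readahead.range)
        // instead of the 2.7 behaviour of requesting (x, end-of-file).
        conf.set("fs.s3a.experimental.input.fadvise", "random");
        // Optimistic extra bytes per request; the default was fine for me.
        conf.set("fs.s3a.readahead.range", "65536");
        // "my-bucket" is a placeholder for your bucket.
        FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);
        System.out.println("Connected to " + fs.getUri());
      }
    }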

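And to make the two Metadata.java changes concrete, here is roughly the
shape they take. This is a sketch against the 1.11-era sources, not an
exact patch: is, filesize, and cachePath mirror the code around the
links above, while tableDir and readCachedMetadata are hypothetical
stand-ins for the surrounding logic:

    // After opening the metadata cache file (around Metadata.java#L593).
    // Without this, fadvise=random reads the cache in ~64 KB chunks;
    // with it, the whole cache file comes back in one request.
    FSDataInputStream is = fs.open(cachePath);
    long filesize = fs.getFileStatus(cachePath).getLen();
    is.setReadahead(filesize);

    // Dirty workaround for the always-true check around Metadata.java#L662:
    // treat an existing cache file as up to date instead of comparing
    // directory modification times, which never matched on S3.
    Path cacheFile = new Path(tableDir, ".drill.parquet_metadata");  // tableDir: hypothetical
    if (fs.exists(cacheFile)) {
      return readCachedMetadata(cacheFile);  // hypothetical helper
    }
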
Locally, I have made my own Drill build based on the Hadoop 2.8
libraries. Sadly, some unit tests failed, but at least for the S3
testing everything seems to work. This work is still based on the 1.11
release sources, and some code has changed since then. I will have some
time in the next days/weeks to look at this again and might open some
PRs (don't expect me to be the one to open the Hadoop upgrade PR; I'm a
full-time Python dev, so this is a bit out of my comfort zone :D). At
least in my basic tests, this resulted in quite a performant setup (in
both embedded and distributed mode).

Cheers
Uwe