Posted to user@drill.apache.org by Stephan Kölle <st...@gmail.com> on 2016/01/26 11:05:30 UTC

How to keep s3 data in memory with apache drill?

Querying JSON data stored on AWS S3 with Apache Drill works great, but
Drill fetches the data fresh from S3 for every query.

How can I tell Drill to keep the data in memory for the next query?

I got Tachyon about 90% working with Drill (using the information available
on this list), but "SHOW FILES" on the S3-backed Tachyon shows only the
Tachyon "helper" files on S3 and not the real data files.

Has anyone got Tachyon fully functional with Drill? Are there better ways to
prevent reloading from S3 (which takes most of the query time)?

Re: How to keep s3 data in memory with apache drill?

Posted by Stefán Baxter <st...@activitystream.com>.
Hi,

I think the latest version of Tachyon uses a transparent storage structure.

Regards,
 -Stefán

