Posted to user@drill.apache.org by Joshua Schlesser <jo...@cursivelabs.com> on 2015/12/21 19:10:27 UTC

dirX variables don't always work with the s3a data source on Drill 1.3

I am querying JSON files on S3 using the s3a storage plugin on Drill 1.3.

The following query works fine:
select  count(i0) from    fh.`2015/12/05` where dir0 = '23';

This next query doesn't:
select  count(i0) from    fh.`2015/12/` where dir0 = '05' and dir1 = '23';

My understanding is that the two queries should be equivalent.

Ultimately I am working towards the following scenario:
select  count(i0)
from    fh.`2015/12`
where   ((dir0 = '05' and dir1 = '23') or (dir0 = '06' and dir1 = '00'));

My files are organized in such a way that the time of the data in the files isn't perfectly aligned with the time / location of the files themselves. For reference, I'm using AWS Firehose, in case anybody else is using that too.
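
To make the scenario above concrete, here is a rough sketch of how the dir0/dir1 predicate for a window of adjacent hour directories could be generated instead of written by hand. It assumes the query is rooted at the month directory (so dir0 is the day and dir1 is the hour, both zero-padded) and that the window stays within one month. The helper name dir_predicate is made up for illustration and is not part of Drill.

```python
from datetime import datetime, timedelta


def dir_predicate(start, end):
    """Build a WHERE fragment matching every hour directory from start to end, inclusive.

    Assumption: the table path is the month directory, e.g. fh.`2015/12`,
    so dir0 holds the day and dir1 holds the hour.
    """
    clauses = []
    # Truncate to the hour, since directories are per hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    while current <= end:
        clauses.append(
            "(dir0 = '{:02d}' and dir1 = '{:02d}')".format(current.day, current.hour)
        )
        current += timedelta(hours=1)
    return "(" + " or ".join(clauses) + ")"


# The window from the scenario: 23:00 on the 5th through 00:00 on the 6th.
predicate = dir_predicate(datetime(2015, 12, 5, 23), datetime(2015, 12, 6, 0))
query = "select count(i0) from fh.`2015/12` where " + predicate
print(query)
```

This only builds the query text; whether Drill evaluates the resulting dir0/dir1 filter correctly on s3a is exactly what I am unsure about.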

Has anybody run into either of these problems and come up with a good way to query subsets of files in adjacent directories?

Cheers,
Josh Schlesser