Posted to user@drill.apache.org by Prabhakar Bhosaale <bh...@gmail.com> on 2020/02/08 05:55:46 UTC

Re: Querying json files from multiple subdirectories

Hi Charles,
Another option I found for querying subdirectories is to use the directory
tree notation, such as dir0 and dir1. So I will be using

select * from transactions where dir0=2012 or
select * from transactions where dir0 in (2012, 2014, 2016) or
select * from transactions where dir0 between 2012 and 2016
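
As a minimal sketch of how this could fit the join from the original question:
dir0 holds the subdirectory name as a string, so with folders named Year2012,
Year2013, and so on, the filter values would be strings such as 'Year2012'
rather than numbers. The path /data/transactions, the second file
customers.json, and the cust_id and cust_name columns are hypothetical
placeholders, not taken from this thread.

```sql
-- Sketch only: the paths and column names below are assumptions.
-- The inner query reads trans.json from the selected year folders as one table.
SELECT t.*, c.cust_name
FROM (
  SELECT *
  FROM dfs.`/data/transactions`
  -- dir0 is the first-level subdirectory name, compared as a string
  WHERE dir0 IN ('Year2012', 'Year2013')
) AS t
-- Join the combined transactions with a second (hypothetical) JSON file
JOIN dfs.`/data/customers.json` AS c
  ON t.cust_id = c.cust_id;
```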

I hope this helps others as well. Thanks for your replies.

Regards
Prabhakar

On Sun, Jan 19, 2020 at 12:21 PM Prabhakar Bhosaale <bh...@gmail.com>
wrote:

> Thanks, Charles. I will try a few options and get back to you.
>
> Regards
> Prabhakar
>
> On Sun, Jan 19, 2020, 04:45 Charles Givre <cg...@gmail.com> wrote:
>
>> Hi Prabhakar,
>> You'll need to find some common identifier for the files you want to
>> query.
>> It could be something like:
>>
>> SELECT *
>> FROM dfs.`<path>/Year*/`
>>
>> Alternatively, you could have multiple SELECT queries and join them
>> together via a UNION statement. For example:
>>
>> SELECT * FROM dfs.`Year2013/trans.json`
>> UNION
>> SELECT * FROM dfs.`Year2014/trans.json`
>>
>>
>>
>> -- C
>>
>> > On Jan 17, 2020, at 11:07 PM, Prabhakar Bhosaale <bh...@gmail.com>
>> wrote:
>> >
>> > Hi Charles,
>> > Thanks for your suggestion. Actually, the transactions folder will have
>> > more year-wise folders, but I want to query only a few folders at a time. The
>> >
>> > Regards
>> > Prabhakar
>> >
>> > On Fri, Jan 17, 2020, 20:01 Charles Givre <cg...@gmail.com> wrote:
>> >
>> >> Hi there,
>> >> If you have that directory structure, the following query should work:
>> >>
>> >> SELECT *
>> >> FROM dfs.<workspace>.`transactions/` as t1
>> >>
>> >> Obviously replacing <workspace> with your workspace.  You can then join
>> >> that with anything that Drill can query.
>> >> Best,
>> >> -- C
>> >>
>> >>
>> >>
>> >>> On Jan 17, 2020, at 1:27 AM, Prabhakar Bhosaale <
>> bhosale.p.v@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi All,
>> >>>
>> >>> I am new to Apache Drill and am trying to retrieve data from JSON files
>> >>> by querying the directories.
>> >>>
>> >>> The directory structure is
>> >>>
>> >>>                |------>Year2012--->trans.json
>> >>>                |
>> >>> transactions-->|
>> >>>                |
>> >>>                |------>Year2013--->trans.json
>> >>>
>> >>> I would like to query trans.json from both subdirectories as one table
>> >>> and then join the resulting table with another table in a single query.
>> >>> Please help with possible options. Thanks.
>> >>>
>> >>> Regards
>> >>
>> >>
>>
>>