Posted to user@drill.apache.org by Prabhakar Bhosaale <bh...@gmail.com> on 2020/01/14 09:56:29 UTC

querying json from multiple subdirectories

Hi All,

I am new to Apache Drill and am trying to retrieve data from JSON files by
querying the directories that contain them.

The directory structure is

transactions-->|------>Year2012--->trans.json
               |
               |------>Year2013--->trans.json
I would like to query trans.json from both the sub-directories as one table
and then join the resultant table with another table in a single query.
Please help with possible options. thx

Regards
Prabhakar

Re: querying json from multiple subdirectories

Posted by Charles Givre <cg...@gmail.com>.
Hi Prabhakar, 
I would think that the following query would work:

SELECT <fields>
FROM dfs.<workspace>.`transactions/`

That should merge the files from all subdirectories into one table, and you should get a dir0 column holding the subdirectory name (Year2012 or Year2013) for each row.

--C
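
For illustration, here is a minimal sketch of how the directory query above could be combined with a join in a single statement. The workspace name dfs.data, the fields customer_id, amount and customer_name, and the second file customers.json are all hypothetical placeholders that do not come from this thread; substitute your own workspace, columns and table.

```sql
-- Sketch only. Hypothetical names: workspace dfs.data, fields customer_id,
-- amount, customer_name, and the second table customers.json.
SELECT t.dir0 AS year_dir,      -- subdirectory name, e.g. Year2012 or Year2013
       t.customer_id,
       t.amount,
       c.customer_name
FROM dfs.data.`transactions` AS t          -- reads trans.json from every subdirectory
JOIN dfs.data.`customers.json` AS c        -- the other table to join with
  ON t.customer_id = c.customer_id
WHERE t.dir0 IN ('Year2012', 'Year2013')   -- optional: restrict to specific subdirectories
```

Filtering on dir0 is optional; without the WHERE clause the query covers every subdirectory under transactions.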


Re: querying json from multiple subdirectories

Posted by Arina Yelchiyeva <ar...@gmail.com>.
Hi, 

Drill can query directories, including their subdirectories, and join the resulting data with other directories, tables, etc.
Please refer to Drill documentation for more details.
For example, you can start from this article: 
https://drill.apache.org/docs/querying-directories/

Kind regards,
Arina
