Posted to user@drill.apache.org by Nicolas Paris <ni...@gmail.com> on 2015/12/28 17:32:34 UTC

Drill & Remote database Question

Hello Drill users,

I have several kinds of remote databases (PostgreSQL, MongoDB, ...).
I have tested querying them through Drill.
It appears that Drill first ingests the data from the sources and then runs the query on it.

For example:

CREATE TABLE dfs.tmp.test AS
SELECT * FROM postgresql.schem.table1 t1
JOIN  postgresql.schem.table2 t2
ON t1.id = t2.id

This leads to:
1) Copy table1 to Drill
2) Copy table2 to Drill
3) Join in Drill
4) Result

This can be problematic if the tables are huge.

Is there a way to make the remote database perform the join?
I could create a view in PostgreSQL/MongoDB and query it; is that the
only solution?
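
For reference, the view workaround could look roughly like the sketch below. The view name table1_table2 and the columns col_a and col_b are hypothetical (only id appears in the query above), and it assumes the PostgreSQL storage plugin exposes views the same way it exposes tables.

```sql
-- Step 1: run this in PostgreSQL, not in Drill.
-- table1_table2, col_a and col_b are hypothetical names for illustration.
-- Columns are listed explicitly because a view cannot expose two
-- columns that are both named id.
CREATE VIEW schem.table1_table2 AS
SELECT t1.id, t1.col_a, t2.col_b
FROM schem.table1 t1
JOIN schem.table2 t2
  ON t1.id = t2.id;

-- Step 2: run this in Drill. The join is already done inside PostgreSQL,
-- so Drill only reads the joined rows.
CREATE TABLE dfs.tmp.test AS
SELECT * FROM postgresql.schem.table1_table2;
```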

Thanks in advance,

Re: Drill & Remote database Question

Posted by Tomer Shiran <ts...@dremio.com>.
Drill should push the join down into the database. Is that the actual
query? Can you provide the full query and the query profile?
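
One way to check where the join runs is to ask Drill for the plan without executing the query. The sketch below reuses the table names from the original message and assumes the storage plugin is registered as postgresql.

```sql
-- Show the plan only; nothing is copied or written.
EXPLAIN PLAN FOR
SELECT *
FROM postgresql.schem.table1 t1
JOIN postgresql.schem.table2 t2
  ON t1.id = t2.id;
```

If the join is pushed down, the plan should show a single scan against the database whose SQL text already contains the JOIN. Two separate scans feeding a join operator inside Drill would suggest that both tables are being pulled into Drill first. The full query profile for an executed query is available under Profiles in the Drill web UI (port 8047 by default).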

Thanks



-- 
Tomer Shiran
CEO and Co-Founder, Dremio