Posted to dev@drill.apache.org by imbar marinescu <im...@gmail.com> on 2016/08/11 16:22:41 UTC
Performance question
Hi,
I'm looking into Drill, to use it as an in-memory DB.
I wanted to handle data that I have in a SQL Server DB.
I connected with a SQL Server JDBC plugin, and my test query ran for
about 2 sec.
When run directly in SQL Server it took 0.15 sec.
I ran a "create table" to write the data as Parquet files and then queried
them with the dfs plugin.
The query ran for 0.5 sec (after caching; the first run took about 3 sec).
I also tried "REFRESH TABLE METADATA", but it didn't change anything.
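For reference, the Parquet steps described above can be sketched roughly as follows. This is only an illustration: the plugin, schema, and table names `sqlserver.dbo.Facts` and `sqlserver.dbo.Product` are placeholders, since the actual source names are not given, and `store.format` is normally `parquet` already, so the first statement just makes that assumption explicit.

```sql
-- Make sure CTAS writes Parquet (this is the default setting).
ALTER SESSION SET `store.format` = 'parquet';

-- Copy both tables into the writable dfs.tmp workspace as Parquet.
-- sqlserver.dbo.* are hypothetical names for the JDBC plugin tables.
CREATE TABLE dfs.tmp.`/Demo/Facts/` AS
SELECT * FROM sqlserver.dbo.Facts;

CREATE TABLE dfs.tmp.`/Demo/Product/` AS
SELECT * FROM sqlserver.dbo.Product;

-- Build the Parquet metadata cache for the larger table.
REFRESH TABLE METADATA dfs.tmp.`/Demo/Facts/`;
```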
My Test query is:
select sum(f.Sales), p.`Product Category`
from dfs.tmp.`/Demo/Facts/` f
join dfs.tmp.`/Demo/Product/` p on p.productKey = f.productKey
group by p.`Product Category`;
Facts table has 422,833 rows, product has 606.
The result set is 4 rows.
This was done running Drill locally (embedded) on a Windows machine.
I tried a Linux machine, but the results were even slower.
I didn't configure anything, just used the install as-is.
Am I doing something wrong? Is an RDBMS going to be faster anyway?
I read about Drill's performance and I feel I'm not getting there.
SqlServer: 0.15 sec.
SqlServer in drill: 2 sec.
Parquet in drill: 0.5 sec.
Thank you,
Imbar
Re: Performance question
Posted by Zelaine Fong <zf...@maprtech.com>.
What does the query plan look like when you're using SqlServer with Drill?
I'm guessing that the join isn't being pushed down to SqlServer. If so,
you've hit DRILL-4818. There are known limitations with the JDBC storage
plugin that prevent it from generating the optimal query plan in cases like
this.
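As a rough sketch, the plan can be obtained by prefixing the query with EXPLAIN PLAN FOR. The names `sqlserver.dbo.Facts` and `sqlserver.dbo.Product` below are assumptions standing in for whatever the JDBC storage plugin and tables are actually called.

```sql
-- Show the plan Drill generates for the query against the JDBC plugin.
-- sqlserver.dbo.* are hypothetical plugin/schema/table names.
EXPLAIN PLAN FOR
select sum(f.Sales), p.`Product Category`
from sqlserver.dbo.Facts f
join sqlserver.dbo.Product p on p.productKey = f.productKey
group by p.`Product Category`;
```

If the output shows two separate JDBC scans with a Drill join and aggregate above them, the join is being executed in Drill rather than in SqlServer, which means all 422,833 fact rows are fetched over JDBC before being joined.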
-- Zelaine
Re: Performance question
Posted by imbar marinescu <im...@gmail.com>.
I also checked on Microsoft Tabular, and the same query came back within
0.01 sec.
That is amazing!