You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@drill.apache.org by Marc Sole Fonte <ms...@iti.es> on 2020/06/02 05:26:02 UTC

Re: Excesive Execution Time querying RDBMS

Hello,

I am updating this as I have still not received any answer. I hope you can help me.

Thank you very much,
Marc

On Mon, 2020-05-25 at 14:39 +0200, Marc Solé wrote:
Hello,

So we finally updated to 1.17.0 and it improved a lot our times but a new problem appeared. It looks like it is related to limits.

If we execute a simple query it now takes a lot of less time. It is acceptable and it is not a problem anymore. However, if you use a limit in that query (+4) it starts to take a lot of execution time. It is just that weird. It looks like if we use big limits (like 10000) then times are acceptable again. Here is a table:

Time    User    Query   State   Duration        Foreman
05/25/2020 11:16:50     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 10000<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213456dd-07b6-fa01-e2bd-386cf5a1bc4c>     Succeeded       1.990 sec       6669a55236df
05/25/2020 11:15:31     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 4<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/2134572c-12f7-3644-8391-cde6ebd438c6> Succeeded       29.999 sec      6669a55236df
05/25/2020 11:15:22     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 3<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345735-1122-e653-d232-6b1e3903eab5> Succeeded       1.332 sec       6669a55236df
05/25/2020 11:14:41     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 4<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/2134575d-887b-1202-1a13-1eb8cf3e6a8e> Succeeded       29.612 sec      6669a55236df
05/25/2020 11:13:50     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 5<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345790-f4bb-4a51-14b5-c650357b5a7f> Succeeded       30.446 sec      6669a55236df
05/25/2020 11:12:58     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 7<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213457c5-7d3f-4807-f89a-16a81ed80e17> Succeeded       30.343 sec      6669a55236df
05/25/2020 11:12:43     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 3<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213457d3-c44d-9c9f-df65-8457436b721a> Succeeded       0.519 sec       6669a55236df
05/25/2020 11:12:23     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 2<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213457e7-d713-7e98-c828-6aa9d765e1ae> Succeeded       0.527 sec       6669a55236df
05/25/2020 11:11:55     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 1<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345803-c381-e188-d4a5-561263ee27c0> Succeeded       0.622 sec       6669a55236df
05/25/2020 11:11:10     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 10<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345830-807e-2fbf-2967-039bb73cabdd>        Succeeded       32.111 sec      6669a55236df

It just does not make sense for us. When duration is big we are talking about execution time. Database works correctly if accedesed from without making use of Drill.

Any idea about how to solve it?

Thank you again,
Marc

On Thu, 2020-05-14 at 14:39 +0000, Marc Sole Fonte wrote:
Hello,

Still unable to answer to your asnwers. I think I already fixed it for the following ones.

I am going to try to use 1.17.0, it is not exactly easy for us, as it is part of a bigger project, but I can try.

Anything else I can have in mind to check? I have seen there are a lot of tunning options.

Thank you,
Marc

On Thu, 2020-05-14 at 15:39 +0200, Marc Solé wrote:
Hello,

I have seen an answer but I am not able to answer it directly. The Drill version I am using is 1.16.0.

The transcript for the images is more or less the following one:

Planning Queued Execution Total
0.219 sec 0.070 sec 15.166 sec 15.455 sec
0.186 sec 0.036 sec 16.944 sec 17.166 sec

If you need more data, feel free to ask for it. Thank you for your help.

Marc


On Thu, 2020-05-14 at 12:50 +0200, Marc Solé wrote:
Hello,

First of all, I hope you and your families are all safe and healthy. Also, thank you in advance for your support.

I have a problem with Apache Drill conected to CockroachDB (PostgreSQL) using the RDBMS Storage Plugin. It works, but it is extremely slow. My experience with Drill has showed me a lot of scenarios where the first query is slow, due to planification, but the next ones are acceptable, assuming the time consumed by the man in the middle. However, this does not seem normal at all.

Running a small and simple query like this one: SELECT * FROM db_scott_centria.public.centria_raw LIMIT 1, execution times are completely over the limit. It takes about 15 seconds of Execution Time. Both Apache Drill and the database are in the same network, and a query from terminal from the same takes only some milliseconds. Where is the problem? What can I do to improve that time?

Here are some pictures with examples:

[cid:97eb2ea0977350cc3ca8583db031c2da9c332c32.camel@iti.es]

[cid:75d46626b4961aace77f136325391a8fb7002d04.camel@iti.es]

Thank you very much for your help, I hope you can help me.

Marc

Re: Excesive Execution Time querying RDBMS

Posted by Charles Givre <cg...@gmail.com>.
Hi Marc, 
I'm wondering if the limit is not being pushed down to the database or there is something wrong with the mechanism for that.  

A few questions:
1.  What database are you querying?
2.  Can you run the following queries and share the results:

EXPLAIN PLAN FOR SELECT * FROM db_scott_centria.public.centria_raw LIMIT 4 (or any small number)

EXPLAIN PLAN FOR SELECT * FROM db_scott_centria.public.centria_raw LIMIT 10000 (or any large number)

Also, if you have access to the query logs from the database, can share what query Drill actually is transmitting to the database?  

Thanks,
-- C


> On Jun 2, 2020, at 1:26 AM, Marc Sole Fonte <ms...@iti.es> wrote:
> 
> Hello,
> 
> I am updating this as I have still not received any answer. I hope you can help me.
> 
> Thank you very much,
> Marc
> 
> On Mon, 2020-05-25 at 14:39 +0200, Marc Solé wrote:
> Hello,
> 
> So we finally updated to 1.17.0 and it improved a lot our times but a new problem appeared. It looks like it is related to limits.
> 
> If we execute a simple query it now takes a lot of less time. It is acceptable and it is not a problem anymore. However, if you use a limit in that query (+4) it starts to take a lot of execution time. It is just that weird. It looks like if we use big limits (like 10000) then times are acceptable again. Here is a table:
> 
> Time    User    Query   State   Duration        Foreman
> 05/25/2020 11:16:50     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 10000<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213456dd-07b6-fa01-e2bd-386cf5a1bc4c>     Succeeded       1.990 sec       6669a55236df
> 05/25/2020 11:15:31     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 4<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/2134572c-12f7-3644-8391-cde6ebd438c6> Succeeded       29.999 sec      6669a55236df
> 05/25/2020 11:15:22     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 3<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345735-1122-e653-d232-6b1e3903eab5> Succeeded       1.332 sec       6669a55236df
> 05/25/2020 11:14:41     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 4<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/2134575d-887b-1202-1a13-1eb8cf3e6a8e> Succeeded       29.612 sec      6669a55236df
> 05/25/2020 11:13:50     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 5<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345790-f4bb-4a51-14b5-c650357b5a7f> Succeeded       30.446 sec      6669a55236df
> 05/25/2020 11:12:58     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 7<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213457c5-7d3f-4807-f89a-16a81ed80e17> Succeeded       30.343 sec      6669a55236df
> 05/25/2020 11:12:43     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 3<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213457d3-c44d-9c9f-df65-8457436b721a> Succeeded       0.519 sec       6669a55236df
> 05/25/2020 11:12:23     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 2<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/213457e7-d713-7e98-c828-6aa9d765e1ae> Succeeded       0.527 sec       6669a55236df
> 05/25/2020 11:11:55     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 1<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345803-c381-e188-d4a5-561263ee27c0> Succeeded       0.622 sec       6669a55236df
> 05/25/2020 11:11:10     anonymous       SELECT * FROM db_scott_centria.public.centria_raw LIMIT 10<https://prod-2-4-drill-api-1589977009806.itiprod.slap53.iti.es/profiles/21345830-807e-2fbf-2967-039bb73cabdd>        Succeeded       32.111 sec      6669a55236df
> 
> It just does not make sense for us. When duration is big we are talking about execution time. Database works correctly if accedesed from without making use of Drill.
> 
> Any idea about how to solve it?
> 
> Thank you again,
> Marc
> 
> On Thu, 2020-05-14 at 14:39 +0000, Marc Sole Fonte wrote:
> Hello,
> 
> Still unable to answer to your asnwers. I think I already fixed it for the following ones.
> 
> I am going to try to use 1.17.0, it is not exactly easy for us, as it is part of a bigger project, but I can try.
> 
> Anything else I can have in mind to check? I have seen there are a lot of tunning options.
> 
> Thank you,
> Marc
> 
> On Thu, 2020-05-14 at 15:39 +0200, Marc Solé wrote:
> Hello,
> 
> I have seen an answer but I am not able to answer it directly. The Drill version I am using is 1.16.0.
> 
> The transcript for the images is more or less the following one:
> 
> Planning Queued Execution Total
> 0.219 sec 0.070 sec 15.166 sec 15.455 sec
> 0.186 sec 0.036 sec 16.944 sec 17.166 sec
> 
> If you need more data, feel free to ask for it. Thank you for your help.
> 
> Marc
> 
> 
> On Thu, 2020-05-14 at 12:50 +0200, Marc Solé wrote:
> Hello,
> 
> First of all, I hope you and your families are all safe and healthy. Also, thank you in advance for your support.
> 
> I have a problem with Apache Drill conected to CockroachDB (PostgreSQL) using the RDBMS Storage Plugin. It works, but it is extremely slow. My experience with Drill has showed me a lot of scenarios where the first query is slow, due to planification, but the next ones are acceptable, assuming the time consumed by the man in the middle. However, this does not seem normal at all.
> 
> Running a small and simple query like this one: SELECT * FROM db_scott_centria.public.centria_raw LIMIT 1, execution times are completely over the limit. It takes about 15 seconds of Execution Time. Both Apache Drill and the database are in the same network, and a query from terminal from the same takes only some milliseconds. Where is the problem? What can I do to improve that time?
> 
> Here are some pictures with examples:
> 
> [cid:97eb2ea0977350cc3ca8583db031c2da9c332c32.camel@iti.es]
> 
> [cid:75d46626b4961aace77f136325391a8fb7002d04.camel@iti.es]
> 
> Thank you very much for your help, I hope you can help me.
> 
> Marc