Posted to user@drill.apache.org by Aditya Allamraju <ad...@gmail.com> on 2020/02/05 20:41:37 UTC

Drill Fragment setup time

Team,

Is there a way to reduce the "setup time" for a minor fragment?
In my case, it's Drill on a MapR-DB JSON table.

As per the documentation, it is the time consumed for "runtime code
generation and opening a file".
While going through a query profile, I see the following:

Minor Fragment  Hostname   Setup Time  Process Time  Wait Time  Max Batches  Max Records  Peak Memory
07-00-03        hostA.com  *6.242s*    0.384s        0.000s     3            10,235       7MB
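
To put the row above in proportion, here is a minimal Python sketch that splits the profile row into fields and computes how much of the fragment's reported time is setup. It assumes the row is pasted as whitespace-separated text exactly as shown; the helper names `parse_duration` and `parse_fragment_row` are hypothetical and not part of Drill.

```python
def parse_duration(text):
    """Convert a profile duration such as '6.242s' or '*6.242s*' to seconds."""
    return float(text.strip("*").rstrip("s"))


def parse_fragment_row(row):
    """Split one whitespace-separated minor-fragment row into named fields."""
    fragment, host, setup, process, wait, batches, records, memory = row.split()
    return {
        "fragment": fragment,
        "host": host,
        "setup_s": parse_duration(setup),
        "process_s": parse_duration(process),
        "wait_s": parse_duration(wait),
        "max_batches": int(batches),
        "max_records": int(records.replace(",", "")),
        "peak_memory": memory,
    }


# Row copied from the query profile above.
row = parse_fragment_row("07-00-03 hostA.com *6.242s* 0.384s 0.000s 3 10,235 7MB")

# Share of setup within setup + process + wait for this one fragment.
reported = row["setup_s"] + row["process_s"] + row["wait_s"]
setup_share = row["setup_s"] / reported
print(f"setup share of fragment time: {setup_share:.1%}")
```

For this fragment, setup accounts for roughly 94% of the reported time, which is why the 6.242s figure stands out against 0.384s of processing.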

Thanks
Aditya

Re: Drill Fragment setup time

Posted by Aditya Allamraju <ad...@gmail.com>.
Hi Paul,

The total query time itself is under 19 secs, which includes 1.9 secs of
planning time. But going through the profile to see where the time is
being spent, we saw 6 secs of "setup time", as shown above.

How different is "setup time" from planning time?

Duration
<https://10.10.72.204:8047/profiles/21cd5153-ab56-ed96-1b7e-4fa99cf81687#query-profile-duration>

Planning   Queued     Execution   Total
1.903 sec  0.004 sec  16.858 sec  18.765 sec

To your question on how complex the query is, it's a 4-table join with a few
predicates to make sure indexes are picked.
But nearly 70 columns are selected from one of the tables. Not sure if
that matters.
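
The duration figures above can be cross-checked with a short Python sketch. It only uses numbers quoted in this thread. The comparison of fragment setup against the Execution phase rests on an assumption that fragment setup is counted inside Execution rather than Planning, and since minor fragments run in parallel, one fragment's setup time is not simply additive to the total.

```python
# Durations in seconds, copied from the profile summary above.
phases = {"planning": 1.903, "queued": 0.004, "execution": 16.858}
reported_total = 18.765

# Setup time of minor fragment 07-00-03 from the first message.
fragment_setup = 6.242

# Show each phase and its share of the reported total.
computed_total = sum(phases.values())
for name, seconds in phases.items():
    print(f"{name:<10} {seconds:>7.3f}s  {seconds / reported_total:6.1%}")
print(f"sum of phases: {computed_total:.3f}s (reported {reported_total:.3f}s)")

# Assumption: fragment setup happens during execution, not planning.
setup_vs_execution = fragment_setup / phases["execution"]
print(f"fragment setup vs. execution: {setup_vs_execution:.1%}")
```

The three phases add up to the reported total, and planning is only about a tenth of it, so the 6 secs in question is not explained by the planning figure.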

Thanks
Aditya

On Wed, Feb 5, 2020 at 1:55 PM Paul Rogers <pa...@yahoo.com.invalid>
wrote:

> Hi Aditya,
>
> While I cannot comment on MapR-DB in particular, I can say that, in
> general, Drill is designed for fairly large queries. There is a trade-off
> between the overhead of code gen and planning vs. the cost at runtime.
> Drill tends to invest more in up-front planning and code gen to minimize
> runtime costs.
>
> Of course, if your query scans just a few rows (MapR-DB has indexes), then
> Drill's trade-off might not work out as well as if Drill were scanning
> multiple GBs of data.
>
> That said, 6 seconds seems like a long time. In my experience, Drill can
> set up and execute queries in a few hundred ms. So, there are two possible
> sources of delay.
>
> First, how complex is the query? Simple queries should be very fast. If,
> however, you have a very large number of columns or GROUP BY keys, etc.,
> then we have occasionally seen longer planning times.
>
> The other possible delay would relate to interaction with MapR-DB. For
> that, MapR folks would have better insight and might offer ways of
> identifying and resolving any issues.
>
> Thanks,
> - Paul

Re: Drill Fragment setup time

Posted by Paul Rogers <pa...@yahoo.com.INVALID>.
Hi Aditya,

While I cannot comment on MapR-DB in particular, I can say that, in general, Drill is designed for fairly large queries. There is a trade-off between the overhead of code gen and planning vs. the cost at runtime. Drill tends to invest more in up-front planning and code gen to minimize runtime costs.

Of course, if your query scans just a few rows (MapR-DB has indexes), then Drill's trade-off might not work out as well as if Drill were scanning multiple GBs of data.

That said, 6 seconds seems like a long time. In my experience, Drill can set up and execute queries in a few hundred ms. So, there are two possible sources of delay.

First, how complex is the query? Simple queries should be very fast. If, however, you have a very large number of columns or GROUP BY keys, etc., then we have occasionally seen longer planning times.

The other possible delay would relate to interaction with MapR-DB. For that, MapR folks would have better insight and might offer ways of identifying and resolving any issues.

Thanks,
- Paul

 
