Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/02/11 01:58:42 UTC

[jira] [Commented] (DRILL-5195) Publish Operator and MajorFragment Stats in Profile page

    [ https://issues.apache.org/jira/browse/DRILL-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862152#comment-15862152 ] 

Paul Rogers commented on DRILL-5195:
------------------------------------

I like the idea. However, "busy" is probably not quite the right term. Because of Drill's Volcano-style structure, each fragment will be, in aggregate, up to 100% "busy," but each operator will be "busy" for only some slice of that percentage. (Operators within a fragment run sequentially, not in parallel.)

I've found it useful to display the percent of time an operator takes within its fragment. Maybe:

* Screen: 0%
* Selection vector remover: 5%
* Sort: 70%
* Scanner: 25%

That is, all operators in a fragment together sum to 100%. One cannot make the SVR, say, any more "busy" without making the others less "busy". So, perhaps a better name is "% CPU".
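
As a rough illustration of the calculation, here is a minimal Python sketch. The function name, the dictionary layout, and the nanosecond values are all made up for the example (chosen to reproduce the percentages in the list above); they are not taken from Drill's profile format.

```python
def percent_of_fragment(op_nanos):
    """Return each operator's share of its fragment's total time, in percent."""
    total = sum(op_nanos.values())
    if total == 0:
        # No time recorded at all: report 0% everywhere rather than divide by zero.
        return {name: 0.0 for name in op_nanos}
    return {name: 100.0 * t / total for name, t in op_nanos.items()}


# Hypothetical per-operator processing times (ns) within one fragment.
fragment = {
    "Screen": 0,
    "Selection vector remover": 50,
    "Sort": 700,
    "Scanner": 250,
}

shares = percent_of_fragment(fragment)
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")
```

Because every share is divided by the same fragment total, the shares always add up to 100% (when any time was recorded), which is the "one operator cannot get busier without another getting less busy" property.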

A question arises when the query has more than one fragment. In this case, the times across the whole query still sum to 100%, but the fragments might account for, say, 40% and 60% of CPU time. Would it then make sense to display the % CPU relative to the fragment or to the entire query? For debugging, % of fragment is most useful. That is, we want to reduce fragment run time and must do that per-fragment; the length of time spent in other fragments has no impact on the performance of our target fragment.

For the customer, % of total query run time might be useful. Customers just want to know where time goes, regardless of our parallelization/serialization rules.
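
To make the two views concrete, here is a small Python sketch that computes both percentages for each operator. Everything in it is an assumption for illustration: the fragment names, operator names, and times are invented, and "query total" is taken to be the simple sum of operator times over all fragments, not wall-clock time.

```python
def cpu_shares(fragments):
    """Return rows of (fragment, operator, % of fragment, % of query)."""
    # Assumption: query total is the sum of operator times across all fragments.
    query_total = sum(sum(ops.values()) for ops in fragments.values())
    rows = []
    for frag, ops in fragments.items():
        frag_total = sum(ops.values())
        for op, t in ops.items():
            pct_frag = 100.0 * t / frag_total if frag_total else 0.0
            pct_query = 100.0 * t / query_total if query_total else 0.0
            rows.append((frag, op, pct_frag, pct_query))
    return rows


# Hypothetical times (ns): fragment 0 is 40% of the query, fragment 1 is 60%.
fragments = {
    "fragment 0": {"Screen": 0, "Sort": 300, "Receiver": 100},
    "fragment 1": {"Scanner": 450, "Sender": 150},
}

rows = cpu_shares(fragments)
for frag, op, pct_frag, pct_query in rows:
    print(f"{frag} {op}: {pct_frag:.0f}% of fragment, {pct_query:.0f}% of query")
```

With these numbers, Sort is 75% of its own fragment but only 30% of the query. The per-fragment figure is the one that points at what to tune; the per-query figure is the one that answers "where did the time go".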

> Publish Operator and MajorFragment Stats in Profile page
> --------------------------------------------------------
>
>                 Key: DRILL-5195
>                 URL: https://issues.apache.org/jira/browse/DRILL-5195
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Web Server
>    Affects Versions: 1.9.0
>            Reporter: Kunal Khatua
>            Assignee: Kunal Khatua
>
> Currently, we show runtimes for major fragments, and min,max,avg times for setup, processing and waiting for various operators.
> It would be worthwhile to have additional stats for the following:
> MajorFragment
>   %Busy - % of the active time for all the minor fragments within each major fragment that they were busy. 
> Operator Profile
>   %Busy - % of the active time for all the fragments within each operator that they were busy. 
>   Records - Total number of records propagated out by that operator.


