Posted to issues@drill.apache.org by "Venki Korukanti (JIRA)" <ji...@apache.org> on 2015/01/13 02:47:34 UTC

[jira] [Commented] (DRILL-1992) Add more stats for HashJoinBatch and HashAggBatch

    [ https://issues.apache.org/jira/browse/DRILL-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274578#comment-14274578 ] 

Venki Korukanti commented on DRILL-1992:
----------------------------------------

Sample query used to analyze the stats, followed by its output:

{code:sql}
SELECT
  CASE
    WHEN metric['metricId'] = 0 THEN 'NUM_BUCKETS'
    WHEN metric['metricId'] = 1 THEN 'NUM_ENTRIES'
    WHEN metric['metricId'] = 2 THEN 'NUM_RESIZES'
    WHEN metric['metricId'] = 4 THEN 'HT_MEMORY'
    WHEN metric['metricId'] = 5 THEN 'NUM_BHOLDERS'
    WHEN metric['metricId'] = 6 THEN 'HJH_MEMORY'
  END,
  SUM(metric['longValue']) AS AggMetricValue
FROM
  (SELECT minorFragId,
          opProfile['operatorType'] AS opType,
          flatten(opProfile['metric']) AS metric
   FROM
     (SELECT minorFragProfile['minorFragmentId'] AS minorFragId,
             flatten(minorFragProfile['operatorProfile']) AS opProfile
      FROM
        (SELECT flatten(majorFragment['minorFragmentProfile']) AS minorFragProfile
         FROM
           (SELECT flatten(fragmentProfile) AS majorFragment FROM dfs.`/tmp/a.json`)
         -- WHERE majorFragment['majorFragmentId'] = 1 -- if we are interested in ops in a particular major fragment
        )
     )
  )
WHERE
  metric['metricId'] IN (0, 1, 2, 4, 5, 6)
  AND opType = 4 -- Change to 3 for HashAgg stats
GROUP BY
  metric['metricId']
ORDER BY
  metric['metricId'];
{code}

{code}
+--------------+----------------+
| EXPR$0       | AggMetricValue |
+--------------+----------------+
| NUM_BUCKETS  | 6291456        |
| NUM_ENTRIES  | 4000000        |
| NUM_RESIZES  | 24             |
| HT_MEMORY    | 81395712       |
| NUM_BHOLDERS | 66             |
| HJH_MEMORY   | 33301504       |
+--------------+----------------+
{code}
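
For anyone who wants to cross-check the numbers outside Drill, the nested flatten calls in the query can be read as nested loops over the profile JSON. The following is only a rough sketch under a few assumptions: the profile is a single JSON object that uses the same field names as the query (fragmentProfile, minorFragmentProfile, operatorProfile, metric, metricId, longValue), and operatorType 4 / 3 mean hash join / hash agg as in the query comment. The function name aggregate_metrics and the sample data are made up for illustration.

```python
import json
from collections import defaultdict

# Metric ids and labels exactly as used in the query; id 3 is not queried there.
METRIC_NAMES = {
    0: "NUM_BUCKETS",
    1: "NUM_ENTRIES",
    2: "NUM_RESIZES",
    4: "HT_MEMORY",
    5: "NUM_BHOLDERS",
    6: "HJH_MEMORY",
}


def aggregate_metrics(profile, operator_type=4, major_fragment_id=None):
    """Sum longValue per metricId for one operator type over all minor fragments.

    Hypothetical helper: each loop level mirrors one flatten() in the query.
    """
    totals = defaultdict(int)
    for major in profile.get("fragmentProfile", []):
        # Optional filter, like the commented-out WHERE clause in the query.
        if major_fragment_id is not None and major.get("majorFragmentId") != major_fragment_id:
            continue
        for minor in major.get("minorFragmentProfile", []):
            for op in minor.get("operatorProfile", []):
                if op.get("operatorType") != operator_type:
                    continue
                for metric in op.get("metric", []):
                    metric_id = metric.get("metricId")
                    if metric_id in METRIC_NAMES and "longValue" in metric:
                        totals[metric_id] += metric["longValue"]
    # Ordered by metric id, like the ORDER BY in the query.
    return [(METRIC_NAMES[i], totals[i]) for i in sorted(totals)]


# Tiny made-up profile, only to show the shape the function expects.
SAMPLE_PROFILE = {
    "fragmentProfile": [
        {"majorFragmentId": 1, "minorFragmentProfile": [
            {"minorFragmentId": 0, "operatorProfile": [
                {"operatorType": 4, "metric": [
                    {"metricId": 0, "longValue": 65536},
                    {"metricId": 1, "longValue": 1000},
                    {"metricId": 3, "longValue": 7},
                ]},
                {"operatorType": 3, "metric": [
                    {"metricId": 0, "longValue": 10},
                ]},
            ]},
            {"minorFragmentId": 1, "operatorProfile": [
                {"operatorType": 4, "metric": [
                    {"metricId": 0, "longValue": 65536},
                    {"metricId": 1, "longValue": 500},
                ]},
            ]},
        ]},
    ]
}

if __name__ == "__main__":
    # For a real profile, something like json.load(open("/tmp/a.json")) would
    # replace SAMPLE_PROFILE, assuming the file holds one JSON object.
    for name, value in aggregate_metrics(SAMPLE_PROFILE):
        print(name, value)
```

Passing operator_type=3 gives the HashAgg view, and major_fragment_id=1 corresponds to uncommenting the WHERE clause in the query.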



> Add more stats for HashJoinBatch and HashAggBatch
> -------------------------------------------------
>
>                 Key: DRILL-1992
>                 URL: https://issues.apache.org/jira/browse/DRILL-1992
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Operators
>    Affects Versions: 0.8.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 0.8.0
>
>
> Adding more stats to analyze the memory usage of HashJoinBatch and HashAggBatch.
> HashJoinBatch
>   + HASHTABLE_MEMORY_ALLOCATION
>   + HASHTABLE_NUM_BATCHHOLDERS
>   + HASHJOINHELPER_MEMORY
> HashAgg
>   + HASHTABLE_MEMORY_ALLOCATION
>   + HASHTABLE_NUM_BATCHHOLDERS
>   + HASHAGG_MEMORY
>   + HASHAGG_NUM_BATCHHOLDERS
> Cleanup:
>   + Prefix "HASHTABLE_" to existing HashTable metrics such as NUM_BUCKETS.


