Posted to issues@drill.apache.org by "Arina Ielchiieva (Jira)" <ji...@apache.org> on 2019/11/04 12:54:00 UTC
[jira] [Updated] (DRILL-7136) Num_buckets for HashAgg in profile may be inaccurate
[ https://issues.apache.org/jira/browse/DRILL-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arina Ielchiieva updated DRILL-7136:
------------------------------------
Fix Version/s: (was: 1.17.0)
> Num_buckets for HashAgg in profile may be inaccurate
> ----------------------------------------------------
>
> Key: DRILL-7136
> URL: https://issues.apache.org/jira/browse/DRILL-7136
> Project: Apache Drill
> Issue Type: Bug
> Components: Tools, Build & Test
> Affects Versions: 1.16.0
> Reporter: Robert Hou
> Assignee: Boaz Ben-Zvi
> Priority: Major
> Attachments: 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0.sys.drill
>
>
> I ran TPC-H query 17 at scale factor 1000. Here is the query:
> {noformat}
> select
> sum(l.l_extendedprice) / 7.0 as avg_yearly
> from
> lineitem l,
> part p
> where
> p.p_partkey = l.l_partkey
> and p.p_brand = 'Brand#13'
> and p.p_container = 'JUMBO CAN'
> and l.l_quantity < (
> select
> 0.2 * avg(l2.l_quantity)
> from
> lineitem l2
> where
> l2.l_partkey = p.p_partkey
> );
> {noformat}
> One of the hash agg operators has resized 6 times. If the hash table starts at 64K buckets and doubles on each resize, it should have 4M buckets (64K * 2^6), but the profile shows only 64K buckets.
> I have attached a sample profile. In this profile, the hash agg operator is (04-02).
> {noformat}
> Operator Metrics
> Minor Fragment NUM_BUCKETS NUM_ENTRIES NUM_RESIZING RESIZING_TIME_MS NUM_PARTITIONS SPILLED_PARTITIONS SPILL_MB SPILL_CYCLE INPUT_BATCH_COUNT AVG_INPUT_BATCH_BYTES AVG_INPUT_ROW_BYTES INPUT_RECORD_COUNT OUTPUT_BATCH_COUNT AVG_OUTPUT_BATCH_BYTES AVG_OUTPUT_ROW_BYTES OUTPUT_RECORD_COUNT
> 04-00-02 65,536 748,746 6 364 1 582 0 813 582,653 18 26,316,456 401 1,631,943 25 26,176,350
> {noformat}
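The 4M figure in the report can be checked with a little arithmetic. The sketch below is only an illustration, not Drill code: it assumes the hash table starts at 65,536 buckets and doubles on every resize, which is what the report's numbers imply. The function name `expected_buckets` is hypothetical.

```python
# Assumption: the table starts at 65,536 buckets and doubles on each resize.
INITIAL_BUCKETS = 65_536


def expected_buckets(initial: int, num_resizes: int) -> int:
    """Bucket count after `num_resizes` doublings of a table of size `initial`."""
    return initial << num_resizes


reported = 65_536  # NUM_BUCKETS shown in the profile
resizes = 6        # NUM_RESIZING shown in the profile

expected = expected_buckets(INITIAL_BUCKETS, resizes)
print(f"expected={expected:,} reported={reported:,} ratio={expected // reported}")
# prints: expected=4,194,304 reported=65,536 ratio=64
```

Under that assumption the reported NUM_BUCKETS equals the starting size, as if the metric had not been refreshed after any of the six resizes.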
--
This message was sent by Atlassian Jira
(v8.3.4#803005)