You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "Hans Zeller (JIRA)" <ji...@apache.org> on 2018/02/21 18:15:00 UTC
[jira] [Created] (TRAFODION-2965) Hash partial groupby does not
report a row count in operator statistics
Hans Zeller created TRAFODION-2965:
--------------------------------------
Summary: Hash partial groupby does not report a row count in operator statistics
Key: TRAFODION-2965
URL: https://issues.apache.org/jira/browse/TRAFODION-2965
Project: Apache Trafodion
Issue Type: Bug
Components: sql-exe
Affects Versions: 2.0-incubating
Environment: any
Reporter: Hans Zeller
Assignee: Hans Zeller
Fix For: 2.4
Here is a test case that demonstrates this:
{noformat}
update statistics for table hive.hive.time_dim on every column;
prepare s from
select count(distinct t_time_id) from hive.hive.time_dim;
explain options 'f' s;
execute s;
get statistics for qid current default;
{noformat}
The actual statistics show a "0" as the row count for the lowest EX_HASH_GRBY with id 2:
{noformat}
LC RC Id PaId ExId Frag TDB Name DOP Dispatches OperCpuTime EstRowsUsed ActRowsUsed ActDataUsed Details
13 . 14 . 8 0 EX_ROOT 1 2 37 0 1 8 3658
12 . 13 14 7 0 EX_SORT_GRBY 1 5 61 1 1 8
11 . 12 13 6 0 EX_SPLIT_TOP 1 8 171 1 4 32
10 . 11 12 6 0 EX_SEND_TOP 4 16 3,389 1 4 64
9 . 10 11 6 2 EX_SEND_BOTTOM 4 12 683 1 4 64
8 . 9 10 6 2 EX_SPLIT_BOTTOM 4 19 597 1 4 32 833874
7 . 8 9 5 2 EX_SORT_GRBY 4 2,259 289,200 1 4 32
6 . 7 8 4 2 EX_HASH_GRBY 4 2,259 394,235 1451 86,400 2,765,059,200 0|0|0
5 . 6 7 3 2 EX_SPLIT_TOP 4 2,211 51,034 110007 86,400 2,765,059,200
4 . 5 6 3 2 EX_SEND_TOP 8 2,436 98,125 110007 86,400 2,766,528,000
3 . 4 5 3 3 EX_SEND_BOTTOM 8 10,658 196,714 110007 86,400 2,766,528,000
2 . 3 4 3 3 EX_SPLIT_BOTTOM 2 2,663 95,246 110007 86,400 2,765,059,200 1547521
1 . 2 3 2 3 EX_HASH_GRBY 2 4,521 303,319 55003.5 0 0
. . 1 2 1 3 EX_HDFS_SCAN 2 2,650 952,242 116085 86,400 2,765,059,200 HIVE.HIVE.TIME_DIM|86400|5288324
{noformat}
The reason is that the hash groupby reports its row count in the BMO stats. However, a partial hash groupby is not considered a BMO (Big Memory Operator), so no rowcount gets reported. The fix is to increment the rowcount in the generic stats entry that is present in both partial and full groupby operators.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)