Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/02/23 15:28:49 UTC
[GitHub] [hive] dengzhhu653 edited a comment on pull request #2585: HIVE-25448: Invalid partition columns when skew with distinct
dengzhhu653 edited a comment on pull request #2585:
URL: https://github.com/apache/hive/pull/2585#issuecomment-1048902780
I found something interesting: when I explain `select col1, count(distinct col2) from partition_distinct_skew group by col1;` on the master branch, the output is as follows:
```
Vertices:
  Map 1
      Map Operator Tree:
          TableScan
            alias: partition_distinct_skew
            Statistics: Num rows: 3 Data size: 510 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: col1 (type: string), col2 (type: string)
              outputColumnNames: col1, col2
              Statistics: Num rows: 3 Data size: 510 Basic stats: COMPLETE Column stats: COMPLETE
              Group By Operator
                keys: col1 (type: string), col2 (type: string)
                minReductionHashAggr: 0.4
                mode: hash
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 2 Data size: 340 Basic stats: COMPLETE Column stats: COMPLETE
                Reduce Output Operator
                  key expressions: _col0 (type: string), _col1 (type: string)
                  null sort order: zz
                  sort order: ++
                  Map-reduce partition columns: rand() (type: double)
                  Statistics: Num rows: 2 Data size: 340 Basic stats: COMPLETE Column stats: COMPLETE
```
The partition column is **rand()** in this case. It seems something has already been done to improve the skew case, though I have not been able to find where this behavior comes from.
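To make the effect of a `rand()` partition column concrete, here is a toy Python model of routing rows to reducers. It is only a sketch and not Hive's code: the function names, the CRC32 hash, the reducer count of 4, and the synthetic rows are all assumptions made for illustration. It compares routing by `col1` with routing by a random value.

```python
import random
import zlib
from collections import defaultdict


def partition_by_columns(rows, num_reducers, key_fn):
    """Route each row by a deterministic hash of the chosen columns (hypothetical helper)."""
    buckets = defaultdict(list)
    for row in rows:
        key = "\x00".join(key_fn(row)).encode("utf-8")
        buckets[zlib.crc32(key) % num_reducers].append(row)
    return dict(buckets)


def partition_by_rand(rows, num_reducers, seed=0):
    """Route each row to a random reducer, mimicking a rand() partition column."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for row in rows:
        buckets[rng.randrange(num_reducers)].append(row)
    return dict(buckets)


def summed_distinct(buckets):
    """Sum the per-reducer counts of distinct col2 values."""
    return sum(len({col2 for _, col2 in rows}) for rows in buckets.values())


# Synthetic skewed input: one hot col1 value with 10 distinct col2 values.
rows = [("hot", str(i % 10)) for i in range(1000)]

by_col1 = partition_by_columns(rows, 4, lambda r: (r[0],))
by_rand = partition_by_rand(rows, 4)

print("by col1:", len(by_col1), "reducers used, summed distinct =", summed_distinct(by_col1))
print("by rand:", len(by_rand), "reducers used, summed distinct =", summed_distinct(by_rand))
```

In this model, routing by `col1` sends every row of the hot key to a single reducer, while random routing spreads the load but lets the same `(col1, col2)` pair reach several reducers, so per-reducer distinct counts cannot simply be added up without a further aggregation step.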