Posted to issues@hive.apache.org by "Vineet Garg (JIRA)" <ji...@apache.org> on 2018/08/11 04:14:00 UTC
[jira] [Updated] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter
[ https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineet Garg updated HIVE-20366:
-------------------------------
Description:
In Query 78, there are left outer joins between fact table pairs: store_sales LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ web_returns. The "is null" filter on top of each of these joins is estimated at only a single row, so the result is BROADCAST, which causes hash table memory errors.
{code}
Reducer 12 |
| Execution mode: vectorized, llap |
| Reduce Operator Tree: |
+----------------------------------------------------+
| Explain |
+----------------------------------------------------+
| Map Join Operator |
| condition map: |
| Left Outer Join 0 to 1 |
| keys: |
| 0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
| 1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
| outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8 |
| input vertices: |
| 1 Map 14 |
| Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE |
| Filter Operator |
| predicate: _col8 is null (type: boolean) |
| * Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE |
{code}
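To make the size of the gap concrete, here is a minimal Python sketch of the arithmetic. It is not Hive's estimator. The function names, the base-table figures, and the 90% unmatched share are all hypothetical, and the idea that the planner derives the "is null" selectivity from the base-table null count of the right-side column is an assumption that this report does not confirm. The only number taken from the plan above is the join's row count.

```python
# Hypothetical illustration only; none of these names exist in Hive.

JOIN_ROWS = 10282477384  # "Num rows" of the Map Join Operator in the plan above


def naive_is_null_estimate(join_rows, base_null_count, base_row_count):
    """Scale the join output by the base-table null fraction of the column.

    Assumed behaviour: if the right-side key has no NULLs in the base table,
    the fraction is zero and the estimate is clamped to a single row.
    """
    if base_row_count == 0:
        return 1
    return max(1, join_rows * base_null_count // base_row_count)


def outer_join_aware_estimate(join_rows, unmatched_percent):
    """Count left rows without a match, since the outer join pads them with NULLs."""
    return max(1, join_rows * unmatched_percent // 100)


# Made-up base-table stats: the column is never NULL in the returns table.
naive = naive_is_null_estimate(JOIN_ROWS, base_null_count=0, base_row_count=1000)

# Made-up share of sales rows that have no matching return.
aware = outer_join_aware_estimate(JOIN_ROWS, unmatched_percent=90)

print(naive, aware)
```

Under these assumed inputs the naive estimate collapses to one row while the outer-join-aware one stays in the billions, which is the kind of gap that would make a broadcast look safe when it is not.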
> TPC-DS query78 stats estimates are off for is null filter
> ---------------------------------------------------------
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major