Posted to issues@hive.apache.org by "Ferdinand Xu (JIRA)" <ji...@apache.org> on 2017/10/13 02:30:00 UTC

[jira] [Comment Edited] (HIVE-17783) Hybrid Grace Hash Join has performance degradation for N-way join using Hive on Tez

    [ https://issues.apache.org/jira/browse/HIVE-17783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202985#comment-16202985 ] 

Ferdinand Xu edited comment on HIVE-17783 at 10/13/17 2:29 AM:
---------------------------------------------------------------

> It's possible it's slower even w/o accounting for sharing. 
Could you please expand a little bit and explain in more detail?

> The main motivation was actually avoiding OOMs as far as I understand.
Agreed. This feature makes map join more generally applicable, since it still works when the hash table cannot fit in memory.

> I don't think anyone is working on perf improvements right now.
Logically, it should have some performance benefit over the non-hybrid grace hash join, since it does not need to scan the big table again during the reprocessing phase when the hash table cannot fit in memory.
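
To make the argument concrete, here is a toy Python sketch of the idea. It is not Hive's implementation: the function name, the `max_in_memory` partition budget, and the in-process lists standing in for spill files are all assumptions made for illustration. It shows the small table being hash-partitioned, some partitions being kept in memory, and big-table rows that hash to a spilled partition being set aside during the single probe pass, so the reprocessing phase only touches the set-aside rows.

```python
from collections import defaultdict


def hybrid_grace_hash_join(small, big, num_partitions=4, max_in_memory=2):
    """Toy equi-join of two lists of (key, value) rows.

    max_in_memory is a hypothetical stand-in for the memory budget: only that
    many small-table partitions are kept as in-memory hash tables, and the
    rest are treated as spilled.
    """
    def part(key):
        return hash(key) % num_partitions

    # Build phase: partition the small table by the hash of the join key.
    small_parts = defaultdict(list)
    for key, value in small:
        small_parts[part(key)].append((key, value))

    in_memory = {}      # partition id -> hash table (key -> list of values)
    spilled_small = {}  # partition id -> rows (stands in for a file on disk)
    for pid, rows in small_parts.items():
        if len(in_memory) < max_in_memory:
            table = defaultdict(list)
            for key, value in rows:
                table[key].append(value)
            in_memory[pid] = table
        else:
            spilled_small[pid] = rows

    # Probe phase: one pass over the big table.
    results = []
    spilled_big = defaultdict(list)  # big-table rows set aside per partition
    big_rows_read = 0
    for key, value in big:
        big_rows_read += 1
        pid = part(key)
        if pid in in_memory:
            for small_value in in_memory[pid].get(key, []):
                results.append((key, small_value, value))
        elif pid in spilled_small:
            spilled_big[pid].append((key, value))
        # Otherwise the partition has no small-table rows, so no match exists.

    # Reprocessing phase: join each spilled partition pair. Only the rows set
    # aside earlier are read here; the big table itself is not scanned again.
    for pid, rows in spilled_small.items():
        table = defaultdict(list)
        for key, value in rows:
            table[key].append(value)
        for key, value in spilled_big.get(pid, []):
            for small_value in table.get(key, []):
                results.append((key, small_value, value))

    return results, big_rows_read
```

In this sketch `big_rows_read` always equals the number of big-table rows, however many small-table partitions are spilled. It says nothing about the extra cost of partitioning and spilling, which is where any slowdown would have to come from.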



was (Author: ferd):
Logically it should have some performance benefits over the non hybrid grace hash join since it didn't need to rescan the big table again during the reprocessing phase when hash table can not fit into the memory.

> Hybrid Grace Hash Join has performance degradation for N-way join using Hive on Tez
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-17783
>                 URL: https://issues.apache.org/jira/browse/HIVE-17783
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>         Environment: 8*Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> 1 master + 7 workers
> TPC-DS at 3TB data scales
> Hive version : 2.2.0
>            Reporter: Ferdinand Xu
>         Attachments: Hybrid_Grace_Hash_Join.xlsx, screenshot-1.png
>
>
> Most configurations use their default values. The benchmark compares enabling against disabling hybrid grace hash join using TPC-DS queries at the 3TB data scale. Many queries involving N-way joins show performance degradation over three test runs. Detailed results are attached.


