Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/04/13 04:10:00 UTC

[jira] [Commented] (HIVE-19192) HiveServer2 query compilation : query compilation time increases if sql has multiple unions

    [ https://issues.apache.org/jira/browse/HIVE-19192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436770#comment-16436770 ] 

Gopal V commented on HIVE-19192:
--------------------------------

UNION ALL doesn't seem to suffer from the same problem. In Hive 1.2+, a bare UNION means UNION DISTINCT rather than UNION ALL, so these queries go through the optimizer path for UNION DISTINCT.
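
To compare the two paths side by side, a repro query can be generated with either operator. The following is only a minimal sketch and not the attached .q files: the table name `src`, the column `key`, and the helper `build_union_query` are hypothetical names chosen for illustration.

```python
def build_union_query(branches, distinct=True, table="src", column="key"):
    """Build an EXPLAIN statement whose query chains `branches` SELECTs.

    `table` and `column` are placeholder names (assumptions), not taken
    from the attached .q files.
    """
    if branches < 1:
        raise ValueError("branches must be >= 1")
    # A bare UNION means UNION DISTINCT in Hive 1.2+; spell it out for clarity.
    operator = "UNION DISTINCT" if distinct else "UNION ALL"
    selects = [
        f"SELECT {column} FROM {table} WHERE {column} = {i}"
        for i in range(branches)
    ]
    return "EXPLAIN\n" + f"\n{operator}\n".join(selects)


if __name__ == "__main__":
    # Print a small UNION ALL variant; pass distinct=True for the slow path.
    print(build_union_query(3, distinct=False))
```

Running the two generated variants with the same number of branches and comparing the elapsed EXPLAIN time should show whether the slowdown is specific to the UNION DISTINCT path.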

> HiveServer2 query compilation : query compilation time increases if sql has multiple unions 
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19192
>                 URL: https://issues.apache.org/jira/browse/HIVE-19192
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive, HiveServer2
>    Affects Versions: 1.2.1, 2.1.0
>         Environment: Hive-1.2.1
> Hive-2.1.0
>  
>            Reporter: Rajkumar Singh
>            Priority: Major
>         Attachments: query-with-100-union.q, query-with-200-union.q, query-with-50-union.q
>
>
> Query compilation time suffers badly when the SQL has many unions. Here is a simple reproduction of the problem: attached are queries with 50, 100, and 200 unions (forgive me for this bad SQL). When I run EXPLAIN against HiveServer2, the compilation time increases many fold.
> {code}
> query-with-50-union.q
> 1,671 rows selected (10.662 seconds)
> query-with-100-union.q
> 3,321 rows selected (101.709 seconds)
> query-with-200-union.q
> 6,588 rows selected (1074.487 seconds)
> {code}
> Running such SQL against HiveServer2 can starve other queries, because they have to wait to enter the single-threaded compilation stage.
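
The growth rate implied by the timings quoted above can be estimated with a little arithmetic. The sketch below assumes the elapsed time follows a power law t ~ n^k in the number of unions n, which is only a modelling assumption, and computes the local exponent k between consecutive measurements. The helper name `growth_exponents` is made up for this example.

```python
import math

# Timings reported in the issue description: (number of unions, seconds).
timings = [(50, 10.662), (100, 101.709), (200, 1074.487)]


def growth_exponents(samples):
    """Return the local exponent k for each consecutive pair, assuming t ~ n**k."""
    exponents = []
    for (n0, t0), (n1, t1) in zip(samples, samples[1:]):
        exponents.append(math.log(t1 / t0) / math.log(n1 / n0))
    return exponents


if __name__ == "__main__":
    pairs = zip(timings, timings[1:], growth_exponents(timings))
    for (n0, _), (n1, _), k in pairs:
        print(f"{n0} -> {n1} unions: exponent ~ {k:.2f}")
```

With these three data points, each doubling of the union count costs roughly 10x in elapsed time, which corresponds to an exponent a little above 3 over this range.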


