Posted to issues@hive.apache.org by "Rajat Khandelwal (JIRA)" <ji...@apache.org> on 2016/05/10 14:26:13 UTC

[jira] [Commented] (HIVE-13727) Getting error Failed rule: 'orderByClause clusterByClause distributeByClause sortByClause limitClause can only be applied to the whole union.' in subquery

    [ https://issues.apache.org/jira/browse/HIVE-13727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278165#comment-15278165 ] 

Rajat Khandelwal commented on HIVE-13727:
-----------------------------------------

The rejection of the first query doesn't seem justified. It is a typical top-n query whose data happens to be split across two tables. An optimized way to compute the top n is to take the top n rows from each table and then take the top n of those 2n rows, rather than sorting the entire union of the two tables.
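
To illustrate why the two-stage approach returns the same answer, here is a minimal Python sketch, not Hive code. The function names and the sample key values are hypothetical, and the lists simply stand in for the `key` column of the two `srcpart` partitions.

```python
import heapq


def top_n_of_union(table_a, table_b, n):
    """Reference approach: sort the whole union, then keep the n smallest keys."""
    return sorted(table_a + table_b)[:n]


def top_n_via_partials(table_a, table_b, n):
    """Two-stage approach: take the n smallest keys from each table,
    then the n smallest of the resulting (at most) 2n candidates."""
    candidates = heapq.nsmallest(n, table_a) + heapq.nsmallest(n, table_b)
    return heapq.nsmallest(n, candidates)


# Hypothetical key values standing in for the hr = '11' and hr = '14' partitions.
keys_hr11 = [238, 86, 311, 27, 165, 409, 255]
keys_hr14 = [98, 484, 265, 193, 401, 150, 66]

# Both approaches agree on the five smallest keys.
assert top_n_via_partials(keys_hr11, keys_hr14, 5) == top_n_of_union(keys_hr11, keys_hr14, 5)
```

The two results agree because any key among the n smallest of the union must also be among the n smallest of its own table, so the per-table limit never discards a row that the outer limit would keep.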



> Getting error Failed rule: 'orderByClause clusterByClause distributeByClause sortByClause limitClause can only be applied to the whole union.' in subquery 
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13727
>                 URL: https://issues.apache.org/jira/browse/HIVE-13727
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajat Khandelwal
>
> The error comes in the following query:
> {noformat}
> SELECT *
> FROM
>   (SELECT *
>    FROM srcpart a
>    WHERE a.ds = '2008-04-08'
>      AND a.hr = '11'
>    ORDER BY a.key LIMIT 5
>    UNION ALL
>    SELECT *
>    FROM srcpart b
>    WHERE b.ds = '2008-04-08'
>      AND b.hr = '14'
>    ORDER BY b.key LIMIT 5) subq
> ORDER BY KEY LIMIT 5
> {noformat}
> But the following query works:
> {noformat}
> SELECT *
> FROM
>   (SELECT *
>    FROM
>      (SELECT *
>       FROM srcpart a
>       WHERE a.ds = '2008-04-08'
>         AND a.hr = '11'
>       ORDER BY a.key LIMIT 5) pa
>    UNION ALL SELECT *
>    FROM
>      (SELECT *
>       FROM srcpart b
>       WHERE b.ds = '2008-04-08'
>         AND b.hr = '14'
>       ORDER BY b.key LIMIT 5) pb) subq
> ORDER BY KEY LIMIT 5
> {noformat}
> The queries are logically identical; the query that works merely wraps each sub-query in a dummy select * clause.


