You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Krisztian Kasa (Jira)" <ji...@apache.org> on 2021/10/12 11:31:00 UTC

[jira] [Commented] (HIVE-25606) The global limit invalidates the subquery order by

    [ https://issues.apache.org/jira/browse/HIVE-25606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427628#comment-17427628 ] 

Krisztian Kasa commented on HIVE-25606:
---------------------------------------

It seems that the order by clause is intentionally dropped during parsing:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc) a limit 10;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
  
TOK_QUERY
   TOK_FROM
      TOK_SUBQUERY
         TOK_QUERY
            TOK_FROM
               TOK_TABREF
                  TOK_TABNAME
                     test_1009
            TOK_INSERT
               TOK_DESTINATION
                  TOK_DIR
                     TOK_TMP_FILE
               TOK_SELECT
                  TOK_SELEXPR
                     TOK_ALLCOLREF
         a
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_ALLCOLREF
      TOK_LIMIT
         10
{code}
When the Limit is moved after Order by to the subquery Hive parses the Order by too since it has an affect of the overall query result:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc limit 10) a;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
  
TOK_QUERY
   TOK_FROM
      TOK_SUBQUERY
         TOK_QUERY
            TOK_FROM
               TOK_TABREF
                  TOK_TABNAME
                     test_1009
            TOK_INSERT
               TOK_DESTINATION
                  TOK_DIR
                     TOK_TMP_FILE
               TOK_SELECT
                  TOK_SELEXPR
                     TOK_ALLCOLREF
               TOK_ORDERBY
                  TOK_TABSORTCOLNAMEDESC
                     TOK_NULLS_FIRST
                        TOK_TABLE_OR_COL
                           id
               TOK_LIMIT
                  10
         a
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_ALLCOLREF
{code}
Some other database engines also drop the order by clause without limit from subquery or throw exception.
 As far as I know SQL standard allows Order by in the main query only but many database engines allows it in subquery however it is not guaranteed that it has an effect.

> The global limit invalidates the subquery order by
> --------------------------------------------------
>
>                 Key: HIVE-25606
>                 URL: https://issues.apache.org/jira/browse/HIVE-25606
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 3.1.1
>            Reporter: jinwensc
>            Priority: Major
>
> create table test_1009(id int);
> insert into table test_1009 values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13);
> select * from (select * from test_1009 order by id desc) a limit 10;
>  
> +-------+
> | a.id |
> +-------+
> | 1 |
> | 2 |
> | 3 |
> | 4 |
> | 5 |
> | 6 |
> | 7 |
> | 8 |
> | 9 |
> | 10 |
> +–+
>  
> -----------------------
> the right result should be 
> +-------+
> | a.id |
> +-------+
> | 13 |
> | 12 |
> | 11 |
> | 10 |
> | 9 |
> | 8 |
> | 7 |
> | 6 |
> | 5 |
> | 4 |
> +-------+
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)