You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Krisztian Kasa (Jira)" <ji...@apache.org> on 2021/10/12 11:31:00 UTC
[jira] [Commented] (HIVE-25606) The global limit invalidates the
subquery order by
[ https://issues.apache.org/jira/browse/HIVE-25606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427628#comment-17427628 ]
Krisztian Kasa commented on HIVE-25606:
---------------------------------------
It seems that the order by clause is intentionally dropped during parsing:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc) a limit 10;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_1009
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
a
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
TOK_LIMIT
10
{code}
When the Limit is moved after Order by to the subquery Hive parses the Order by too since it has an affect of the overall query result:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc limit 10) a;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_1009
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
TOK_ORDERBY
TOK_TABSORTCOLNAMEDESC
TOK_NULLS_FIRST
TOK_TABLE_OR_COL
id
TOK_LIMIT
10
a
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
{code}
Some other database engines also drop the order by clause without limit from subquery or throw exception.
As far as I know SQL standard allows Order by in the main query only but many database engines allows it in subquery however it is not guaranteed that it has an effect.
> The global limit invalidates the subquery order by
> --------------------------------------------------
>
> Key: HIVE-25606
> URL: https://issues.apache.org/jira/browse/HIVE-25606
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 3.1.1
> Reporter: jinwensc
> Priority: Major
>
> create table test_1009(id int);
> insert into table test_1009 values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13);
> select * from (select * from test_1009 order by id desc) a limit 10;
>
> +-------+
> | a.id |
> +-------+
> | 1 |
> | 2 |
> | 3 |
> | 4 |
> | 5 |
> | 6 |
> | 7 |
> | 8 |
> | 9 |
> | 10 |
> +–+
>
> -----------------------
> the right result should be
> +-------+
> | a.id |
> +-------+
> | 13 |
> | 12 |
> | 11 |
> | 10 |
> | 9 |
> | 8 |
> | 7 |
> | 6 |
> | 5 |
> | 4 |
> +-------+
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)