Posted to issues@hive.apache.org by "Krisztian Kasa (Jira)" <ji...@apache.org> on 2022/10/30 06:20:00 UTC

[jira] [Resolved] (HIVE-26671) Incorrect results for group by/order by/limit query with 2 aggregates

     [ https://issues.apache.org/jira/browse/HIVE-26671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Kasa resolved HIVE-26671.
-----------------------------------
    Resolution: Fixed

Merged to master. Thanks [~scarlin] for the patch.

> Incorrect results for group by/order by/limit query with 2 aggregates
> ---------------------------------------------------------------------
>
>                 Key: HIVE-26671
>                 URL: https://issues.apache.org/jira/browse/HIVE-26671
>             Project: Hive
>          Issue Type: Bug
>          Components: Operators
>            Reporter: Steve Carlin
>            Assignee: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Grabbed this query from the Impala test suite.  It runs against tpcds tables but is otherwise unremarkable.  Reproducing the problem requires a large amount of data, though.
> select
> l_orderkey,
> min(l_shipdate) as flt,
> count(distinct l_partkey) as cnl 
> from lineitem
> group by l_orderkey order by l_orderkey limit 2;
> The issue is in the optimizer's handling of the Top N Key operator. The Top N Key operator is the first operator after the Table Scan.  Its sort key covers both the l_orderkey and l_partkey columns, which means that the second sort key might not be forwarded.
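
The description above can be illustrated with a rough sketch of why the choice of sort key in a top-N key filter matters. This is not Hive's operator code and may not match the exact failure mode fixed by the patch. The rows, the helper names (aggregate, top_n_key_filter), and the batch-style filter are all assumptions made for illustration. It only shows that filtering on the composite key (l_orderkey, l_partkey) before grouping on l_orderkey can drop rows the aggregates still need.

```python
from collections import defaultdict

# Hypothetical rows: (l_orderkey, l_partkey, l_shipdate)
ROWS = [
    (1, 30, "1996-01-29"),
    (1, 10, "1996-04-12"),
    (1, 20, "1996-03-13"),
    (2, 40, "1997-01-28"),
    (3, 50, "1994-02-02"),
]


def aggregate(rows, limit):
    """GROUP BY l_orderkey ORDER BY l_orderkey LIMIT limit, with
    min(l_shipdate) and count(distinct l_partkey) per group."""
    groups = defaultdict(lambda: {"dates": [], "parts": set()})
    for orderkey, partkey, shipdate in rows:
        groups[orderkey]["dates"].append(shipdate)
        groups[orderkey]["parts"].add(partkey)
    result = [
        (orderkey, min(g["dates"]), len(g["parts"]))
        for orderkey, g in sorted(groups.items())
    ]
    return result[:limit]


def top_n_key_filter(rows, key, n):
    """Simplified (non-streaming) stand-in for a top-N key filter:
    keep only rows whose key is among the n smallest distinct keys."""
    keep = set(sorted({key(r) for r in rows})[:n])
    return [r for r in rows if key(r) in keep]


# Reference result without any early filtering.
expected = aggregate(ROWS, 2)

# Filtering on the group-by key alone keeps every row of the first two groups.
safe = aggregate(top_n_key_filter(ROWS, lambda r: r[0], 2), 2)

# Filtering on (l_orderkey, l_partkey) keeps only two (order, part) pairs,
# so rows needed by min() and count(distinct) are lost before aggregation.
unsafe = aggregate(top_n_key_filter(ROWS, lambda r: (r[0], r[1]), 2), 2)

print("expected:", expected)
print("safe:    ", safe)
print("unsafe:  ", unsafe)
```

In this toy data the composite-key filter returns a single group with a later minimum ship date and a smaller distinct count than the unfiltered query, while the filter on l_orderkey alone matches the reference result.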



--
This message was sent by Atlassian Jira
(v8.20.10#820010)