You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@hive.apache.org by "Krisztian Kasa (Jira)" <ji...@apache.org> on 2022/05/04 14:14:00 UTC

[jira] [Assigned] (HIVE-25589) SQL: Implement HAVING/QUALIFY predicates for ROW_NUMBER()=1

     [ https://issues.apache.org/jira/browse/HIVE-25589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Kasa reassigned HIVE-25589:
-------------------------------------

    Assignee: Krisztian Kasa

> SQL: Implement HAVING/QUALIFY predicates for ROW_NUMBER()=1
> -----------------------------------------------------------
>
>                 Key: HIVE-25589
>                 URL: https://issues.apache.org/jira/browse/HIVE-25589
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO, SQL
>    Affects Versions: 4.0.0
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Krisztian Kasa
>            Priority: Major
>
> The insert queries which use a row_num()=1 function are inconvenient to write or port from an existing workload, because there is no easy way to ignore a column in this pattern.
> {code}
> INSERT INTO main_table 
> SELECT * from duplicated_table
> QUALIFY ROW_NUMER() OVER (PARTITION BY event_id) = 1;
> {code}
> needs to be rewritten into
> {code}
> INSERT INTO main_table
> select event_id, event_ts, event_attribute, event_metric1, event_metric2, event_metric3, event_metric4, .., event_metric43 from 
> (select *, ROW_NUMBER() OVER (PARTITION BY event_id) as rnum from duplicated_table)
> where rnum=1;
> {code}
> This is a time-consuming and error-prone rewrite (dealing with a messed up order of columns between one source and dest table).
> An alternate rewrite would be to do the same or similar syntax using HAVING. 
> {code}
> INSERT INTO main_table 
> SELECT * from duplicated_table
> HAVING ROW_NUMER() OVER (PARTITION BY event_id) = 1;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)