You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Gopal Vijayaraghavan (Jira)" <ji...@apache.org> on 2021/10/01 17:49:00 UTC
[jira] [Created] (HIVE-25589) SQL: Implement HAVING/QUALIFY
predicates for ROW_NUMBER()=1
Gopal Vijayaraghavan created HIVE-25589:
-------------------------------------------
Summary: SQL: Implement HAVING/QUALIFY predicates for ROW_NUMBER()=1
Key: HIVE-25589
URL: https://issues.apache.org/jira/browse/HIVE-25589
Project: Hive
Issue Type: Improvement
Components: CBO, SQL
Affects Versions: 4.0.0
Reporter: Gopal Vijayaraghavan
The insert queries which use a row_num()=1 function are inconvenient to write or port from an existing workload, because there is no easy way to ignore a column in this pattern.
{code}
INSERT INTO main_table
SELECT * from duplicated_table
QUALIFY ROW_NUMER() OVER (PARTITION BY event_id) = 1;
{code}
needs to be rewritten into
{code}
INSERT INTO main_table
select event_id, event_ts, event_attribute, event_metric1, event_metric2, event_metric3, event_metric4, .., event_metric43 from
(select *, ROW_NUMBER() OVER (PARTITION BY event_id) as rnum from duplicated_table)
where rnum=1;
{code}
This is a time-consuming and error-prone rewrite (dealing with a messed up order of columns between one source and dest table).
An alternate rewrite would be to do the same or similar syntax using HAVING.
{code}
INSERT INTO main_table
SELECT * from duplicated_table
HAVING ROW_NUMER() OVER (PARTITION BY event_id) = 1;
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)