You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2022/01/27 10:40:00 UTC

[jira] [Commented] (HIVE-25905) ORDER BY colName DESC does not honour 'hive.default.nulls.last' property in absence of NULLS_LAST/FIRST

    [ https://issues.apache.org/jira/browse/HIVE-25905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483053#comment-17483053 ] 

Stamatis Zampetakis commented on HIVE-25905:
--------------------------------------------

Thanks for logging this [~asolimando]. In the absence of explicit null ordering the default behavior should be determined by the value of  {{hive.default.nulls.last}} property. Can you update the examples in the description to show what is the value of this property?

> ORDER BY colName DESC does not honour 'hive.default.nulls.last' property in absence of NULLS_LAST/FIRST
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25905
>                 URL: https://issues.apache.org/jira/browse/HIVE-25905
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 4.0.0
>            Reporter: Alessandro Solimando
>            Assignee: Alessandro Solimando
>            Priority: Major
>
> Consider the following table and data:
> {noformat}
> create table test1
> (
>   a string,
>   b timestamp,
>   c timestamp
> );
> INSERT INTO TABLE test1 VALUES
> ('John Doe', '1990-05-10 00:00:00.0', '2022-01-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-12-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-11-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-10-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-09-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2022-01-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2021-12-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2021-11-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2021-10-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', null);{noformat}
> Consider also the following query:
> {noformat}
> SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC) FROM test1; 
> {noformat}
> The output is:
> {noformat}
> John Doe	10/05/1990 00:00	10/01/2022 00:00	1
> John Doe	10/05/1990 00:00	10/12/2021 00:00	2
> John Doe	10/05/1990 00:00	10/11/2021 00:00	3
> John Doe	10/05/1990 00:00	10/10/2021 00:00	4
> John Doe	10/05/1990 00:00	10/09/2021 00:00	5
> John Doe	10/05/1987 00:00	NULL 	                1
> John Doe	10/05/1987 00:00	10/01/2022 00:00	2
> John Doe	10/05/1987 00:00	10/12/2021 00:00	3
> John Doe	10/05/1987 00:00	10/11/2021 00:00	4
> John Doe	10/05/1987 00:00	10/10/2021 00:00	5{noformat}
> While the expected output should be:
> {noformat}
> John Doe	10/05/1990 00:00	10/01/2022 00:00	1
> John Doe	10/05/1990 00:00	10/12/2021 00:00	2
> John Doe	10/05/1990 00:00	10/11/2021 00:00	3
> John Doe	10/05/1990 00:00	10/10/2021 00:00	4
> John Doe	10/05/1990 00:00	10/09/2021 00:00	5
> John Doe	10/05/1987 00:00	10/01/2022 00:00	1
> John Doe	10/05/1987 00:00	10/12/2021 00:00	2
> John Doe	10/05/1987 00:00	10/11/2021 00:00	3
> John Doe	10/05/1987 00:00	10/10/2021 00:00	4
> John Doe	10/05/1987 00:00	NULL 	                5{noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)