Posted to issues@hive.apache.org by "Suprith Chandrashekharachar (Jira)" <ji...@apache.org> on 2021/03/31 00:12:00 UTC
[jira] [Commented] (HIVE-24915) Distribute by with sort by clause when used with constant parameter for sort produces wrong result.
[ https://issues.apache.org/jira/browse/HIVE-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311933#comment-17311933 ]
Suprith Chandrashekharachar commented on HIVE-24915:
----------------------------------------------------
[~kgyrtkirk] Could you please take a look at this, or assign it to someone who is familiar with the part of the code base being changed?
> Distribute by with sort by clause when used with constant parameter for sort produces wrong result.
> ---------------------------------------------------------------------------------------------------
>
> Key: HIVE-24915
> URL: https://issues.apache.org/jira/browse/HIVE-24915
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.4
> Reporter: Suprith Chandrashekharachar
> Assignee: Suprith Chandrashekharachar
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> A query that combines DISTRIBUTE BY with SORT BY produces wrong results when a constant column is used as a sort key.
> Example:
> {code:sql}
> SELECT
>   t.time,
>   'a' AS const
> FROM
>   (SELECT 1591819264 AS time
>    UNION ALL
>    SELECT 1591819265 AS time) t
> DISTRIBUTE BY const
> SORT BY const, t.time
> {code}
> Produces:
>
> |*time*|*const*|
> |NULL|a|
> |NULL|a|
>
> Instead, it should produce the following (as Hive 0.13 does):
>
> |*time*|*const*|
> |1591819264|a|
> |1591819265|a|
>
> The wrong sort columns are used when the ReduceSink operator is created here: [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L9066]
> When the constant propagation optimizer is enabled, this leads to incorrect constant folding, which in turn produces incorrect results.
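>
> As a diagnostic sketch only (an assumption, not something verified in this report): turning the optimizer off for the session with the {{hive.optimize.constant.propagation}} setting and re-running the first example may help show whether constant folding contributes to the NULL values. {{EXPLAIN}} prints the query plan, where the key expressions of the ReduceSink (shown as "Reduce Output Operator") can be inspected. Because the sort columns themselves are reported as incorrect, the result may stay wrong even with the optimizer disabled.
> {code:sql}
> -- Assumption: session-level toggle used for diagnosis, not a confirmed workaround.
> SET hive.optimize.constant.propagation=false;
>
> -- Print the plan and check the sort keys of the Reduce Output Operator.
> EXPLAIN
> SELECT
>   t.time,
>   'a' AS const
> FROM
>   (SELECT 1591819264 AS time
>    UNION ALL
>    SELECT 1591819265 AS time) t
> DISTRIBUTE BY const
> SORT BY const, t.time;
> {code}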
>
> Another example of the incorrect behavior:
> {code:sql}
> SELECT
>   t.time,
>   'a' AS const,
>   t.id
> FROM
>   (SELECT 1591819264 AS time, 1 AS id
>    UNION ALL
>    SELECT 1591819265 AS time, 2 AS id) t
> DISTRIBUTE BY t.time
> SORT BY t.time, const, t.id
> {code}
> Produces:
>
> |*time*|*const*|*id*|
> |1591819264|a|NULL|
> |1591819265|a|NULL|
>