Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/12/20 12:26:00 UTC
[jira] [Work logged] (HIVE-26054) Distinct + Groupby with column alias is failing

     [ https://issues.apache.org/jira/browse/HIVE-26054?focusedWorklogId=834767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-834767 ]

ASF GitHub Bot logged work on HIVE-26054:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Dec/22 12:25
            Start Date: 20/Dec/22 12:25
    Worklog Time Spent: 10m 
      Work Description: kasakrisz opened a new pull request, #3155:
URL: https://github.com/apache/hive/pull/3155

   
   ### What changes were proposed in this pull request?
   Partially restore the way the Project operator's output row resolver (RR) was created before [HIVE-16924](https://issues.apache.org/jira/browse/HIVE-16924):
   when creating the output row resolver schema of the Project operator of a `select distinct` query, append the column information from the input row resolver.
   
   ### Why are the changes needed?
   If a column reference in the `select distinct` clause of the query has an alias defined, only the alias can be used to reference that column in the Project output RR. However, common database engines accept both the column name and the alias in the `order by` clause.
   In the case of a `select distinct` query, the input of the Project is an Aggregate whose output RR contains only the projected expressions, so it is safe to use these in the `order by` clause too.
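
To make the row resolver change easier to picture, here is a minimal toy model in Python. It is only a sketch: `ToyRowResolver`, `project_output_rr` and the `append_input` flag are hypothetical names invented for illustration, not Hive's `RowResolver` API, and the model assumes that the Aggregate output positions line up with the projection order.

```python
# Toy model only: hypothetical classes, NOT Hive's actual RowResolver API.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyRowResolver:
    # Maps a name usable by later clauses (e.g. order by) to an output position.
    columns: Dict[str, int] = field(default_factory=dict)

    def put(self, name: str, position: int) -> None:
        # Keep the first mapping registered for a name (names are case-insensitive).
        self.columns.setdefault(name.lower(), position)

    def resolve(self, name: str) -> Optional[int]:
        return self.columns.get(name.lower())


def project_output_rr(
    select_items: List[Tuple[str, Optional[str]]],
    input_rr: ToyRowResolver,
    append_input: bool = False,
) -> ToyRowResolver:
    """Build the Project output resolver from (column, alias) pairs.

    With append_input=True the names known to the input resolver are
    appended as well, so both the alias and the column name resolve.
    """
    out = ToyRowResolver()
    for position, (column, alias) in enumerate(select_items):
        out.put(alias or column, position)
    if append_input:
        # Assumption: for select distinct, the input (Aggregate) resolver
        # holds only the projected expressions, in projection order.
        for name, position in input_rr.columns.items():
            out.put(name, position)
    return out
```

With an Aggregate resolver that knows only `col1` and the select list `col1 as alias_col1`, the resolver built without `append_input` resolves `alias_col1` but not `col1`, while the one built with `append_input=True` resolves both names to the same position.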
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Both the column name and the alias can be used in the `order by` clause of `select distinct` queries.
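
As a rough illustration of the behaviour other engines already offer, the sketch below runs the query from the issue against an in-memory SQLite database using Python's standard `sqlite3` module. SQLite is only a stand-in here and does not exercise this patch or Hive itself. The sample rows, the `run_distinct_queries` helper and the use of `text` in place of Hive's `string` type are assumptions made for the example.

```python
import sqlite3


def run_distinct_queries():
    """Run the issue's query twice: ordered by column name, then by alias."""
    conn = sqlite3.connect(":memory:")
    # Schema follows the issue description; `text` replaces Hive's `string`.
    conn.executescript(
        """
        create table table1 (col1 bigint, col2 text);
        create table table2 (t2_col1 text);
        insert into table1 values (3, 'b'), (1, 'b'), (3, 'b'), (2, 'a');
        insert into table2 values ('a'), ('b');
        """
    )
    query = """
        select distinct col1 as alias_col1
        from table1
        where col2 = (select max(t2_col1) as currentdate from table2 limit 1)
        order by {key}
    """
    by_name = [row[0] for row in conn.execute(query.format(key="col1"))]
    by_alias = [row[0] for row in conn.execute(query.format(key="alias_col1"))]
    conn.close()
    return by_name, by_alias


if __name__ == "__main__":
    print(run_distinct_queries())
```

Both orderings return the same rows in SQLite, which is the behaviour this PR aims to allow for `select distinct` queries in Hive.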
   
   ### How was this patch tested?
   ```
   mvn test -DskipSparkTests -Dtest=TestMiniLlapLocalCliDriver -Dqfile=distinct_col.q -pl itests/qtest -Dmaven.surefire.debug -Pitests
   ```




Issue Time Tracking
-------------------

    Worklog Id:     (was: 834767)
    Time Spent: 40m  (was: 0.5h)

> Distinct + Groupby with column alias is failing
> -----------------------------------------------
>
>                 Key: HIVE-26054
>                 URL: https://issues.apache.org/jira/browse/HIVE-26054
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> After [HIVE-16924|https://issues.apache.org/jira/browse/HIVE-16924], the query below fails.
> {code:java}
> create table table1 (col1 bigint, col2 string);
> create table table2 (t2_col1 string);
> Select distinct col1 as alias_col1
> from table1
> where col2 = (SELECT max(t2_col1) as currentdate from table2 limit 1)
> order by col1;
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Unsupported SubQuery Expression '1': Only SubQuery expressions that are top level conjuncts are allowed (state=42000,code=40000)
> {code}
> The workaround is to either remove the distinct column alias "alias_col1" or use the alias in the order by clause.


