Posted to dev@hive.apache.org by "Shengsheng Huang (JIRA)" <ji...@apache.org> on 2012/06/26 06:15:42 UTC
[jira] [Created] (HIVE-3199) support select distinct *
Shengsheng Huang created HIVE-3199:
--------------------------------------
Summary: support select distinct *
Key: HIVE-3199
URL: https://issues.apache.org/jira/browse/HIVE-3199
Project: Hive
Issue Type: New Feature
Components: Query Processor
Affects Versions: 0.9.0
Reporter: Shengsheng Huang
The query "select distinct * from t" fails with an error.
This is a common feature and should be supported.
I did some investigation into this issue. In the current implementation, "select distinct a,b,c from t" is translated to "select a,b,c from t group by a,b,c". So "select distinct *" is translated literally to "select * from t group by *", but * is not handled properly when the group-by expressions are processed.
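The translation described above can be sketched on plain query strings. This is only an illustration, assuming a hypothetical helper named distinct_to_group_by; Hive performs the rewrite on its AST, not on strings.

```python
def distinct_to_group_by(columns, table):
    """Rewrite 'select distinct <columns> from <table>' as a GROUP BY query.

    Hypothetical helper for illustration only; it is not part of Hive.
    """
    col_list = ",".join(columns)
    # The same select list is reused verbatim as the group-by key list.
    return f"select {col_list} from {table} group by {col_list}"


# Named columns produce a valid group-by query.
print(distinct_to_group_by(["a", "b", "c"], "t"))

# A wildcard is copied literally into the group-by clause,
# which is the form the group-by processing cannot handle.
print(distinct_to_group_by(["*"], "t"))
```

The second call shows why the wildcard has to be resolved to column names before the group-by keys are built.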
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3199) support select distinct *
Posted by "Shengsheng Huang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shengsheng Huang updated HIVE-3199:
-----------------------------------
Attachment: HIVE-3199.for0.9.0.patch
Uploaded a patch that enables basic "select distinct *". Patterns such as "select distinct * from t", "select distinct t.* from t", and "select distinct * from a join b" are all supported. The patch does not support subqueries: "select distinct *" cannot appear inside a subquery, and the "from" clause cannot be a subquery.
[jira] [Commented] (HIVE-3199) support select distinct *
Posted by "Shengsheng Huang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403761#comment-13403761 ]
Shengsheng Huang commented on HIVE-3199:
----------------------------------------
This patch handles "select distinct *" at compile time. It adds an extra AST transformation stage that replaces * with named columns, because GroupBy does not accept a wildcard. Another option is to handle the wildcard at runtime. Either way, I think adding an extra AST transformation stage for potential optimizations or for enabling features makes sense. We could support NATURAL JOIN in a similar way. @Namit @JQ What do you think?
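A minimal sketch of the compile-time idea, assuming the column names of each table in the "from" clause are already known. The names expand_wildcards and rewrite_select_distinct and the schemas dictionary are hypothetical; the actual patch works on Hive's AST and is not shown in this thread.

```python
def expand_wildcards(select_items, schemas):
    """Replace '*' and 'alias.*' with explicit, qualified column names.

    schemas maps each table alias to its ordered column list (assumed input);
    aliases are visited in insertion order, i.e. the order of the from clause.
    """
    expanded = []
    for item in select_items:
        if item == "*":
            # Unqualified wildcard: every column of every table, in order.
            for alias, cols in schemas.items():
                expanded.extend(f"{alias}.{col}" for col in cols)
        elif item.endswith(".*"):
            # Qualified wildcard: only the columns of the named table.
            alias = item[:-2]
            expanded.extend(f"{alias}.{col}" for col in schemas[alias])
        else:
            expanded.append(item)
    return expanded


def rewrite_select_distinct(select_items, from_clause, schemas):
    """Expand wildcards first, then apply the distinct-to-group-by rewrite."""
    cols = ",".join(expand_wildcards(select_items, schemas))
    return f"select {cols} from {from_clause} group by {cols}"


print(rewrite_select_distinct(["t.*"], "t", {"t": ["a", "b", "c"]}))
print(rewrite_select_distinct(["*"], "a join b",
                              {"a": ["id", "x"], "b": ["id", "y"]}))
```

Because the expansion runs before the group-by keys are built, the later stages only ever see named columns. The sketch omits join conditions and, like the patch, does not cover subqueries.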