Posted to dev@hive.apache.org by "Ted Xu (JIRA)" <ji...@apache.org> on 2011/08/29 07:36:37 UTC
[jira] [Created] (HIVE-2416) Multiple distinct function to support
hive.groupby.skewindata optimization
Multiple distinct function to support hive.groupby.skewindata optimization
--------------------------------------------------------------------------
Key: HIVE-2416
URL: https://issues.apache.org/jira/browse/HIVE-2416
Project: Hive
Issue Type: Improvement
Reporter: Ted Xu
Assignee: Ted Xu
Currently, when a query uses multiple DISTINCT functions, the hive.groupby.skewindata optimization parameter must be set to false; otherwise an exception is raised:
{code}
Error in semantic analysis: DISTINCT on different columns not supported with skew in data
{code}
The skew group-by optimization should support multiple DISTINCT functions.
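For illustration, a minimal query of the kind that triggers this error might look like the following. The table page_views and its columns are hypothetical and do not come from the patch or its tests; only the hive.groupby.skewindata setting and the error message are taken from this issue.
{code}
-- Hypothetical table and column names, for illustration only.
SET hive.groupby.skewindata=true;

-- Two DISTINCT aggregates over different columns in one GROUP BY.
SELECT country,
       COUNT(DISTINCT user_id),
       COUNT(DISTINCT session_id)
FROM page_views
GROUP BY country;
{code}
Until this is supported, the workaround described above is to set hive.groupby.skewindata=false before running such a query, which gives up the skew optimization.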
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2416) Multiple distinct function to support
hive.groupby.skewindata optimization
Posted by "Ted Xu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Xu updated HIVE-2416:
-------------------------
Attachment: multi_distinct_skew.patch
[jira] [Updated] (HIVE-2416) Multiple distinct function to support
hive.groupby.skewindata optimization
Posted by "Namit Jain (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2416:
-----------------------------
Status: Open (was: Patch Available)
[jira] [Commented] (HIVE-2416) Multiple distinct function to
support hive.groupby.skewindata optimization
Posted by "Namit Jain (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134447#comment-13134447 ]
Namit Jain commented on HIVE-2416:
----------------------------------
Can you update the patch?
[jira] [Commented] (HIVE-2416) Multiple distinct function to
support hive.groupby.skewindata optimization
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092703#comment-13092703 ]
Amareshwari Sriramadasu commented on HIVE-2416:
-----------------------------------------------
Will look into it. Can you create a Review Board entry for the patch?
[jira] [Commented] (HIVE-2416) Multiple distinct function to
support hive.groupby.skewindata optimization
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099875#comment-13099875 ]
Amareshwari Sriramadasu commented on HIVE-2416:
-----------------------------------------------
Can you update the Bugs field in the Review Board entry so that its updates appear here? Thanks.
[jira] [Commented] (HIVE-2416) Multiple distinct function to
support hive.groupby.skewindata optimization
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099914#comment-13099914 ]
jiraposter@reviews.apache.org commented on HIVE-2416:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1741/
-----------------------------------------------------------
(Updated 2011-09-08 04:52:29.083703)
Review request for hive and Amareshwari Sriramadasu.
Changes
-------
Update the Bugs field to link the JIRA issue.
Summary
-------
Currently, when a query uses multiple DISTINCT functions, the hive.groupby.skewindata optimization parameter must be set to false; otherwise an exception is raised:
Error in semantic analysis: DISTINCT on different columns not supported with skew in data
The skew group-by optimization should support multiple DISTINCT functions.
This addresses bug HIVE-2416.
https://issues.apache.org/jira/browse/HIVE-2416
Diffs
-----
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/groupby2_map_skew_multi_distinct.q 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/groupby3_multi_distinct.q 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/groupby2_map_skew_multi_distinct.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/groupby2_map_skew_multi_distinct.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/groupby3_map_skew_multi_distinct.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/groupby3_multi_distinct.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby2.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby2_map_skew.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby2_map_skew_multi_distinct.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby3.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby3_map_skew.q.out 1162620
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/nullgroup4.q.out 1162620
Diff: https://reviews.apache.org/r/1741/diff
Testing
-------
All unit tests passed.
Thanks,
Ted
[jira] [Updated] (HIVE-2416) Multiple distinct function to support
hive.groupby.skewindata optimization
Posted by "Ted Xu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Xu updated HIVE-2416:
-------------------------
Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-2416) Multiple distinct function to
support hive.groupby.skewindata optimization
Posted by "Ted Xu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099725#comment-13099725 ]
Ted Xu commented on HIVE-2416:
------------------------------
Thank you, Amareshwari, and sorry for the late reply.
The Review Board entry has been created at https://reviews.apache.org/r/1741/