Posted to dev@hive.apache.org by "Ted Xu (JIRA)" <ji...@apache.org> on 2011/08/29 07:36:37 UTC

[jira] [Created] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Multiple distinct function to support hive.groupby.skewindata optimization
--------------------------------------------------------------------------

                 Key: HIVE-2416
                 URL: https://issues.apache.org/jira/browse/HIVE-2416
             Project: Hive
          Issue Type: Improvement
            Reporter: Ted Xu
            Assignee: Ted Xu


Currently, when multiple DISTINCT functions are used in a query, the hive.groupby.skewindata optimization parameter must be set to false; otherwise an exception is raised:
{code}
Error in semantic analysis: DISTINCT on different columns not supported with skew in data
{code}
The skew group-by optimization should support multiple DISTINCT functions.
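
For readers unfamiliar with the idea, here is a minimal Python sketch of how a two-stage, skew-tolerant aggregation can compute several COUNT(DISTINCT ...) values in one pass. It is a simplified model written for illustration only and is not the code in the attached patch. The function name, the row layout (key, a, b), and the num_reducers parameter are all assumptions made for this example.

```python
from collections import defaultdict


def skewed_multi_distinct(rows, num_reducers=4):
    """Model COUNT(DISTINCT a), COUNT(DISTINCT b) ... GROUP BY key in two stages.

    rows is an iterable of (key, a, b) tuples (a hypothetical layout).
    Returns {key: (distinct_a_count, distinct_b_count)}.
    """
    # Stage 1, map side: emit one record per DISTINCT column, tagged with the
    # column index. Partitioning on the whole (key, tag, value) triple spreads
    # a hot key over many reducers, yet sends equal triples to the same one.
    buckets = [set() for _ in range(num_reducers)]
    for key, a, b in rows:
        for tag, value in enumerate((a, b)):
            record = (key, tag, value)
            buckets[hash(record) % num_reducers].add(record)

    # Stage 1, reduce side: each reducer holds de-duplicated triples, so it can
    # emit a partial distinct count per (key, tag).
    partials = []
    for bucket in buckets:
        counts = defaultdict(int)
        for key, tag, _value in bucket:
            counts[(key, tag)] += 1
        partials.extend(counts.items())

    # Stage 2: partition by the group-by key only and sum the partial counts.
    # The sums are exact because no triple appears in more than one bucket.
    result = defaultdict(lambda: [0, 0])
    for (key, tag), n in partials:
        result[key][tag] += n
    return {key: tuple(counts) for key, counts in result.items()}


rows = [("hot", "x", 1), ("hot", "x", 2), ("hot", "y", 1), ("cold", "z", 3)]
counts = skewed_multi_distinct(rows)
print(counts["hot"])  # (2, 2)
```

The tag is what lets more than one DISTINCT column share the same shuffle: without it, values from column a and column b would be de-duplicated against each other.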

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Ted Xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Xu updated HIVE-2416:
-------------------------

    Attachment: multi_distinct_skew.patch


        

[jira] [Updated] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Namit Jain (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-2416:
-----------------------------

    Status: Open  (was: Patch Available)
    

        

[jira] [Commented] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Namit Jain (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134447#comment-13134447 ] 

Namit Jain commented on HIVE-2416:
----------------------------------

Can you update the patch?
                

        

[jira] [Commented] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092703#comment-13092703 ] 

Amareshwari Sriramadasu commented on HIVE-2416:
-----------------------------------------------

I will look into it. Can you create a Review Board entry for the patch?


        

[jira] [Commented] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099875#comment-13099875 ] 

Amareshwari Sriramadasu commented on HIVE-2416:
-----------------------------------------------

Can you update the Bugs field in the Review Board entry so that the updates will be seen here? Thanks.


        

[jira] [Commented] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099914#comment-13099914 ] 

jiraposter@reviews.apache.org commented on HIVE-2416:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1741/
-----------------------------------------------------------

(Updated 2011-09-08 04:52:29.083703)


Review request for hive and Amareshwari Sriramadasu.


Changes
-------

Updated the Bugs field to link to the JIRA issue.


Summary
-------

Currently, when multiple DISTINCT functions are used in a query, the hive.groupby.skewindata optimization parameter must be set to false; otherwise an exception is raised:

Error in semantic analysis: DISTINCT on different columns not supported with skew in data

The skew group-by optimization should support multiple DISTINCT functions.


This addresses bug HIVE-2416.
    https://issues.apache.org/jira/browse/HIVE-2416


Diffs
-----

  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/groupby2_map_skew_multi_distinct.q 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/groupby3_multi_distinct.q 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/groupby2_map_skew_multi_distinct.q PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/groupby2_map_skew_multi_distinct.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/groupby3_map_skew_multi_distinct.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/groupby3_multi_distinct.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby2.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby2_map_skew.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby2_map_skew_multi_distinct.q.out PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby3.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/groupby3_map_skew.q.out 1162620 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/nullgroup4.q.out 1162620 

Diff: https://reviews.apache.org/r/1741/diff


Testing
-------

All unit tests passed.


Thanks,

Ted




        

[jira] [Updated] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Ted Xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Xu updated HIVE-2416:
-------------------------

    Status: Patch Available  (was: Open)


        

[jira] [Commented] (HIVE-2416) Multiple distinct function to support hive.groupby.skewindata optimization

Posted by "Ted Xu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099725#comment-13099725 ] 

Ted Xu commented on HIVE-2416:
------------------------------

Thank you, Amareshwari, and sorry for the late reply.
The Review Board entry has been created at https://reviews.apache.org/r/1741/
