Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2010/02/03 20:57:29 UTC

[jira] Created: (HIVE-1128) Let max/min handle complex types like struct

Let max/min handle complex types like struct
--------------------------------------------

                 Key: HIVE-1128
                 URL: https://issues.apache.org/jira/browse/HIVE-1128
             Project: Hadoop Hive
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Zheng Shao


Many users want "arg_min" and "arg_max": for the row where one column reaches its minimum or maximum value, return the values of other columns from that same row.

Once max/min can handle structs, usage could look like this:

{code}
SELECT department, max(struct(salary, employee_name))
FROM compensations
GROUP BY department;
{code}
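
To illustrate why a max over a struct behaves like arg_max, here is a minimal Python sketch. It assumes structs compare field by field from left to right, the way Python tuples do. The sample rows and the helper name arg_max_by_department are made up for illustration; this is not Hive's implementation.

```python
# Hypothetical rows of (department, salary, employee_name).
compensations = [
    ("eng", 120, "alice"),
    ("eng", 150, "bob"),
    ("ops", 90, "carol"),
    ("ops", 90, "dave"),
]


def arg_max_by_department(rows):
    """Return {department: (salary, employee_name)} for the top-paid employee."""
    best = {}
    for department, salary, name in rows:
        # The tuple plays the role of struct(salary, employee_name).
        candidate = (salary, name)
        # Tuples compare lexicographically: salary first, then name on ties.
        if department not in best or candidate > best[department]:
            best[department] = candidate
    return best


result = arg_max_by_department(compensations)
```

Because salary is the first field, it decides the comparison; employee_name only matters when two salaries are equal, in which case the larger name wins. The tie-break is deterministic but otherwise arbitrary.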


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1128:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
           Status: Resolved  (was: Patch Available)

Committed. Thanks Zheng!


[jira] Updated: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1128:
-----------------------------

    Attachment:     (was: HIVE-1128.1.patch)


[jira] Commented: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829338#action_12829338 ] 

Todd Lipcon commented on HIVE-1128:
-----------------------------------

Yep, my suggestion is definitely a separate feature, orthogonal to this JIRA.


[jira] Updated: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1128:
-----------------------------

    Attachment: HIVE-1128.1.sh
                HIVE-1128.1.patch

This patch adds new max/min functions which are capable of dealing with complex types. It also moves the old UDAFMax/UDAFMin into contrib so that people can take these as examples.

Please run "HIVE-1128.1.sh" before applying the patch HIVE-1128.1.patch, so that the svn log history of the moved files is preserved.
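
As a rough illustration of a max aggregate that works on any comparable value, including struct-like tuples, here is a small Python sketch. The class MaxAggregator and its method names are hypothetical, loosely modeled on an iterate/merge/terminate aggregation lifecycle. It is not the code in the patch, and it assumes NULL inputs are skipped.

```python
class MaxAggregator:
    """Running maximum over comparable values, ignoring None (SQL NULL)."""

    def __init__(self):
        self.current = None

    def iterate(self, value):
        # Fold one input value into the running maximum.
        if value is not None and (self.current is None or value > self.current):
            self.current = value

    def merge(self, other):
        # Combine a partial result produced by another aggregator.
        self.iterate(other.current)

    def terminate(self):
        # Final result; None if no non-NULL value was seen.
        return self.current


# Two partial aggregations, as if computed on separate workers.
left, right = MaxAggregator(), MaxAggregator()
for value in [(100, "alice"), None, (150, "bob")]:
    left.iterate(value)
for value in [(150, "carol"), (90, "dave")]:
    right.iterate(value)

left.merge(right)
overall = left.terminate()
```

The same comparison drives both iterate and merge, so partial results can be combined in any order and still give the same maximum.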



[jira] Commented: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829421#action_12829421 ] 

Paul Yang commented on HIVE-1128:
---------------------------------

Looks good. +1


[jira] Commented: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829329#action_12829329 ] 

Zheng Shao commented on HIVE-1128:
----------------------------------

Todd, that's a good suggestion. I will open a follow-up for argmin and argmax.



[jira] Commented: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829311#action_12829311 ] 

Todd Lipcon commented on HIVE-1128:
-----------------------------------

This is clever, but I'd be surprised if a lot of non-programmer users would come up with this on their own. Would it be helpful to also provide argmin and argmax functions? The statistics community would probably appreciate the syntactic sugar.


[jira] Updated: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1128:
-----------------------------

    Status: Patch Available  (was: Open)


[jira] Assigned: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao reassigned HIVE-1128:
--------------------------------

    Assignee: Zheng Shao


[jira] Commented: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829448#action_12829448 ] 

Ning Zhang commented on HIVE-1128:
----------------------------------

Will commit after testing.


[jira] Updated: (HIVE-1128) Let max/min handle complex types like struct

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1128:
-----------------------------

    Attachment: HIVE-1128.2.patch

Fixed a typo in the test case.
