Posted to dev@hive.apache.org by "Travis Powell (JIRA)" <ji...@apache.org> on 2011/06/24 18:55:47 UTC

[jira] [Created] (HIVE-2238) Support for Median and Mode UDAFs

Support for Median and Mode UDAFs
---------------------------------

                 Key: HIVE-2238
                 URL: https://issues.apache.org/jira/browse/HIVE-2238
             Project: Hive
          Issue Type: New Feature
          Components: UDF
            Reporter: Travis Powell


Median and Mode are essential functions for reducing and refining a data set, and would allow greater control over which data is selected. More involved analytics are probably best handled by relational databases or OLAP cubes, but Median and Mode are very practical in Hive purely as a way of delivering a smaller data set in which the selected items all have a certain mode. (For example: rows describing an object to which the table is joined, where a column value of that object meets a frequency threshold.)

Comments are more than welcome. I would be happy to support this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2238) Support for Median and Mode UDAFs

Posted by "Ashutosh Chauhan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134718#comment-13134718 ] 

Ashutosh Chauhan commented on HIVE-2238:
----------------------------------------

Travis,
Median and mode will be pretty useful. Are you planning to contribute them?
                
