Posted to dev@hive.apache.org by "Pierre Huyn (JIRA)" <ji...@apache.org> on 2010/08/11 21:35:23 UTC

[jira] Created: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Add covariance aggregate function covar_pop and covar_samp
----------------------------------------------------------

                 Key: HIVE-1529
                 URL: https://issues.apache.org/jira/browse/HIVE-1529
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
    Affects Versions: 0.6.0
            Reporter: Pierre Huyn
             Fix For: 0.6.0


Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898022#action_12898022 ] 

Pierre Huyn commented on HIVE-1529:
-----------------------------------

Hi Mayank,

Thanks for reviewing. Please bear with me, as this is my first time. I am looking at the checkstyle-errors.html file but I cannot find the problems you reported. The only thing I found is "File contains tab characters (this is the first instance)." on line 177.

Are there other log files I need to look at to find style errors? Are tab characters not allowed?

Regards
--- Pierre



> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Updated: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Attachment: HIVE-1529.2.patch

Implemented all feedback from reviewer.

> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch, HIVE-1529.2.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Updated: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1529:
-----------------------------

    Fix Version/s: 0.7.0
                       (was: 0.6.0)

> Add covariance aggregate function covar_pop and covar_samp
> ----------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Mayank Lahiri (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898364#action_12898364 ] 

Mayank Lahiri commented on HIVE-1529:
-------------------------------------

Happy to help! It gets a lot easier after the first couple of UD(A)Fs...

For the code conventions, Hive uses the Sun Java code conventions: http://www.oracle.com/technetwork/java/codeconvtoc-136057.html (the example usage section is probably the most helpful, and I believe checkstyle does not check all of the conventions).



[jira] Updated: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------


Work is currently under way. Actually, the code is done and tested. I am going down the checklist, moving toward patch submission. Since this is my first assignment, I am not familiar with the process, and it may take a little longer to get to the submission point.

> Add covariance aggregate function covar_pop and covar_samp
> ----------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Pierre Huyn
>             Fix For: 0.6.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Updated: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Status: In Progress  (was: Patch Available)

Updated patch ready for review



[jira] Updated: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Attachment: HIVE-1529.1.patch

This is the first release of two covariance generic UDAFs: population covariance covar_pop(x,y) and sample covariance covar_samp(x,y).

I am requesting a code review.
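
For readers following the thread, here is a minimal sketch of how a two-argument covariance aggregate can be computed in one pass with partial states that can be merged, which is the general shape an aggregate function needs when partial results from different tasks are combined. It is an illustration in Python and is not the code in HIVE-1529.1.patch. The class and method names (CovarianceState, iterate, merge, covar_pop, covar_samp) are hypothetical, and the edge-case behavior (None for an empty input, None for a sample covariance over fewer than two rows) is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class CovarianceState:
    """Partial aggregation state: row count, running means, and co-moment."""
    count: int = 0
    mean_x: float = 0.0
    mean_y: float = 0.0
    comoment: float = 0.0  # sum of (x - mean_x) * (y - mean_y) over rows seen

    def iterate(self, x, y):
        """Fold one (x, y) row into the state; rows containing None are skipped."""
        if x is None or y is None:
            return
        self.count += 1
        dx = x - self.mean_x  # deviation from the mean of x before this row
        self.mean_x += dx / self.count
        self.mean_y += (y - self.mean_y) / self.count
        self.comoment += dx * (y - self.mean_y)  # uses the updated mean of y

    def merge(self, other):
        """Combine another partial state into this one."""
        if other.count == 0:
            return
        if self.count == 0:
            self.count, self.mean_x = other.count, other.mean_x
            self.mean_y, self.comoment = other.mean_y, other.comoment
            return
        total = self.count + other.count
        dx = other.mean_x - self.mean_x
        dy = other.mean_y - self.mean_y
        # Cross term accounts for the difference between the two partial means.
        self.comoment += other.comoment + dx * dy * self.count * other.count / total
        self.mean_x += dx * other.count / total
        self.mean_y += dy * other.count / total
        self.count = total

    def covar_pop(self):
        """Population covariance: co-moment / n, or None when there are no rows."""
        return self.comoment / self.count if self.count > 0 else None

    def covar_samp(self):
        """Sample covariance: co-moment / (n - 1), or None when n < 2."""
        return self.comoment / (self.count - 1) if self.count > 1 else None
```

The two functions differ only in the final divisor, so a single state type can serve both. Keeping running means rather than raw sums of products is one common way to limit cancellation error on large inputs; whether the patch takes the same approach is not stated in this thread.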

> Add covariance aggregate function covar_pop and covar_samp
> ----------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899546#action_12899546 ] 

Pierre Huyn commented on HIVE-1529:
-----------------------------------

Hi John,

That's cool. When and where will the next one be? I am in Mountain View/Palo Alto.
--- Pierre





[jira] Updated: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Affects Version/s: 0.7.0
                           (was: 0.6.0)

> Add covariance aggregate function covar_pop and covar_samp
> ----------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Updated: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

          Status: Patch Available  (was: In Progress)
    Release Note: New patch available for review.



[jira] Commented: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897502#action_12897502 ] 

Pierre Huyn commented on HIVE-1529:
-----------------------------------

Hi John,

Now that I have created the .q test file, how do I generate the corresponding .q.out file? I assume the latter is needed by "ant test". Thanks.

Regards
--- Pierre
 





[jira] Commented: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897510#action_12897510 ] 

John Sichi commented on HIVE-1529:
----------------------------------

Run ant test with -Doverwrite=true




[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Mayank Lahiri (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899107#action_12899107 ] 

Mayank Lahiri commented on HIVE-1529:
-------------------------------------

+1 Looks good, passes tests.

Note: there is a new data file to be added in data/files/covar_tab.txt 



[jira] Commented: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897557#action_12897557 ] 

Pierre Huyn commented on HIVE-1529:
-----------------------------------

Hi John,

Thanks for your help. Now that I am getting further, is there a recommended way to create and populate the table used in my .q test file? I looked at the other .q files and many of them use "src" without defining it. Also, I don't see where "src" is populated. Help!
Regards
--- Pierre





[jira] Issue Comment Edited: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899629#action_12899629 ] 

John Sichi edited comment on HIVE-1529 at 8/17/10 6:37 PM:
-----------------------------------------------------------

Next one will be at Cloudera, probably some time in September.  Join this meetup group to get notifications and RSVP:

http://www.meetup.com/Hive-Contributors-Group/


      was (Author: jvs):
    Next one will be at CloudEra, probably some time in September.  Join this meetup group to get notifications and RSVP:

http://www.meetup.com/Hive-Contributors-Group/

  


[jira] Assigned: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi reassigned HIVE-1529:
--------------------------------

    Assignee: Pierre Huyn



[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906081#action_12906081 ] 

John Sichi commented on HIVE-1529:
----------------------------------

BTW, Pierre, I found this wiki page:

http://wiki.apache.org/hadoop/Hive/PoweredBy

Feel free to add Intuit there and note that the company is contributing resources to improve Hive.




[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899629#action_12899629 ] 

John Sichi commented on HIVE-1529:
----------------------------------

Next one will be at CloudEra, probably some time in September.  Join this meetup group to get notifications and RSVP:

http://www.meetup.com/Hive-Contributors-Group/




[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Mayank Lahiri (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898003#action_12898003 ] 

Mayank Lahiri commented on HIVE-1529:
-------------------------------------

Hi Pierre,

The numerical results appear to be accurate. A couple of comments about the code:

(1) Run "ant checkstyle" and look at the formatting errors for your file in the build/checkstyle/checkstyle-errors.html file. In particular, remove commented-out lines like #160 of GenericUDAFCovariance.java, newline-elses like line #214, and unnecessary wraps like #210-211.

(2) Is there any reason for accepting string arguments in the Resolver class? If the user has a numeric value as a string, they can simply use CAST(val AS double) in the query. As it stands right now, passing junk strings as one of the input expressions causes a return value of NULL and a silent exception that is only visible in the log file. It might be better to simply not accept STRING types in the resolver, as in GenericUDAFHistogramNumeric.java. This would also mean that you don't have to test for a NumberFormatException in the iterate() method -- line #263 of GenericUDAFCovariance.java.

(3) Please add at least a little extended function info, line #59, see GenericUDAFHistogramNumeric.java or GenericUDAFnGrams.java for an example.
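
Point (2) can be illustrated with a small sketch: rejecting unsupported argument types once, when the function is resolved, rather than parsing strings row by row and swallowing the failures. This is plain Python and not Hive's resolver API. The names resolve_covariance_args and NUMERIC_TYPES, and the type-name strings themselves, are hypothetical stand-ins chosen for this example.

```python
# Hypothetical set of accepted type names; string types are deliberately absent.
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}


def resolve_covariance_args(arg_types):
    """Validate argument type names once, before any rows are processed.

    Raises TypeError for a wrong argument count or a non-numeric type, so a
    bad query fails up front instead of quietly producing NULL.
    """
    if len(arg_types) != 2:
        raise TypeError(f"exactly two arguments are expected, got {len(arg_types)}")
    for position, type_name in enumerate(arg_types):
        if type_name.lower() not in NUMERIC_TYPES:
            raise TypeError(
                f"argument {position} has type {type_name}; "
                "only numeric types are accepted (cast strings explicitly)"
            )
    return "double"  # result type of the aggregate in this sketch
```

With a check like this, a call such as resolve_covariance_args(["string", "double"]) raises immediately, and the per-row code no longer needs a parse-failure branch.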



[jira] Updated: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Summary: Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.  (was: Add covariance aggregate function covar_pop and covar_samp)



[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899113#action_12899113 ] 

John Sichi commented on HIVE-1529:
----------------------------------

Will commit when tests pass.




[jira] Updated: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Status: Patch Available  (was: Open)

This is the initial release of the covariance generic UDAFs, covar_pop and covar_samp.



[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899460#action_12899460 ] 

Pierre Huyn commented on HIVE-1529:
-----------------------------------

Hi John,

Cool! Thanks for committing. I have a question for you about open source contributions. I would like to contribute more going forward and would like my company (Intuit) to allocate more resources (essentially my time) to open source contributions. What do people commonly do? How would my company be recognized for giving back?



> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch, HIVE-1529.2.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Commented: (HIVE-1529) Add covariance aggregate function covar_pop and covar_samp

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897921#action_12897921 ] 

John Sichi commented on HIVE-1529:
----------------------------------

"src" is a test fixture (automatically created for use by all tests).

For an example of how to add a test-specific dataset, see

ql/src/test/queries/clientpositive/nullscript.q

svn add your new file under hive-trunk/data/files.


> Add covariance aggregate function covar_pop and covar_samp
> ----------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899542#action_12899542 ] 

John Sichi commented on HIVE-1529:
----------------------------------

Hey Pierre,

I don't know if we have one already, but we can certainly start a wiki page acknowledging the contributions from all of the companies and individuals who have helped with the development of Hive.

Also, if you're in the Bay Area, come on by one of our contributor meetups so we can connect in person.


> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch, HIVE-1529.2.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Updated: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1529:
-----------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Committed.  Thanks Pierre!


> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch, HIVE-1529.2.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.



[jira] Updated: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

Posted by "Pierre Huyn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1529:
------------------------------

    Tags: ANSI SQL covariance aggregation function  (was: covariance aggregation function)

> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.
