Posted to dev@hive.apache.org by "Mayank Lahiri (JIRA)" <ji...@apache.org> on 2010/08/13 01:40:17 UTC

[jira] Commented: (HIVE-1529) Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.

    [ https://issues.apache.org/jira/browse/HIVE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898003#action_12898003 ] 

Mayank Lahiri commented on HIVE-1529:
-------------------------------------

Hi Pierre,

The numerical results appear to be accurate. A couple of comments about the code:

(1) Run "ant checkstyle" and look at the formatting errors reported for your file in build/checkstyle/checkstyle-errors.html. In particular, remove commented-out lines such as line #160 of GenericUDAFCovariance.java, newline-elses such as line #214, and the unnecessary line wraps at lines #210-211.

(2) Is there any reason for accepting string arguments in the Resolver class? If the user has a numeric value stored as a string, they can simply use CAST(val AS DOUBLE) in the query. As it stands right now, passing a junk string as one of the input expressions produces a return value of NULL and a silent exception that is only visible in the log file. It might be better to simply not accept STRING types in the resolver, as in GenericUDAFHistogramNumeric.java. This would also mean that you don't have to test for a NumberFormatException in the iterate() method (line #263 of GenericUDAFCovariance.java).

(3) Please add at least a little extended function info at line #59; see GenericUDAFHistogramNumeric.java or GenericUDAFnGrams.java for an example.
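
To make (2) concrete, here is a rough sketch of the kind of check I have in mind. It deliberately does not use the real Hive classes: ArgType, CovarianceResolverSketch and checkArguments are made-up names for illustration only, and the actual resolver would work on Hive's own type descriptors and exception types. The point is just that a STRING argument is rejected up front, so the user sees an error instead of a silent NULL.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the primitive type categories a resolver sees.
enum ArgType { BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING, BOOLEAN }

final class CovarianceResolverSketch {
    // Only numeric types are accepted; STRING is intentionally left out.
    private static final Set<ArgType> NUMERIC =
        EnumSet.of(ArgType.BYTE, ArgType.SHORT, ArgType.INT,
                   ArgType.LONG, ArgType.FLOAT, ArgType.DOUBLE);

    // Throws as soon as an argument list is unusable, rather than letting
    // bad input surface later as a NULL result.
    static void checkArguments(ArgType... args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "Exactly two arguments are expected, got " + args.length);
        }
        for (int i = 0; i < args.length; i++) {
            if (!NUMERIC.contains(args[i])) {
                throw new IllegalArgumentException(
                    "Argument " + (i + 1) + " must be numeric, but "
                    + args[i] + " was passed");
            }
        }
    }

    public static void main(String[] argv) {
        checkArguments(ArgType.DOUBLE, ArgType.INT); // accepted
        try {
            checkArguments(ArgType.DOUBLE, ArgType.STRING);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this in place, iterate() only ever sees numeric inputs, so the NumberFormatException handling becomes unnecessary.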

> Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1529
>                 URL: https://issues.apache.org/jira/browse/HIVE-1529
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1529.1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Create new built-in aggregate functions covar_pop and covar_samp, functions commonly used in statistical data analyses.
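
For reference, the two functions differ only in the divisor: covar_pop divides the sum of products of deviations from the means by n, while covar_samp divides it by n - 1. The sketch below is one possible single-pass way to compute both; it is not taken from the attached patch, CovarianceSketch and its methods are hypothetical names, and NaN stands in for the SQL NULL that would be returned when there are too few rows.

```java
final class CovarianceSketch {
    private long n;            // number of (x, y) pairs seen so far
    private double meanX;      // running mean of x
    private double meanY;      // running mean of y
    private double coMoment;   // running sum of (x - meanX) * (y - meanY)

    // Single-pass update: no second scan over the data is needed.
    void add(double x, double y) {
        n++;
        double dx = x - meanX;        // deviation from the previous mean of x
        meanX += dx / n;
        meanY += (y - meanY) / n;
        coMoment += dx * (y - meanY); // uses the already updated mean of y
    }

    // Population covariance: divide by n.
    double covarPop() {
        return n == 0 ? Double.NaN : coMoment / n;
    }

    // Sample covariance: divide by n - 1, so at least two pairs are needed.
    double covarSamp() {
        return n < 2 ? Double.NaN : coMoment / (n - 1);
    }

    public static void main(String[] argv) {
        CovarianceSketch c = new CovarianceSketch();
        double[] xs = {1, 2, 3, 4};
        double[] ys = {2, 4, 6, 8};
        for (int i = 0; i < xs.length; i++) {
            c.add(xs[i], ys[i]);
        }
        System.out.println("covar_pop  = " + c.covarPop());
        System.out.println("covar_samp = " + c.covarSamp());
    }
}
```

For the four pairs in main, the sum of products of deviations is 10, so the population value is 10/4 and the sample value is 10/3.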
