Posted to dev@mahout.apache.org by "Ted Dunning (JIRA)" <ji...@apache.org> on 2010/01/25 20:32:34 UTC

[jira] Commented: (MAHOUT-209) Add aggregate() methods for Vector

    [ https://issues.apache.org/jira/browse/MAHOUT-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804663#action_12804663 ] 

Ted Dunning commented on MAHOUT-209:
------------------------------------


These look nearly good enough to commit as they stand.  I am sure there will be one or two more patterns that we would like to add in this style, but these are enough to start with.  If we need specialized versions for speed, we can file additional JIRA issues (after 0.3).

 

> Add aggregate() methods for Vector
> ----------------------------------
>
>                 Key: MAHOUT-209
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-209
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>         Environment: all
>            Reporter: Jake Mannix
>            Assignee: Jake Mannix
>            Priority: Minor
>             Fix For: 0.3
>
>
> As discussed in MAHOUT-165 at some point, Vector (and Matrix, but let's put that on a separate ticket) could do with a nice exposure of methods like the following:
> {code}
>   // Folds combiner(this[i], other[i]) over every index using aggregator, starting from 0.
>   // This can get optimized, of course.
>   public double aggregate(Vector other, BinaryFunction aggregator, BinaryFunction combiner) {
>     double result = 0;
>     for (int i = 0; i < size(); i++) {
>       result = aggregator.apply(result, combiner.apply(getQuick(i), other.getQuick(i)));
>     }
>     return result;
>   }
> {code}
> This is good for generalized inner products and distances.  Also nice:
> {code}
>   // Folds map(this[i]) over every index using aggregator, starting from 0.
>   public double aggregate(BinaryFunction aggregator, UnaryFunction map) {
>     double result = 0;
>     for (int i = 0; i < size(); i++) {
>       result = aggregator.apply(result, map.apply(getQuick(i)));
>     }
>     return result;
>   }
> {code}
> This generalizes norms and statistics (mean, median, stdDev) and things like that (the number of positive or negative values, etc.).
> This kind of thing already exists in Colt, and we could just surface it at the top level.
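
As a rough illustration of the two aggregate() shapes quoted above, here is a minimal standalone sketch. It is not the patch attached to the issue: it assumes plain double[] arrays in place of Vector, and it uses java.util.function.DoubleBinaryOperator and DoubleUnaryOperator as stand-ins for BinaryFunction and UnaryFunction. The class name AggregateSketch is hypothetical.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

// Hypothetical stand-in for the proposed Vector methods; arrays replace Vector.
final class AggregateSketch {
  private AggregateSketch() {}

  // Two-vector form: fold combiner(a[i], b[i]) into a running result, starting from 0.
  static double aggregate(double[] a, double[] b,
                          DoubleBinaryOperator aggregator,
                          DoubleBinaryOperator combiner) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vectors must have the same size");
    }
    double result = 0;
    for (int i = 0; i < a.length; i++) {
      result = aggregator.applyAsDouble(result, combiner.applyAsDouble(a[i], b[i]));
    }
    return result;
  }

  // One-vector form: fold map(a[i]) into a running result, starting from 0.
  static double aggregate(double[] a,
                          DoubleBinaryOperator aggregator,
                          DoubleUnaryOperator map) {
    double result = 0;
    for (int i = 0; i < a.length; i++) {
      result = aggregator.applyAsDouble(result, map.applyAsDouble(a[i]));
    }
    return result;
  }

  public static void main(String[] args) {
    double[] x = {1, -2, 3};
    double[] y = {4, 5, -6};

    // Inner product: sum of element-wise products.
    double dot = aggregate(x, y, Double::sum, (p, q) -> p * q);
    // Squared Euclidean distance: sum of squared differences.
    double sqDist = aggregate(x, y, Double::sum, (p, q) -> (p - q) * (p - q));
    // L1 norm: sum of absolute values.
    double l1 = aggregate(x, Double::sum, Math::abs);
    // Count of positive entries: sum of 0/1 indicators.
    double positives = aggregate(x, Double::sum, v -> v > 0 ? 1.0 : 0.0);

    System.out.println(dot + " " + sqDist + " " + l1 + " " + positives);
  }
}
```

Because the running result starts at 0, as in the quoted code, this shape suits sum-like aggregators; something like a maximum over all-negative values would need a different starting value.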
