Posted to dev@mahout.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2014/06/13 16:24:01 UTC

[jira] [Commented] (MAHOUT-1580) Optimize getNumNonZeroElements

    [ https://issues.apache.org/jira/browse/MAHOUT-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030667#comment-14030667 ] 

ASF GitHub Bot commented on MAHOUT-1580:
----------------------------------------

GitHub user sscdotopen opened a pull request:

    https://github.com/apache/mahout/pull/17

    MAHOUT-1580 Optimize getNumNonZeroElements()

    Can someone have a look at the changes? The basic idea is to work directly on the internal data structures instead of going through the non-zeroes iterator.
    
    A quick benchmark showed a 7x speedup on dense vectors and a 3x speedup on sequential-access sparse vectors.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sscdotopen/mahout MAHOUT-1580

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/17.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17
    
----
commit 25b99847f43e4cb29b71064921d52c373c221442
Author: ssc <ss...@apache.org>
Date:   2014-06-13T12:32:24Z

    custom implementations for main vectors

commit 50fc20826893d2af3a7fafe94c4ac6ae6198e622
Author: ssc <ss...@apache.org>
Date:   2014-06-13T12:44:49Z

    added optimization for permuted view

----


> Optimize getNumNonZeroElements
> ------------------------------
>
>                 Key: MAHOUT-1580
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1580
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>             Fix For: 1.0
>
>
> getNumNonZeroElements in AbstractVector uses the nonZeroes iterator internally, which adds a lot of overhead for certain types of vectors, e.g. the dense ones. We should add custom implementations here.
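
To illustrate the difference described in the issue, here is a minimal, self-contained sketch. It is not the code from the pull request and does not use Mahout's classes: NonZeroCountSketch, countViaIterator, countDirect and countStoredNonZeroes are hypothetical names, and the vectors are modelled as plain double arrays. The iterator path creates one boxed object per non-zero entry, while the direct paths only scan the backing storage.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the vector classes; not Mahout's actual API. */
final class NonZeroCountSketch {

  /** Generic path: count by walking an iterator over the non-zero entries. */
  static int countViaIterator(double[] values) {
    int count = 0;
    Iterator<Double> it = nonZeroes(values);
    while (it.hasNext()) {
      it.next(); // one boxed element per non-zero entry
      count++;
    }
    return count;
  }

  /** Direct path for a dense vector: scan the backing array. */
  static int countDirect(double[] values) {
    int count = 0;
    for (double v : values) {
      if (v != 0.0) {
        count++;
      }
    }
    return count;
  }

  /**
   * Direct path for a sparse vector, assuming (hypothetically) that the first
   * numMappings slots of storedValues hold the stored entries and that
   * explicitly stored zeros are possible.
   */
  static int countStoredNonZeroes(double[] storedValues, int numMappings) {
    int count = 0;
    for (int i = 0; i < numMappings; i++) {
      if (storedValues[i] != 0.0) {
        count++;
      }
    }
    return count;
  }

  /** Simple iterator that skips zero entries of the array. */
  static Iterator<Double> nonZeroes(final double[] values) {
    return new Iterator<Double>() {
      private int cursor = advance(0);

      // Returns the first index >= from that holds a non-zero value.
      private int advance(int from) {
        int i = from;
        while (i < values.length && values[i] == 0.0) {
          i++;
        }
        return i;
      }

      @Override
      public boolean hasNext() {
        return cursor < values.length;
      }

      @Override
      public Double next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        double v = values[cursor];
        cursor = advance(cursor + 1);
        return v;
      }

      @Override
      public void remove() {
        throw new UnsupportedOperationException();
      }
    };
  }

  public static void main(String[] args) {
    double[] dense = {0.0, 1.5, 0.0, -2.0, 0.0, 3.0};
    System.out.println(countViaIterator(dense));
    System.out.println(countDirect(dense));
  }
}
```

Both paths return the same count; the direct versions simply avoid the per-element iterator calls and object allocation. The speedups quoted in the pull request refer to the actual Mahout implementations, not to this sketch.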



--
This message was sent by Atlassian JIRA
(v6.2#6252)