Posted to dev@mahout.apache.org by Sean Owen <sr...@gmail.com> on 2009/06/17 16:00:21 UTC

Re: [jira] Updated: (MAHOUT-121) Speed up distance calculations for sparse vectors

Oops, somehow I lost my last sentence:

Meant to say that the unit tests pass. OK to submit?

On Jun 17, 2009 2:58 PM, "Sean Owen (JIRA)" <ji...@apache.org> wrote:

[
https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issue.
..
Sean Owen updated MAHOUT-121:
-----------------------------

   Attachment: MAHOUT-121.patch

Here's my latest patch, which incorporates a unit test, further refinements,
as well as related changes we discussed on a separate thread. In fact, most
of the diff is due to those other changes. The core of the change involves
SparseVector and OrderedIntDoubleMapping.

Is this enough of a good start to commit and move forward with? It seems like
we have evidence it gives a significant performance boost. There was a
question of correctness

> Speed up distance calculations for sparse vectors
> ---------------------------------------------...
>         Attachments: MAHOUT-121.patch, MAHOUT-121.patch, mahout-121.patch,
> Mahout1211.patch

> From my mail to the Mahout mailing list.
> I am working on clustering a dataset which has thou...