Posted to dev@mahout.apache.org by "Jake Mannix (JIRA)" <ji...@apache.org> on 2010/01/27 17:26:34 UTC
[jira] Commented: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805562#action_12805562 ]
Jake Mannix commented on MAHOUT-268:
------------------------------------
Oy, this is wrong in all three places it is implemented (in different ways, :\ ) - even in the "non-optimized" impl in AbstractVector:
{code}
@Override
public double getDistanceSquared(Vector v) {
  double d = 0;
  Iterator<Element> it = iterateNonZero();
  Element e;
  while(it.hasNext() && (e = it.next()) != null) {
    double diff = e.get() - v.getQuick(e.index());
    d += (diff * diff);
  }
  return d;
}
{code}
Iterating over only the nonzero entries of this vector skips every index where this vector is zero but the other one is not, so those terms never make it into the sum!
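To make the failure concrete, here is a minimal standalone sketch. It does not use Mahout's Vector classes: sparse vectors are modelled as plain Map<Integer, Double> from index to nonzero value, and the class and method names (DistanceSquaredSketch, oneSided, overUnion) are made up for illustration. oneSided mirrors the loop quoted above; overUnion is one possible way to visit the nonzero indices of both vectors, not necessarily how the fix will be written.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the real vectors: index -> nonzero value.
public class DistanceSquaredSketch {

  // Same shape as the quoted loop: only indices that are nonzero in a are visited.
  public static double oneSided(Map<Integer, Double> a, Map<Integer, Double> b) {
    double d = 0;
    for (Map.Entry<Integer, Double> e : a.entrySet()) {
      double diff = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
      d += diff * diff;
    }
    return d;
  }

  // Visits the union of the nonzero indices of a and b, so entries that are
  // nonzero only in b still contribute (0 - b_i)^2.
  public static double overUnion(Map<Integer, Double> a, Map<Integer, Double> b) {
    Set<Integer> indices = new HashSet<>(a.keySet());
    indices.addAll(b.keySet());
    double d = 0;
    for (int i : indices) {
      double diff = a.getOrDefault(i, 0.0) - b.getOrDefault(i, 0.0);
      d += diff * diff;
    }
    return d;
  }

  public static void main(String[] args) {
    Map<Integer, Double> a = Map.of(0, 1.0);          // (1, 0, 0)
    Map<Integer, Double> b = Map.of(0, 1.0, 2, -3.0); // (1, 0, -3)
    System.out.println(oneSided(a, b));  // 0.0, the term at index 2 is skipped
    System.out.println(overUnion(a, b)); // 9.0, matches |a - b|^2
  }
}
```

Note that the one-sided version is not even symmetric: oneSided(b, a) gives 9.0 while oneSided(a, b) gives 0.0, which is an easy property to assert in a unit test.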
> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
> Key: MAHOUT-268
> URL: https://issues.apache.org/jira/browse/MAHOUT-268
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.2
> Environment: all
> Reporter: NOT_A_USER
> Assignee: Jake Mannix
> Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
> public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.