Posted to dev@mahout.apache.org by "Jake Mannix (JIRA)" <ji...@apache.org> on 2010/01/27 08:49:34 UTC
[jira] Created: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
Vector.getDistanceSquared() is incorrect for both SparseVector varieties
------------------------------------------------------------------------
Key: MAHOUT-268
URL: https://issues.apache.org/jira/browse/MAHOUT-268
Project: Mahout
Issue Type: Bug
Affects Versions: 0.2
Environment: all
Reporter: Jake Mannix
Fix For: 0.3
I'm pretty sure that getDistanceSquared() should return the same result as an optimized implementation of:
{code}
public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
{code}
In that case, if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong result. It is very easy to write a failing unit test for this one.
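The expected value for such a test can be sketched as follows. This is a minimal, self-contained illustration that uses plain double[] arrays instead of Mahout's Vector classes; the class and method names (DistanceSquaredReference, minus, lengthSquared, distanceSquared) are assumptions made for this example and are not Mahout API. It computes the reference result that an optimized getDistanceSquared() would be expected to match, including when some elements are negative.

```java
public class DistanceSquaredReference {

    /** Element-wise difference a - b; both arrays must have the same length. */
    public static double[] minus(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("cardinality mismatch");
        }
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] - b[i];
        }
        return result;
    }

    /** Sum of the squares of all elements. */
    public static double lengthSquared(double[] a) {
        double sum = 0.0;
        for (double x : a) {
            sum += x * x;
        }
        return sum;
    }

    /** Reference definition: |a - b|^2, with no sparsity shortcuts. */
    public static double distanceSquared(double[] a, double[] b) {
        return lengthSquared(minus(a, b));
    }

    public static void main(String[] args) {
        // Includes negative elements and positions where only one side is zero.
        double[] a = {1.0, -2.0, 0.0};
        double[] b = {-1.0, 0.0, 3.0};
        // (1 - (-1))^2 + (-2 - 0)^2 + (0 - 3)^2 = 4 + 4 + 9
        System.out.println(distanceSquared(a, b)); // prints 17.0
    }
}
```

A unit test could compare each sparse implementation's result against this reference value for the same inputs.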
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
Posted by "Jake Mannix (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805562#action_12805562 ]
Jake Mannix commented on MAHOUT-268:
------------------------------------
Oy, this is wrong in all three places where it is implemented (each in a different way :\ ), even in the "non-optimized" impl in AbstractVector:
{code}
@Override
public double getDistanceSquared(Vector v) {
  double d = 0;
  Iterator<Element> it = iterateNonZero();
  Element e;
  while(it.hasNext() && (e = it.next()) != null) {
    double diff = e.get() - v.getQuick(e.index());
    d += (diff * diff);
  }
  return d;
}
{code}
Iterating over only the nonzero entries of this vector misses every index where this vector is zero but the other one is nonzero, so those terms never get added to the sum!
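To make the problem concrete, here is a hedged sketch that models sparse vectors as Map<Integer, Double> (index to nonzero value) rather than using Mahout's classes. The class SparseDistanceSketch and both method names are hypothetical, and this is not the change that went into Mahout; it only contrasts the one-sided loop above with a loop over the union of nonzero indices.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SparseDistanceSketch {

    /** Mirrors the loop above: visits only the nonzero entries of a. */
    public static double oneSidedDistanceSquared(Map<Integer, Double> a, Map<Integer, Double> b) {
        double d = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            double diff = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
            d += diff * diff;
        }
        return d;
    }

    /** Visits every index that is nonzero in either vector. */
    public static double unionDistanceSquared(Map<Integer, Double> a, Map<Integer, Double> b) {
        Set<Integer> indices = new HashSet<>(a.keySet());
        indices.addAll(b.keySet());
        double d = 0.0;
        for (int i : indices) {
            double diff = a.getOrDefault(i, 0.0) - b.getOrDefault(i, 0.0);
            d += diff * diff;
        }
        return d;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 1.0);   // a = (1, 0)
        Map<Integer, Double> b = new HashMap<>();
        b.put(1, -2.0);  // b = (0, -2)

        // The true squared distance is (1 - 0)^2 + (0 - (-2))^2 = 5.
        System.out.println(oneSidedDistanceSquared(a, b)); // prints 1.0
        System.out.println(unionDistanceSquared(a, b));    // prints 5.0
    }
}
```

The one-sided result is also asymmetric: swapping a and b in the example gives 4.0 instead of 1.0, which is another property a unit test could check.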
> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
> Key: MAHOUT-268
> URL: https://issues.apache.org/jira/browse/MAHOUT-268
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.2
> Environment: all
> Reporter: Jake Mannix
> Assignee: Jake Mannix
> Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
> public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.
[jira] Resolved: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
Posted by "Jake Mannix (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jake Mannix resolved MAHOUT-268.
--------------------------------
Resolution: Fixed
Fixed in r903965.
> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
> Key: MAHOUT-268
> URL: https://issues.apache.org/jira/browse/MAHOUT-268
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.2
> Environment: all
> Reporter: Jake Mannix
> Assignee: Jake Mannix
> Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
> public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.
[jira] Assigned: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
Posted by "Robin Anil (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robin Anil reassigned MAHOUT-268:
---------------------------------
Assignee: (was: Robin Anil)
Revert
> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
> Key: MAHOUT-268
> URL: https://issues.apache.org/jira/browse/MAHOUT-268
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.2
> Environment: all
> Reporter: Jake Mannix
> Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
> public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.
[jira] Assigned: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
Posted by "Robin Anil (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robin Anil reassigned MAHOUT-268:
---------------------------------
Assignee: Robin Anil
Test
> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
> Key: MAHOUT-268
> URL: https://issues.apache.org/jira/browse/MAHOUT-268
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.2
> Environment: all
> Reporter: Jake Mannix
> Assignee: Robin Anil
> Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
> public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.
[jira] Assigned: (MAHOUT-268) Vector.getDistanceSquared() is incorrect for both SparseVector varieties
Posted by "Jake Mannix (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jake Mannix reassigned MAHOUT-268:
----------------------------------
Assignee: Jake Mannix
> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
> Key: MAHOUT-268
> URL: https://issues.apache.org/jira/browse/MAHOUT-268
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.2
> Environment: all
> Reporter: Jake Mannix
> Assignee: Jake Mannix
> Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
> public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.