Posted to dev@mahout.apache.org by "Mark Desnoyer (JIRA)" <ji...@apache.org> on 2009/08/04 19:03:14 UTC
[jira] Created: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
SparseVector and DenseVector hashCode does not conform to the Java standard
---------------------------------------------------------------------------
Key: MAHOUT-159
URL: https://issues.apache.org/jira/browse/MAHOUT-159
Project: Mahout
Issue Type: Bug
Components: Matrix
Affects Versions: 0.2
Reporter: Mark Desnoyer
Priority: Critical
The hash codes for SparseVector and DenseVector will not be equal even though equals() may return true. Also, the equals logic is inconsistent because DenseVector takes into account the name parameter but SparseVector does not.
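To make the reported problem concrete, here is a minimal sketch using toy classes. ToyVector, ToyDense and ToySparse are hypothetical stand-ins invented for this illustration and are not Mahout's actual classes or hash formulas. They only show what happens when two objects compare equal but hash differently: a HashSet holding one cannot find the other.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class HashContractDemo {

    /** Toy vector type; a hypothetical stand-in, not Mahout's Vector interface. */
    abstract static class ToyVector {
        abstract int size();
        abstract double get(int index);

        // equals() compares contents only, so a dense and a sparse vector can be equal.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToyVector)) return false;
            ToyVector other = (ToyVector) o;
            if (size() != other.size()) return false;
            for (int i = 0; i < size(); i++) {
                if (get(i) != other.get(i)) return false;
            }
            return true;
        }
    }

    static final class ToyDense extends ToyVector {
        private final double[] values;
        ToyDense(double... values) { this.values = values.clone(); }
        int size() { return values.length; }
        double get(int index) { return values[index]; }

        // Hashes every slot in order (toy formula).
        @Override
        public int hashCode() {
            int h = 0;
            for (double v : values) h = 31 * h + (int) v;
            return h;
        }
    }

    static final class ToySparse extends ToyVector {
        private final int size;
        private final Map<Integer, Double> nonZeros = new TreeMap<>();
        ToySparse(int size) { this.size = size; }
        ToySparse set(int index, double value) { nonZeros.put(index, value); return this; }
        int size() { return size; }
        double get(int index) { return nonZeros.getOrDefault(index, 0.0); }

        // Hashes only the stored entries with a different toy formula: this is the bug.
        @Override
        public int hashCode() {
            int h = 0;
            for (Map.Entry<Integer, Double> e : nonZeros.entrySet()) {
                h += e.getKey() * e.getValue().intValue();
            }
            return h;
        }
    }

    public static void main(String[] args) {
        ToyVector dense = new ToyDense(1, 0, 2);
        ToyVector sparse = new ToySparse(3).set(0, 1).set(2, 2);

        Set<ToyVector> seen = new HashSet<>();
        seen.add(dense);

        System.out.println(dense.equals(sparse));                   // true
        System.out.println(dense.hashCode() == sparse.hashCode());  // false
        System.out.println(seen.contains(sparse));                  // false
    }
}
```

The last line is the practical consequence: hash-based collections look at hashCode() before equals(), so equal objects with different hash codes are treated as distinct.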
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAHOUT-159) SparseVector and DenseVector
hashCode does not conform to the Java standard
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741065#action_12741065 ]
Grant Ingersoll commented on MAHOUT-159:
----------------------------------------
My only suggestion is that the hashCode method should use the iterateNonZeros() method instead of iterating over size().
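A rough sketch of the idea behind this suggestion, under stated assumptions: the hashDense and hashSparse helpers below are hypothetical and do not use Mahout's iterateNonZeros() API, whose exact signature is not shown in this thread. The point is only that if zero entries contribute nothing to the hash, a sparse vector can hash just its stored entries and still agree with a dense vector holding the same values.

```java
import java.util.HashMap;
import java.util.Map;

public class NonZeroHash {

    // Dense layout: visits every slot but lets zeros contribute nothing.
    static int hashDense(double[] values) {
        int h = 0;
        for (double v : values) {
            if (v != 0.0) h += Double.hashCode(v);
        }
        return h;
    }

    // Sparse layout: visits only the stored entries, so the cost tracks
    // the number of non-zeros rather than the vector length.
    static int hashSparse(Map<Integer, Double> nonZeros) {
        int h = 0;
        for (double v : nonZeros.values()) {
            if (v != 0.0) h += Double.hashCode(v);
        }
        return h;
    }

    public static void main(String[] args) {
        Map<Integer, Double> sparse = new HashMap<>();
        sparse.put(0, 1.0);
        sparse.put(2, 2.0);
        double[] dense = {1.0, 0.0, 2.0};

        System.out.println(hashDense(dense) == hashSparse(sparse)); // true
    }
}
```

Addition is used as the combining step so that the result does not depend on iteration order. Note that this sketch looks at values only, so it ignores both the positions of the entries and the vector length.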
> SparseVector and DenseVector hashCode does not conform to the Java standard
> ---------------------------------------------------------------------------
>
> Key: MAHOUT-159
> URL: https://issues.apache.org/jira/browse/MAHOUT-159
> Project: Mahout
> Issue Type: Bug
> Components: Matrix
> Affects Versions: 0.2
> Reporter: Mark Desnoyer
> Assignee: Grant Ingersoll
> Priority: Critical
> Attachments: MAHOUT-159.patch, MAHOUT-159.patch
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The hash codes for SparseVector and DenseVector will not be equal even though equals() may return true. Also, the equals logic is inconsistent because DenseVector takes into account the name parameter but SparseVector does not.
[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Desnoyer updated MAHOUT-159:
---------------------------------
Attachment: MAHOUT-159.patch
I missed a broken test in the first patch.
[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Desnoyer updated MAHOUT-159:
---------------------------------
Attachment: MAHOUT-159.patch
Bah. Sorry about that. I had a fix for that test but it didn't get into the diff for some reason and I didn't check it.
[jira] Commented: (MAHOUT-159) SparseVector and DenseVector
hashCode does not conform to the Java standard
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744496#action_12744496 ]
Grant Ingersoll commented on MAHOUT-159:
----------------------------------------
Hmm, it seems some tests fail with this patch (TestKMeans, etc.). I haven't investigated yet, and it is likely that something is wrong in the tests themselves, but the failures are still there.
[jira] Assigned: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll reassigned MAHOUT-159:
--------------------------------------
Assignee: Grant Ingersoll
[jira] Commented: (MAHOUT-159) SparseVector and DenseVector
hashCode does not conform to the Java standard
Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741124#action_12741124 ]
Ted Dunning commented on MAHOUT-159:
------------------------------------
If you iterate over non zeros, then you should also include the dimensions in the hash code so that different sized vectors will have different hashes. In addition, the indexes of the non-zero elements should be included so that vectors with the same values in different positions will have different hashes. I think that we want to be fairly sure that
    hash([1,0,2,0,0,0]) != hash([1,0,2])
and
    hash([1,0,2]) != hash([1,2,0])
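One way the two requirements above could be met is sketched below. This is an illustration under assumptions, not the code from the attached patch: the class name, the helper names and the mixing formula are all hypothetical. The hash starts from the vector length and then adds one term per non-zero entry that depends on both its index and its value. Like any hash it can still collide in general, but it separates the two cases listed above.

```java
import java.util.Map;
import java.util.TreeMap;

public class PositionAwareHash {

    // Combines the vector length with the (index, value) pairs of the non-zero entries.
    static int hash(int size, Map<Integer, Double> nonZeros) {
        int h = size;
        for (Map.Entry<Integer, Double> e : nonZeros.entrySet()) {
            double v = e.getValue();
            // index + 1 so that an entry at index 0 still contributes
            if (v != 0.0) h += (e.getKey() + 1) * Double.hashCode(v);
        }
        return h;
    }

    // Convenience for the examples: builds the non-zero map from a dense array.
    static int hashOf(double... values) {
        Map<Integer, Double> nonZeros = new TreeMap<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i] != 0.0) nonZeros.put(i, values[i]);
        }
        return hash(values.length, nonZeros);
    }

    public static void main(String[] args) {
        System.out.println(hashOf(1, 0, 2, 0, 0, 0) != hashOf(1, 0, 2)); // true
        System.out.println(hashOf(1, 0, 2) != hashOf(1, 2, 0));          // true
    }
}
```

Because only non-zero entries are visited and the terms are summed, a dense and a sparse representation of the same vector produce the same value, and the cost stays proportional to the number of non-zeros.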
[jira] Resolved: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll resolved MAHOUT-159.
------------------------------------
Resolution: Fixed
Fix Version/s: 0.2
Committed revision 810184.
[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Desnoyer updated MAHOUT-159:
---------------------------------
Attachment: MAHOUT-159.patch
Added patch to fix this inconsistency. Includes tests to make sure it won't happen again.
[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode
does not conform to the Java standard
Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Desnoyer updated MAHOUT-159:
---------------------------------
Attachment: MAHOUT-159.patch
As per Ted's suggestion, added the vector size and element index to the hash.