Posted to dev@mahout.apache.org by "Mark Desnoyer (JIRA)" <ji...@apache.org> on 2009/08/04 19:03:14 UTC

[jira] Created: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

SparseVector and DenseVector hashCode does not conform to the Java standard
---------------------------------------------------------------------------

                 Key: MAHOUT-159
                 URL: https://issues.apache.org/jira/browse/MAHOUT-159
             Project: Mahout
          Issue Type: Bug
          Components: Matrix
    Affects Versions: 0.2
            Reporter: Mark Desnoyer
            Priority: Critical


The hash codes for a SparseVector and a DenseVector can differ even when equals() returns true, which breaks the Java contract that equal objects must have equal hash codes. Also, the equals logic is inconsistent: DenseVector takes the name parameter into account but SparseVector does not.
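
To make the first point concrete, here is a minimal sketch of how such a mismatch shows up in practice. The classes Vec, DenseVec, SparseVec and VecContractDemo are hypothetical stand-ins written for this illustration, not the Mahout classes, and the two hashCode recipes are assumptions chosen only to differ from each other. The sketch ignores the name parameter.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-ins (NOT Mahout's classes): just enough to show the problem.
interface Vec {
    int size();
    double get(int index);
}

final class DenseVec implements Vec {
    private final double[] values;
    DenseVec(double... values) { this.values = values.clone(); }
    public int size() { return values.length; }
    public double get(int index) { return values[index]; }

    @Override public boolean equals(Object o) { return VecContractDemo.sameElements(this, o); }
    // Hashes the backing array: every element, zeros included.
    @Override public int hashCode() { return Arrays.hashCode(values); }
}

final class SparseVec implements Vec {
    private final int size;
    private final Map<Integer, Double> nonZeros = new TreeMap<>();
    SparseVec(int size) { this.size = size; }
    void set(int index, double value) {
        if (value != 0.0) nonZeros.put(index, value); else nonZeros.remove(index);
    }
    public int size() { return size; }
    public double get(int index) { return nonZeros.getOrDefault(index, 0.0); }

    @Override public boolean equals(Object o) { return VecContractDemo.sameElements(this, o); }
    // Hashes the index-to-value map: a different recipe from DenseVec.
    @Override public int hashCode() { return nonZeros.hashCode(); }
}

final class VecContractDemo {
    // Element-wise equality that ignores the concrete storage type.
    static boolean sameElements(Vec a, Object o) {
        if (!(o instanceof Vec)) return false;
        Vec b = (Vec) o;
        if (a.size() != b.size()) return false;
        for (int i = 0; i < a.size(); i++) {
            if (a.get(i) != b.get(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        DenseVec dense = new DenseVec(1.0, 0.0, 2.0);
        SparseVec sparse = new SparseVec(3);
        sparse.set(0, 1.0);
        sparse.set(2, 2.0);

        Set<Object> seen = new HashSet<>();
        seen.add(dense);

        System.out.println(dense.equals(sparse));                  // true
        System.out.println(dense.hashCode() == sparse.hashCode()); // false
        System.out.println(seen.contains(sparse));                 // false: the lookup misses an equal element
    }
}
```

The last line is the practical consequence: hash-based collections such as HashSet and HashMap consult hashCode() before equals(), so two vectors that compare equal can still be treated as distinct keys.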

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741065#action_12741065 ] 

Grant Ingersoll commented on MAHOUT-159:
----------------------------------------

My only suggestion is that the hashCode method should use the iterateNonZeros() method instead of iterating over size().
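
A rough sketch of the difference between the two loops, assuming a hypothetical TinySparseVector class and a nonZeroEntries() method that stand in for SparseVector and its non-zero iterator (neither is Mahout's API). The summing of value hashes is only a placeholder for the real mixing function; the point is that the second loop touches the stored entries only, while the first visits every index up to size().

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sparse vector for illustration; not Mahout's SparseVector.
final class TinySparseVector {
    private final int size;
    private final Map<Integer, Double> nonZeros = new TreeMap<>();

    TinySparseVector(int size) { this.size = size; }

    void set(int index, double value) {
        if (value != 0.0) nonZeros.put(index, value); else nonZeros.remove(index);
    }
    double get(int index) { return nonZeros.getOrDefault(index, 0.0); }
    int size() { return size; }

    // Stand-in for a non-zero iterator: exposes only the stored entries.
    Iterable<Map.Entry<Integer, Double>> nonZeroEntries() { return nonZeros.entrySet(); }
}

final class NonZeroHashDemo {
    // Visits every index, zeros included: work proportional to size().
    static int hashOverSize(TinySparseVector v) {
        int h = 0;
        for (int i = 0; i < v.size(); i++) {
            h += Double.hashCode(v.get(i)); // Double.hashCode(0.0) is 0, so zeros add nothing
        }
        return h;
    }

    // Visits stored entries only: work proportional to the number of non-zeros.
    static int hashOverNonZeros(TinySparseVector v) {
        int h = 0;
        for (Map.Entry<Integer, Double> e : v.nonZeroEntries()) {
            h += Double.hashCode(e.getValue());
        }
        return h;
    }

    public static void main(String[] args) {
        TinySparseVector v = new TinySparseVector(1_000_000);
        v.set(7, 1.5);
        v.set(999_999, -2.0);
        // Same result, but the second call looks at 2 entries instead of 1,000,000 indices.
        System.out.println(hashOverSize(v) == hashOverNonZeros(v)); // true
    }
}
```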

> SparseVector and DenseVector hashCode does not conform to the Java standard
> ---------------------------------------------------------------------------
>
>                 Key: MAHOUT-159
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-159
>             Project: Mahout
>          Issue Type: Bug
>          Components: Matrix
>    Affects Versions: 0.2
>            Reporter: Mark Desnoyer
>            Assignee: Grant Ingersoll
>            Priority: Critical
>         Attachments: MAHOUT-159.patch, MAHOUT-159.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The hash codes for SparseVector and DenseVector will not be equal even though equals() may return true. Also, the equals logic is inconsistent because DenseVector takes into account the name parameter but SparseVector does not.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Desnoyer updated MAHOUT-159:
---------------------------------

    Attachment: MAHOUT-159.patch

I missed a broken test on the first patch




[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Desnoyer updated MAHOUT-159:
---------------------------------

    Attachment: MAHOUT-159.patch

Bah. Sorry about that. I had a fix for that test but it didn't get into the diff for some reason and I didn't check it.




[jira] Commented: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744496#action_12744496 ] 

Grant Ingersoll commented on MAHOUT-159:
----------------------------------------

Hmm, it seems some tests fail with this patch (TestKMeans, etc.). I haven't investigated yet, and it is likely that something is wrong in the tests themselves, but the failures are still there.




[jira] Assigned: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll reassigned MAHOUT-159:
--------------------------------------

    Assignee: Grant Ingersoll




[jira] Commented: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741124#action_12741124 ] 

Ted Dunning commented on MAHOUT-159:
------------------------------------

If you iterate over non zeros, then you should also include the dimensions in the hash code so that different sized vectors will have different hashes.  In addition, the indexes of the non-zero elements should be included so that vectors with the same values in different positions will have different hashes.  I think that we want to be fairly sure that

    hash([1,0,2,0,0,0]) != hash([1,0,2])

and

    hash([1,0,2]) != hash([1,2,0])
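
One way these two properties could be met is sketched below. This is an assumption-laden illustration, not the code from the attached patch: VectorHashSketch, mix, ofDense and ofSparse are hypothetical names, and the mixing step (an odd multiplier derived from the index) is just one reasonable choice. The hash starts from the vector size, adds one term per non-zero element that depends on both index and value, and uses addition so that the result does not depend on iteration order.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not the code from the attached patch.
final class VectorHashSketch {
    // Contribution of one non-zero element. The odd multiplier (2 * index + 1)
    // ties the value to its position; int overflow simply wraps around.
    private static int mix(int index, double value) {
        return (2 * index + 1) * Double.hashCode(value);
    }

    // Dense form: walk the array, skipping zeros.
    static int ofDense(double[] values) {
        int h = values.length; // start from the dimension
        for (int i = 0; i < values.length; i++) {
            if (values[i] != 0.0) {
                h += mix(i, values[i]);
            }
        }
        return h;
    }

    // Sparse form: the map is assumed to hold only non-zero values.
    static int ofSparse(int size, Map<Integer, Double> nonZeros) {
        int h = size;
        for (Map.Entry<Integer, Double> e : nonZeros.entrySet()) {
            h += mix(e.getKey(), e.getValue());
        }
        return h;
    }

    public static void main(String[] args) {
        int a = ofDense(new double[] {1, 0, 2});
        int b = ofDense(new double[] {1, 0, 2, 0, 0, 0});
        int c = ofDense(new double[] {1, 2, 0});

        Map<Integer, Double> nonZeros = new TreeMap<>();
        nonZeros.put(0, 1.0);
        nonZeros.put(2, 2.0);

        System.out.println(a != b);                     // true: same values, different size
        System.out.println(a != c);                     // true: same values, different positions
        System.out.println(a == ofSparse(3, nonZeros)); // true: dense and sparse forms agree
    }
}
```

Because both entry points add the same per-element terms, a dense and a sparse vector holding the same data hash identically, which is what the equals/hashCode contract requires when the two types can compare equal.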





[jira] Resolved: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll resolved MAHOUT-159.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.2

Committed revision 810184.

> SparseVector and DenseVector hashCode does not conform to the Java standard
> ---------------------------------------------------------------------------
>
>                 Key: MAHOUT-159
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-159
>             Project: Mahout
>          Issue Type: Bug
>          Components: Matrix
>    Affects Versions: 0.2
>            Reporter: Mark Desnoyer
>            Assignee: Grant Ingersoll
>            Priority: Critical
>             Fix For: 0.2
>
>         Attachments: MAHOUT-159.patch, MAHOUT-159.patch, MAHOUT-159.patch, MAHOUT-159.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The hash codes for SparseVector and DenseVector will not be equal even though equals() may return true. Also, the equals logic is inconsistent because DenseVector takes into account the name parameter but SparseVector does not.



[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Desnoyer updated MAHOUT-159:
---------------------------------

    Attachment: MAHOUT-159.patch

Added patch to fix this inconsistency. Includes tests to make sure it won't happen again.




[jira] Updated: (MAHOUT-159) SparseVector and DenseVector hashCode does not conform to the Java standard

Posted by "Mark Desnoyer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Desnoyer updated MAHOUT-159:
---------------------------------

    Attachment: MAHOUT-159.patch

As per Ted's suggestion, added the vector size and element index to the hash.

