Posted to dev@mahout.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2010/06/30 16:29:49 UTC

[jira] Commented: (MAHOUT-379) SequentialAccessSparseVector.equals does not agree with AbstractVector.equivalent

    [ https://issues.apache.org/jira/browse/MAHOUT-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883937#action_12883937 ] 

Sean Owen commented on MAHOUT-379:
----------------------------------

Yes, that sounds about right; the intent is to use NamedVector instead.

> SequentialAccessSparseVector.equals does not agree with AbstractVector.equivalent
> ---------------------------------------------------------------------------------
>
>                 Key: MAHOUT-379
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-379
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.4
>            Reporter: Danny Leshem
>            Assignee: Sean Owen
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: MAHOUT-379-lucene.patch, MAHOUT-379.patch, MAHOUT-379.patch, MAHOUT-379.patch
>
>
> When a SequentialAccessSparseVector is serialized and deserialized using VectorWritable, the resulting vector and the original vector are equivalent, yet equals returns false.
> The following unit-test reproduces the problem:
> {code}
> @Test
> public void testSequentialAccessSparseVectorEquals() throws Exception {
>     final Vector v = new SequentialAccessSparseVector(1);
>     final VectorWritable vectorWritable = new VectorWritable(v);
>     final VectorWritable vectorWritable2 = new VectorWritable();
>     writeAndRead(vectorWritable, vectorWritable2);
>     final Vector v2 = vectorWritable2.get();
>     assertTrue(AbstractVector.equivalent(v, v2));
>     assertEquals(v, v2); // This line fails!
> }
> private void writeAndRead(Writable toWrite, Writable toRead) throws IOException {
>     final ByteArrayOutputStream baos = new ByteArrayOutputStream();
>     final DataOutputStream dos = new DataOutputStream(baos);
>     toWrite.write(dos);
>     final ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
>     final DataInputStream dis = new DataInputStream(bais);
>     toRead.readFields(dis);
> }
> {code}
> The problem seems to be that the original vector name is null, while the new vector's name is an empty string. The same issue probably also happens with RandomAccessSparseVector.
> SequentialAccessSparseVectorWritable (line 40):
> {code}
> dataOutput.writeUTF(getName() == null ? "" : getName());
> {code}
> RandomAccessSparseVectorWritable (line 42):
> {code}
> dataOutput.writeUTF(this.getName() == null ? "" : this.getName());
> {code}
> The simplest fix is probably to change the default Vector's name from null to the empty string.
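
The null-versus-empty-string mismatch described in the quoted report can be reproduced without Mahout. The following is a minimal sketch, not Mahout code: the class NameRoundTripSketch and its methods are hypothetical names introduced for illustration, and only the writeUTF expression is taken from the report. It shows that a null name written as "" is read back as "", so a strict name comparison fails while a comparison that treats null as "" succeeds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical stand-in for the name handling only; this is not Mahout code.
public class NameRoundTripSketch {

    // Mirrors the write logic quoted in the report: a null name is written as "",
    // so the value read back can never be null.
    public static String roundTrip(String name) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (DataOutputStream dos = new DataOutputStream(baos)) {
                dos.writeUTF(name == null ? "" : name);
            }
            try (DataInputStream dis =
                     new DataInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
                return dis.readUTF();
            }
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    // Strict comparison: null and "" are considered different names.
    public static boolean strictNameEquals(String a, String b) {
        return Objects.equals(a, b);
    }

    // Lenient comparison (an assumption about one possible fix): null is treated as "".
    public static boolean normalizedNameEquals(String a, String b) {
        return (a == null ? "" : a).equals(b == null ? "" : b);
    }

    public static void main(String[] args) {
        String original = null;                 // default name before serialization
        String restored = roundTrip(original);  // "" after the round trip

        System.out.println("strict: " + strictNameEquals(original, restored));         // prints "strict: false"
        System.out.println("normalized: " + normalizedNameEquals(original, restored)); // prints "normalized: true"
    }
}
```

Defaulting the name to "" (the fix suggested in the report) and normalizing at comparison time would both make the round trip compare equal in this sketch. Moving names into a separate wrapper such as NamedVector, as the comment above indicates, avoids the question for unnamed vectors altogether.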
