Posted to commits@mahout.apache.org by sr...@apache.org on 2010/05/11 16:35:04 UTC

svn commit: r943126 - /lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java

Author: srowen
Date: Tue May 11 14:35:04 2010
New Revision: 943126

URL: http://svn.apache.org/viewvc?rev=943126&view=rev
Log:
Remove use of Varint, inadvertently introduced

Modified:
    lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java

Modified: lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java
URL: http://svn.apache.org/viewvc/lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java?rev=943126&r1=943125&r2=943126&view=diff
==============================================================================
--- lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java (original)
+++ lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java Tue May 11 14:35:04 2010
@@ -22,7 +22,6 @@ import java.io.DataOutput;
 import java.io.IOException;
 
 import org.apache.hadoop.io.WritableComparable;
-import org.apache.mahout.math.Varint;
 
 /** A {@link WritableComparable} encapsulating two item indices. */
 public final class IndexIndexWritable
@@ -55,14 +54,14 @@ public final class IndexIndexWritable
 
   @Override
   public void write(DataOutput out) throws IOException {
-    Varint.writeUnsignedVarInt(aID, out);
-    Varint.writeUnsignedVarInt(bID, out);
+    out.writeInt(aID);
+    out.writeInt(bID);
   }
 
   @Override
   public void readFields(DataInput in) throws IOException {
-    aID = Varint.readUnsignedVarInt(in);
-    bID = Varint.readUnsignedVarInt(in);
+    aID = in.readInt();
+    bID = in.readInt();
   }
 
   @Override
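
As a rough illustration of what the fixed-width form in this diff produces, here is a minimal, self-contained sketch. IndexPairSketch and its serialize/deserialize helpers are hypothetical stand-ins that use only java.io; the sketch does not implement WritableComparable and is not the actual IndexIndexWritable class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for a pair of item indices; not the Mahout class.
final class IndexPairSketch {

  private int aID;
  private int bID;

  IndexPairSketch() {
  }

  IndexPairSketch(int aID, int bID) {
    this.aID = aID;
    this.bID = bID;
  }

  int getAID() {
    return aID;
  }

  int getBID() {
    return bID;
  }

  // Fixed-width encoding: each int always occupies 4 bytes.
  void write(DataOutput out) throws IOException {
    out.writeInt(aID);
    out.writeInt(bID);
  }

  // Must read the fields in the same order and width as write().
  void readFields(DataInput in) throws IOException {
    aID = in.readInt();
    bID = in.readInt();
  }

  // Helper (assumption): serialize a pair to a byte array.
  static byte[] serialize(IndexPairSketch pair) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    pair.write(new DataOutputStream(bytes));
    return bytes.toByteArray();
  }

  // Helper (assumption): rebuild a pair from a byte array.
  static IndexPairSketch deserialize(byte[] data) throws IOException {
    IndexPairSketch pair = new IndexPairSketch();
    pair.readFields(new DataInputStream(new ByteArrayInputStream(data)));
    return pair;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = serialize(new IndexPairSketch(3, 100000));
    IndexPairSketch copy = deserialize(data);
    System.out.println(data.length + " bytes: " + copy.getAID() + ", " + copy.getBID());
  }
}
```

With writeInt the serialized size is 8 bytes regardless of the values, which keeps write() and readFields() trivially symmetric at the cost of space for small indices.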



Re: svn commit: r943126 - /lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java

Posted by Robin Anil <ro...@gmail.com>.
Put the patch up. I will verify it on Reuters once.


On Tue, May 11, 2010 at 11:10 PM, Sean Owen <sr...@gmail.com> wrote:

> Well I did a purer, local test and results are more reasonable.
> Writing 10000 random-access sparse vectors, 1000 entries, each a
> random number up to 100000, takes 5.4s before versus 4.7s with changes.
> That must be I/O savings since it takes a little more CPU -- and
> that's savings writing to an SSD. Imagine the savings over a network.
>
> Size goes down from 120MB to 108MB, which is in line with expectations.
>
> I saw and fixed one bone-headed error in my patch which didn't use
> variable-length coding for random-access sparse vectors. I think that
> explains the puzzle.
>
Yeah it definitely does.

>
> So... I guess I'd like to commit. Anyone want to check my work?
>
> On Tue, May 11, 2010 at 4:21 PM, Sean Owen <sr...@gmail.com> wrote:
> > I added tests to check it outputs the expected number of bytes. I checked
> > that performance is fine. That checks out.
> >
> > So maybe it was a bad or misleading test. I haven't constructed a new one
> > yet, should be easy though.
> >
> > On May 11, 2010 4:17 PM, "Robin Anil" <ro...@gmail.com> wrote:
> >
> > Sean, did you get to explore the issue you found with Varint? Theoretically
> > it should bring better savings than VInt and VLong, right?
> >
> > Robin
> >
>

Re: svn commit: r943126 - /lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java

Posted by Sean Owen <sr...@gmail.com>.
Well I did a purer, local test and results are more reasonable.
Writing 10000 random-access sparse vectors, 1000 entries, each a
random number up to 100000, takes 5.4s before versus 4.7s with changes.
That must be I/O savings since it takes a little more CPU -- and
that's savings writing to an SSD. Imagine the savings over a network.

Size goes down from 120MB to 108MB, which is in line with expectations.

I saw and fixed one bone-headed error in my patch which didn't use
variable-length coding for random-access sparse vectors. I think that
explains the puzzle.

So... I guess I'd like to commit. Anyone want to check my work?
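
For readers unfamiliar with variable-length coding, here is a minimal sketch of one common unsigned varint scheme (7 data bits per byte, high bit as a continuation flag), together with the kind of byte-count check mentioned in this thread. The class VarintSketch and the helpers encodedSize and roundTrip are hypothetical names for illustration; this is not necessarily how Mahout's Varint class is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of an unsigned variable-length int codec.
final class VarintSketch {

  // Emits 7 bits per byte, low-order group first. The high bit is set on
  // every byte except the last one.
  static void writeUnsignedVarInt(int value, DataOutput out) throws IOException {
    while ((value & 0xFFFFFF80) != 0) {
      out.writeByte((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.writeByte(value & 0x7F);
  }

  // Reverses writeUnsignedVarInt; rejects encodings longer than 5 bytes.
  static int readUnsignedVarInt(DataInput in) throws IOException {
    int value = 0;
    int shift = 0;
    int b;
    while (((b = in.readByte()) & 0x80) != 0) {
      value |= (b & 0x7F) << shift;
      shift += 7;
      if (shift > 28) {
        throw new IOException("Varint is longer than 5 bytes");
      }
    }
    return value | (b << shift);
  }

  // Helper (assumption): number of bytes the encoding of value occupies.
  static int encodedSize(int value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    writeUnsignedVarInt(value, new DataOutputStream(bytes));
    return bytes.size();
  }

  // Helper (assumption): encode then decode a value.
  static int roundTrip(int value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    writeUnsignedVarInt(value, new DataOutputStream(bytes));
    return readUnsignedVarInt(
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
  }

  public static void main(String[] args) throws IOException {
    int[] samples = {0, 127, 128, 16383, 16384, 99999};
    for (int sample : samples) {
      System.out.println(sample + " -> " + encodedSize(sample)
          + " byte(s), round trip " + roundTrip(sample));
    }
  }
}
```

Under this scheme any value below 2^21 fits in at most 3 bytes, versus 4 bytes for writeInt, while values with the top bits set (including negative ints treated as unsigned) take 5 bytes. That is why the savings depend on the distribution of the numbers being written.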

On Tue, May 11, 2010 at 4:21 PM, Sean Owen <sr...@gmail.com> wrote:
> I added tests to check it outputs the expected number of bytes. I checked
> that performance is fine. That checks out.
>
> So maybe it was a bad or misleading test. I haven't constructed a new one
> yet, should be easy though.
>
> On May 11, 2010 4:17 PM, "Robin Anil" <ro...@gmail.com> wrote:
>
> Sean, did you get to explore the issue you found with Varint? Theoretically
> it should bring better savings than VInt and VLong, right?
>
> Robin
>

Re: svn commit: r943126 - /lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java

Posted by Sean Owen <sr...@gmail.com>.
I added tests to check it outputs the expected number of bytes. I checked
that performance is fine. That checks out.

So maybe it was a bad or misleading test. I haven't constructed a new one
yet, should be easy though.

On May 11, 2010 4:17 PM, "Robin Anil" <ro...@gmail.com> wrote:

Sean, did you get to explore the issue you found with Varint? Theoretically
it should bring better savings than VInt and VLong, right?

Robin

Re: svn commit: r943126 - /lucene/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/IndexIndexWritable.java

Posted by Robin Anil <ro...@gmail.com>.
Sean, did you get to explore the issue you found with Varint? Theoretically
it should bring better savings than VInt and VLong, right?

Robin