Posted to user@mahout.apache.org by Ivan Brusic <iv...@brusic.com> on 2014/07/25 23:04:58 UTC

Was the Vector hierarchy ever Serializable?

I am in the midst of upgrading our Mahout library in order to take
advantage of all the excellent recent additions.

As far as I can tell, the library was based off a snapshot of 0.5. The code
does not use any of the Mahout algorithms, just a few of the data
structures such as DenseVector. The existing code builds a Java object
which is then serialized and distributed. After upgrading to 0.9, I noticed
I was no longer able to deserialize objects since DenseVector is
not Serializable. After inspecting the old jar, it seems like AbstractVector
was declared Serializable.

So either someone at my company added serialization to the Mahout classes
or they were Serializable at some point. I am assuming the former. Is this
the case? I looked at the commits and at no point was anything Serializable.

Since the classes are not Serializable and no longer inherit from Writable,
is there an existing strategy to output Mahout structures? Would hate to
write wrapper classes or once again modify the source.

Cheers,

Ivan

Re: Was the Vector hierarchy ever Serializable?

Posted by Ted Dunning <te...@gmail.com>.
There is a problem with using the Java serialization framework: the
serialization that we do considers the properties of the vectors rather
than their class. That means the serialization isn't round-trip safe.
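
To make the round-trip point concrete, here is a minimal sketch that assumes a property-based format which records only size and values. Every name in it (Vec, ArrayVec, ConstantVec, PropertyCodec, RoundTripDemo) is a hypothetical stand-in and not a Mahout class; it only illustrates why the object that comes back can have a different class from the one that was written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// All names in this file are hypothetical stand-ins, not Mahout classes.
class RoundTripDemo {
  // Serialize a vector into a byte buffer and immediately read it back.
  static Vec roundTrip(Vec v) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    PropertyCodec.write(new DataOutputStream(bytes), v);
    return PropertyCodec.read(
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
  }

  public static void main(String[] args) throws IOException {
    Vec original = new ConstantVec(3, 7.0);
    Vec copy = roundTrip(original);
    // The values survive, but the concrete class does not.
    System.out.println(original.getClass().getSimpleName()
        + " -> " + copy.getClass().getSimpleName());
  }
}

interface Vec {
  int size();
  double get(int index);
}

// Vector backed by an explicit array of values.
class ArrayVec implements Vec {
  private final double[] values;
  ArrayVec(double[] values) { this.values = values; }
  public int size() { return values.length; }
  public double get(int index) { return values[index]; }
}

// Vector whose entries all share one value.
class ConstantVec implements Vec {
  private final int size;
  private final double value;
  ConstantVec(int size, double value) { this.size = size; this.value = value; }
  public int size() { return size; }
  public double get(int index) { return value; }
}

// Writes only the properties (size and values), never the class name.
class PropertyCodec {
  static void write(DataOutput out, Vec v) throws IOException {
    out.writeInt(v.size());
    for (int i = 0; i < v.size(); i++) {
      out.writeDouble(v.get(i));
    }
  }

  // Always rebuilds an ArrayVec, whatever class was written.
  static Vec read(DataInput in) throws IOException {
    double[] values = new double[in.readInt()];
    for (int i = 0; i < values.length; i++) {
      values[i] = in.readDouble();
    }
    return new ArrayVec(values);
  }
}
```

Code that relies on getting the same concrete class back, as default Java serialization would provide, does not hold under a format like this one.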


Re: Was the Vector hierarchy ever Serializable?

Posted by Anand Avati <av...@gluster.org>.
Ivan,

Yes, you will need some extra (trivial) plumbing, but the meat of efficient
serialization and deserialization is in those helper classes.

Thanks



Re: Was the Vector hierarchy ever Serializable?

Posted by Ivan Brusic <iv...@brusic.com>.
Thanks for the quick response. VectorWritable looks like exactly what I
need, but it doesn't extend Vector, so there needs to be work done on my
part for deeper serialization.

Cheers,

Ivan
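
The extra plumbing could look roughly like the following minimal sketch. It assumes a hypothetical holder class (VectorHolder, not part of Mahout) that keeps its payload in a transient field and writes it by hand in writeObject/readObject. A plain double[] stands in for the Mahout Vector so the example runs without Mahout on the classpath; in real code the two stand-in helpers would presumably be replaced by whatever read/write calls VectorWritable offers in your Mahout version (worth checking against its Javadoc).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical holder class; not part of Mahout.
class VectorHolder implements Serializable {
  private static final long serialVersionUID = 1L;

  // transient: default Java serialization skips this field, so the
  // payload type itself does not have to be Serializable.
  private transient double[] values;

  VectorHolder(double[] values) { this.values = values; }

  double[] values() { return values; }

  // Stand-in for a Writable-style writer working on a DataOutput.
  static void writeValues(DataOutput out, double[] v) throws IOException {
    out.writeInt(v.length);
    for (double d : v) {
      out.writeDouble(d);
    }
  }

  // Stand-in for the matching reader working on a DataInput.
  static double[] readValues(DataInput in) throws IOException {
    double[] v = new double[in.readInt()];
    for (int i = 0; i < v.length; i++) {
      v[i] = in.readDouble();
    }
    return v;
  }

  // ObjectOutputStream is a DataOutput, so it can be handed to the writer.
  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    writeValues(out, values);
  }

  // ObjectInputStream is a DataInput, so it can be handed to the reader.
  private void readObject(ObjectInputStream in)
      throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    values = readValues(in);
  }

  // Serialize to a byte array and read the object back.
  static VectorHolder roundTrip(VectorHolder h)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(h);
    }
    try (ObjectInputStream in = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()))) {
      return (VectorHolder) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    VectorHolder copy = roundTrip(new VectorHolder(new double[] {1.0, 2.5, -3.0}));
    System.out.println(Arrays.toString(copy.values()));
  }
}
```

With this shape the enclosing object stays Serializable and the Mahout classes do not need to be modified; only the holder knows how the payload is written.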



Re: Was the Vector hierarchy ever Serializable?

Posted by Anand Avati <av...@gluster.org>.
I don't think Vector and Matrix were ever declared Serializable. Please
look at the VectorWritable and MatrixWritable classes in the mrlegacy
module. Both the Spark bindings and the H2O bindings use these *Writable
classes for shipping matrices and vectors over the wire. You can even look at
https://github.com/avati/mahout/blob/MAHOUT-1500/h2o/src/main/java/org/apache/mahout/h2obindings/drm/H2OBCast.java
as a reference for how to do it.

Thanks

