Posted to dev@mahout.apache.org by Arun Avanathan <ar...@gmail.com> on 2013/04/01 02:04:14 UTC

ModelSerializer.writeToJson()

All,

I'm new to Mahout, so please bear with my dumb question.

In 0.4 we had a writeToJson() method, but I don't see it in version 0.7. Is
there a way to serialize models in JSON format? The idea is to store them in
a database and rebuild the models when the application server comes up.

Thanks
Arun

Re: ModelSerializer.writeToJson()

Posted by Ted Dunning <te...@gmail.com>.
The writeToJson() code had serious problems with heap size, so it was
deleted rather than fixed. There is a binary format instead, which is much
more efficient.

Some models are quite large, however, and you may not want to store them in
a database.
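
To illustrate why a binary layout stays compact, here is a minimal sketch that
uses only the JDK. It is not Mahout's actual on-disk format: the class
BinaryMatrixCodec, its method names, and the "rows, columns, values" layout are
all assumptions made for illustration, with a dense weight matrix standing in
for a real model.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper: a dense, rectangular weight matrix stands in for a model.
final class BinaryMatrixCodec {

    // Layout (assumed): row count, column count, then each value as a raw 8-byte double.
    static byte[] write(double[][] matrix) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(matrix.length);
            out.writeInt(matrix.length == 0 ? 0 : matrix[0].length);
            for (double[] row : matrix) {
                for (double value : row) {
                    out.writeDouble(value);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads back exactly what write() produced.
    static double[][] read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int rows = in.readInt();
            int cols = in.readInt();
            double[][] matrix = new double[rows][cols];
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    matrix[r][c] = in.readDouble();
                }
            }
            return matrix;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[][] weights = {{0.1, -2.5e-7}, {Math.PI, 1.0 / 3.0}};
        byte[] blob = write(weights);
        System.out.println("bytes written: " + blob.length);
        System.out.println("round trip exact: " + Arrays.deepEquals(weights, read(blob)));
    }
}
```

Each value costs a fixed 8 bytes and is streamed straight through, so nothing
like a full text document has to be built up in memory first.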



Re: ModelSerializer.writeToJson()

Posted by Arun Avanathan <ar...@gmail.com>.
Suneel & Ted, thanks for responding so quickly to my questions. I'll stick
to the Base64 solution for now.

Thanks again


On Mon, Apr 1, 2013 at 3:59 AM, Ted Dunning <te...@gmail.com> wrote:

> The real problem is that the model consists mostly of a matrix full of
> double precision numbers.  Encoding those as text is always problematic.
>  Your solution of using base-64 is a fine one if you really need text.

Re: ModelSerializer.writeToJson()

Posted by Ted Dunning <te...@gmail.com>.
The real problem is that the model consists mostly of a matrix full of
double-precision numbers. Encoding those as text is always problematic.
Your solution of using Base64 is a fine one if you really need text.
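
As a rough illustration of that trade-off, the following sketch packs doubles
as raw bytes, wraps them in Base64, and compares the result with plain decimal
text. It uses only the JDK (java.util.Base64 needs Java 8 or later); the class
DoubleTextEncoding and its method names are hypothetical and not part of
Mahout.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Base64;
import java.util.Random;

// Hypothetical helper comparing two text encodings of the same doubles.
final class DoubleTextEncoding {

    // Packs each value as a raw 8-byte double, then wraps the bytes in Base64 text.
    static String toBase64(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES);
        for (double v : values) {
            buf.putDouble(v);
        }
        return Base64.getEncoder().encodeToString(buf.array());
    }

    // Reverses toBase64(): decode the text, then read the doubles back bit for bit.
    static double[] fromBase64(String text) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getDecoder().decode(text));
        double[] values = new double[buf.remaining() / Double.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getDouble();
        }
        return values;
    }

    // Comma-separated decimal text, roughly what the body of a JSON array would hold.
    static String toDecimalText(double[] values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] weights = new double[1000];
        Random rnd = new Random(42);
        for (int i = 0; i < weights.length; i++) {
            weights[i] = rnd.nextGaussian();
        }
        String base64 = toBase64(weights);
        System.out.println("raw bytes:     " + weights.length * Double.BYTES);
        System.out.println("Base64 chars:  " + base64.length());
        System.out.println("decimal chars: " + toDecimalText(weights).length());
        System.out.println("exact round trip: " + Arrays.equals(weights, fromBase64(base64)));
    }
}
```

Base64 adds about one third on top of the raw bytes and needs no number
parsing when the model is read back, whereas decimal text grows with the number
of digits each value needs and has to be parsed value by value.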

On Mon, Apr 1, 2013 at 4:31 AM, Arun Avanathan <ar...@gmail.com> wrote:

> If we had a String-based serialization feature, we could have
> simplified the solution. I guess serialization is not that straightforward
> for complex math models, so it is not implemented.
>

Re: ModelSerializer.writeToJson()

Posted by Arun Avanathan <ar...@gmail.com>.
:) Actually, we are thinking along these lines. We have a few other columns
to store metadata around this BLOB/CLOB, such as id, foreign key, etc.

Now, if we are doing ETL, we need a CSV or similar file containing all
these columns. We cannot embed a BLOB or other binary object in this text
file, so I was looking for a String-based serialization solution.

So, until we find a cleaner solution, we are encoding the Java-serialized
binary object with Base64 and uploading it via ETL along with the other
text fields.
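
For readers following along, the approach described above could look roughly
like this. It is a sketch under assumptions, not our actual code: ToyModel is a
hypothetical stand-in for the real model class, the id / foreign key / payload
column layout is made up, and only JDK classes are used (java.util.Base64 needs
Java 8 or later).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper: Java serialization wrapped in Base64 for a text-only ETL file.
final class Base64ModelRow {

    // Stand-in for a trained model; any Serializable object is handled the same way.
    static final class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights;

        ToyModel(double[] weights) {
            this.weights = weights;
        }
    }

    // Java-serializes the object and returns the bytes as a single Base64 string.
    static String encode(Serializable model) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverses encode(). Only use this on data from a trusted source.
    static Object decode(String text) {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // One CSV line with the assumed columns: id, foreign key, Base64 payload.
    static String toCsvRow(long id, long foreignKey, Serializable model) {
        return id + "," + foreignKey + "," + encode(model);
    }

    public static void main(String[] args) {
        String row = toCsvRow(7L, 42L, new ToyModel(new double[] {0.5, -1.25}));
        System.out.println(row);
        ToyModel restored = (ToyModel) decode(row.split(",")[2]);
        System.out.println(Arrays.toString(restored.weights));
    }
}
```

The basic Base64 alphabet contains no commas, quotes, or line breaks, so the
payload can sit in a CSV column without extra escaping.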

If we had a String-based serialization feature, we could have
simplified the solution. I guess serialization is not that straightforward
for complex math models, so it is not implemented.

I am sure there is a better way. Any thoughts or ideas to simplify our
approach are greatly appreciated.

Thanks


On Sun, Mar 31, 2013 at 7:18 PM, Suneel Marthi <su...@yahoo.com> wrote:

> ... How about persisting the model itself in an Oracle column as a
> BLOB and having the app server read the model from the database? That way,
> if the model gets updated, all the applications read the same model.

Re: ModelSerializer.writeToJson()

Posted by Suneel Marthi <su...@yahoo.com>.
... How about persisting the model itself in an Oracle column as a BLOB and having the app server read
the model from the database? That way, if the model gets updated, all the applications read the same model.





________________________________
 From: Arun Avanathan <ar...@gmail.com>
To: dev@mahout.apache.org; Suneel Marthi <su...@yahoo.com> 
Sent: Sunday, March 31, 2013 9:47 PM
Subject: Re: ModelSerializer.writeToJson()
 
The idea is to ETL the serialized models on a scheduled basis from one
application to another. The war/jar approach is another way of doing it, but
it requires shipping jars plus the metadata around them via ETL.


We use Oracle DB for this, and uploading these to the DB via SQL*Loader is
getting a bit messy. I am looking for simpler options :)

Thanks



Re: ModelSerializer.writeToJson()

Posted by Arun Avanathan <ar...@gmail.com>.
The idea is to ETL the serialized models on a scheduled basis from one
application to another. The war/jar approach is another way of doing it, but
it requires shipping jars plus the metadata around them via ETL.


We use Oracle DB for this, and uploading these to the DB via SQL*Loader is
getting a bit messy. I am looking for simpler options :)

Thanks


On Sun, Mar 31, 2013 at 6:41 PM, Suneel Marthi <su...@yahoo.com> wrote:

> Why would you want to do that? Isn't it easier to just package the model
> into your application war/jar and read it as a ResourceStream, as
> opposed to what you are proposing?

Re: ModelSerializer.writeToJson()

Posted by Suneel Marthi <su...@yahoo.com>.
Why would you want to do that? Isn't it easier to just package the model into your application war/jar and read it as a ResourceStream, as opposed to what you are proposing?
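
A minimal sketch of that packaging approach, using only the JDK: the class
BundledModelLoader and the resource path models/model.bin are hypothetical, and
the bytes returned would still have to be handed to whatever deserializer
matches the format the model was written in.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: loads a model file that was packaged inside the war/jar.
final class BundledModelLoader {

    // Returns the resource's bytes, or null when no such resource is on the classpath.
    static byte[] load(String resourceName) {
        ClassLoader loader = BundledModelLoader.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            return in == null ? null : readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Drains the stream into a byte array in fixed-size chunks.
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // "models/model.bin" is an assumed path inside the application archive.
        byte[] model = load("models/model.bin");
        System.out.println(model == null
                ? "model resource not found on the classpath"
                : "loaded " + model.length + " bytes");
    }
}
```

With this approach the model changes only when the application is redeployed,
which is the trade-off against reading it from a shared database.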


