Posted to dev@mahout.apache.org by Arun Avanathan <ar...@gmail.com> on 2013/04/01 02:04:14 UTC
ModelSerializer.writeToJson()
All,
I'm new to Mahout, so please bear with my dumb question.
In 0.4 we had a writeToJson() method, but I don't see it in version 0.7. Is
there a way to serialize models in JSON format? The idea is to store them in
a database & rebuild the models when the application server comes up.
Thanks
Arun
Re: ModelSerializer.writeToJson()
Posted by Ted Dunning <te...@gmail.com>.
The write-to-JSON code had serious problems with heap size and was therefore
deleted rather than fixed. There is a binary format instead, which is much
more efficient.
Some models are quite large, however, and you may not want to store them in
a database.
Re: ModelSerializer.writeToJson()
Posted by Arun Avanathan <ar...@gmail.com>.
Suneel & Ted, thanks for responding so quickly to my questions. I'll stick
to the Base64 solution for now.
Thanks again
On Mon, Apr 1, 2013 at 3:59 AM, Ted Dunning <te...@gmail.com> wrote:
> The real problem is that the model consists mostly of a matrix full of
> double precision numbers. Encoding those as text is always problematic.
> Your solution of using base-64 is a fine one if you really need text.
Re: ModelSerializer.writeToJson()
Posted by Ted Dunning <te...@gmail.com>.
The real problem is that the model consists mostly of a matrix full of
double precision numbers. Encoding those as text is always problematic.
Your solution of using base-64 is a fine one if you really need text.
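
One way to see the point about text encodings is to compare sizes. The sketch
below is not Mahout code: the class EncodingSizeDemo and its helpers are
hypothetical, and a random 100 x 50 matrix stands in for a model's weights. It
packs the matrix as raw 8-byte doubles, Base64-encodes those bytes, and
compares both with a JSON-style decimal rendering of the same numbers.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.Random;

class EncodingSizeDemo {
    /** Packs the matrix as raw IEEE 754 doubles, 8 bytes per entry, row by row. */
    static byte[] toBinary(double[][] m) {
        int cells = 0;
        for (double[] row : m) {
            cells += row.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(cells * Double.BYTES);
        for (double[] row : m) {
            for (double v : row) {
                buf.putDouble(v);
            }
        }
        return buf.array();
    }

    /** Renders the same matrix as JSON-style text, e.g. [[0.1,0.2],[0.3,0.4]]. */
    static String toText(double[][] m) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < m.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('[');
            for (int j = 0; j < m[i].length; j++) {
                if (j > 0) {
                    sb.append(',');
                }
                sb.append(Double.toString(m[i][j]));
            }
            sb.append(']');
        }
        return sb.append(']').toString();
    }

    /** Hypothetical stand-in for model weights: uniform random values in [0, 1). */
    static double[][] randomMatrix(int rows, int cols, long seed) {
        Random rng = new Random(seed);
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            for (int j = 0; j < cols; j++) {
                row[j] = rng.nextDouble();
            }
        }
        return m;
    }

    public static void main(String[] args) {
        double[][] m = randomMatrix(100, 50, 42L);
        byte[] binary = toBinary(m);
        String base64 = Base64.getEncoder().encodeToString(binary);
        String text = toText(m);
        System.out.println("binary bytes: " + binary.length);
        System.out.println("base64 chars: " + base64.length());
        System.out.println("text chars:   " + text.length());
    }
}
```

For 5,000 doubles the binary form is 40,000 bytes and Base64 adds one third on
top of that (53,336 characters). The decimal text typically comes out
noticeably larger, because each double can need up to 17 significant digits
plus separators, and it has to be parsed number by number when read back.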
On Mon, Apr 1, 2013 at 4:31 AM, Arun Avanathan <ar...@gmail.com> wrote:
> If we had a String-based serialization feature, we could have simplified
> the solution. I guess serialization is not that straightforward for complex
> math models, which is why it is not implemented.
Re: ModelSerializer.writeToJson()
Posted by Arun Avanathan <ar...@gmail.com>.
:) Actually, we are thinking along these lines. We have a few other columns
that store metadata around this BLOB/CLOB, such as an id, a foreign key, etc.
Now, if we are doing ETL, we need a CSV or some other file containing all
these columns. We cannot embed a BLOB / binary object in this text file, so
I was looking for a String-based serialization solution.
So until we find a cleaner solution, we are going with encoding the
Java-serialized binary object with Base64 and uploading it via ETL along with
the other text fields.
If we had a String-based serialization feature, we could have simplified the
solution. I guess serialization is not that straightforward for complex math
models, which is why it is not implemented.
I am sure there is a better way. Any thoughts or ideas to simplify our
approach are greatly appreciated.
Thanks
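
A minimal sketch of the interim approach described above, using only the JDK.
Everything named here is an assumption for illustration: ToyModel is a
hypothetical stand-in for a real model class (it is not part of Mahout), and
the three-column row layout (id, foreign key, payload) is made up. The point
is that the basic Base64 alphabet contains no commas or line breaks, so the
payload can sit in a CSV column without quoting.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

class Base64ModelRow {
    /** Hypothetical stand-in for a trained model; not a Mahout class. */
    static final class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights;

        ToyModel(double[] weights) {
            this.weights = weights;
        }
    }

    /** Java-serializes the object and returns the bytes as single-line Base64. */
    static String encode(Serializable model) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Reverses encode(): Base64 text back to bytes, then back to an object. */
    static Object decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds one CSV line: numeric metadata columns followed by the payload. */
    static String toCsvRow(long id, long foreignKey, Serializable model) {
        return id + "," + foreignKey + "," + encode(model);
    }

    public static void main(String[] args) {
        ToyModel model = new ToyModel(new double[] {0.25, -1.5, 3.0});
        String row = toCsvRow(7L, 42L, model);

        // The loading side splits the row and rebuilds the object from column 3.
        String[] cols = row.split(",", 3);
        ToyModel restored = (ToyModel) decode(cols[2]);
        System.out.println("payload chars: " + cols[2].length());
        System.out.println("restored weights: " + Arrays.toString(restored.weights));
    }
}
```

Two caveats with this design: Java deserialization should only be run on
payloads from a trusted source, and the reading side needs the same class
definition (with a compatible serialVersionUID) as the writing side.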
On Sun, Mar 31, 2013 at 7:18 PM, Suneel Marthi <su...@yahoo.com> wrote:
> How about persisting the model itself into an Oracle column as a BLOB and
> having the app server read the model from the database? That way, if the
> model gets updated, all the applications would be reading the same model.
Re: ModelSerializer.writeToJson()
Posted by Suneel Marthi <su...@yahoo.com>.
How about persisting the model itself into an Oracle column as a BLOB and having the app server read
the model from the database? That way, if the model gets updated, all the applications would be reading the same model.
________________________________
From: Arun Avanathan <ar...@gmail.com>
To: dev@mahout.apache.org; Suneel Marthi <su...@yahoo.com>
Sent: Sunday, March 31, 2013 9:47 PM
Subject: Re: ModelSerializer.writeToJson()
The idea is to ETL the serialized models on a scheduled basis from one
application to another. The war/jar approach is another way of doing it, but
it requires shipping the jars plus the metadata around them via ETL.
We use an Oracle DB for this, and uploading these to the DB via SQL*Loader is
getting a bit messy. I am looking for simpler options :)
Thanks
Re: ModelSerializer.writeToJson()
Posted by Arun Avanathan <ar...@gmail.com>.
The idea is to ETL the serialized models on a scheduled basis from one
application to another. The war/jar approach is another way of doing it, but
it requires shipping the jars plus the metadata around them via ETL.
We use an Oracle DB for this, and uploading these to the DB via SQL*Loader is
getting a bit messy. I am looking for simpler options :)
Thanks
On Sun, Mar 31, 2013 at 6:41 PM, Suneel Marthi <su...@yahoo.com> wrote:
> Why would you want to do that? Isn't it easier to just package the model
> into your application war/jar and read it as a ResourceStream, as
> opposed to what you are proposing?
Re: ModelSerializer.writeToJson()
Posted by Suneel Marthi <su...@yahoo.com>.
Why would you want to do that? Isn't it easier to just package the model into your application war/jar and read it as a ResourceStream, as opposed to what you are proposing?
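
A rough sketch of the packaging approach suggested above, assuming the
serialized model is bundled inside the war/jar as a classpath resource. The
class name BundledModelLoader and the resource path models/example.model are
hypothetical. The bytes it returns would then be handed to whatever binary
reader the model class provides.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BundledModelLoader {
    /** Reads a classpath resource fully into memory, or fails if it is absent. */
    static byte[] readResource(String name) {
        ClassLoader loader = BundledModelLoader.class.getClassLoader();
        // getResourceAsStream returns null (it does not throw) for a missing resource.
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalArgumentException("resource not found on classpath: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default path; pass a real resource name as the first argument.
        String name = args.length > 0 ? args[0] : "models/example.model";
        try {
            byte[] bytes = readResource(name);
            System.out.println("read " + bytes.length + " bytes from " + name);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The trade-off with this design is that the model is fixed at build time, so
picking up a retrained model means shipping a new war/jar.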