Posted to user@mahout.apache.org by Chris Schilling <ch...@cellixis.com> on 2011/03/02 01:16:29 UTC

SVD Factorization persistence

Hello,

I am trying to use the SVDRecommender. I was hoping to persist the matrix factorization to a file on disk, but the arrays encapsulated in the Factorization class are all private. Is there an existing tool for persisting this information? I want to run various tests, and recomputing the SVD each time will become painful.

Thanks!
Chris


Re: SVD Factorization persistence

Posted by Dan Brickley <da...@danbri.org>.
On 2 March 2011 08:47, Sean Owen <sr...@gmail.com> wrote:
> You could suggest a patch that would refactor this all such that you
> could read the intermediate result, and/or restore the intermediate
> result later. I'm sure there are easy hacky ways of doing it but I bet
> there is an elegant design that would get this done too.

To what extent would it make sense for any new API here to also be
shared with the Hadoop-backed Lanczos solver?

cheers,

Dan

Re: SVD Factorization persistence

Posted by Chris Schilling <ch...@cellixis.com>.
On Mar 2, 2011, at 7:22 AM, Sebastian Schelter wrote:

> On 02.03.2011 16:13, Chris Schilling wrote:
>> Sure Sean,
>> 
>> Let me work on a prototype in my own code over the next few days.  I'll have a better suggestion soon.
>> 
>> @Dan, I haven't really looked at the distributed implementation of the SVD, but it should be possible to integrate the reading of the distributed calculation into the online SVD recommender.
>> 
>> Some other functionality I thought would be nice for the SVD:  item-item/user-user similarities based on the latent feature vectors.
> 
> That would be a nice thing to have, but AFAIK the resulting vectors are dense, so that would be very computationally intensive.

Hey Sebastian,

I was thinking more of finding the top-k similar items for a given item ID, so I think it would be about as intensive as calculating the recommendations, and it serves a different purpose.
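
To make that concrete, here is a minimal sketch in plain Java. It assumes the item feature vectors are available as a dense double[][] with one row per item; LatentSimilarity and its methods are hypothetical names, not Mahout API, and cosine similarity is just one reasonable choice of measure.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper (not part of Mahout): top-k similar items computed
// from dense latent feature vectors, one row per item.
final class LatentSimilarity {

  // Cosine similarity of two equal-length vectors; 0 if either is all zeros.
  static double cosine(double[] a, double[] b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    if (normA == 0.0 || normB == 0.0) {
      return 0.0;
    }
    return dot / Math.sqrt(normA * normB);
  }

  // Returns the row indices of the k items most similar to `item`,
  // most similar first. The query item itself is excluded.
  static List<Integer> topK(double[][] itemFeatures, int item, int k) {
    if (k <= 0) {
      return new ArrayList<>();
    }
    double[] query = itemFeatures[item];
    double[] sims = new double[itemFeatures.length];
    // Min-heap ordered by similarity: the weakest of the current best k is on top.
    Comparator<Integer> bySimilarity = Comparator.comparingDouble(i -> sims[i]);
    PriorityQueue<Integer> heap = new PriorityQueue<>(k, bySimilarity);
    for (int other = 0; other < itemFeatures.length; other++) {
      if (other == item) {
        continue;
      }
      sims[other] = cosine(query, itemFeatures[other]);
      heap.add(other);
      if (heap.size() > k) {
        heap.poll(); // drop the current weakest candidate
      }
    }
    List<Integer> result = new ArrayList<>(heap.size());
    while (!heap.isEmpty()) {
      result.add(heap.poll());
    }
    Collections.reverse(result); // heap yields weakest first
    return result;
  }
}
```

A single query is one pass over the item rows, roughly O(n * f) for n items and f features, which is the same order of work as scoring every item for one user.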

WDYT?
Chris





Re: SVD Factorization persistence

Posted by Sebastian Schelter <ss...@apache.org>.
On 02.03.2011 16:13, Chris Schilling wrote:
> Sure Sean,
>
> Let me work on a prototype in my own code over the next few days.  I'll have a better suggestion soon.
>
> @Dan, I haven't really looked at the distributed implementation of the SVD, but it should be possible to integrate the reading of the distributed calculation into the online SVD recommender.
>
> Some other functionality I thought would be nice for the SVD:  item-item/user-user similarities based on the latent feature vectors.

That would be a nice thing to have, but AFAIK the resulting vectors are
dense, so that would be very computationally intensive.



Re: SVD Factorization persistence

Posted by Chris Schilling <ch...@cellixis.com>.
Sure Sean,

Let me work on a prototype in my own code over the next few days.  I'll have a better suggestion soon.

@Dan, I haven't really looked at the distributed implementation of the SVD, but it should be possible to read the output of the distributed calculation into the online SVD recommender.

Some other functionality I thought would be nice for the SVD:  item-item/user-user similarities based on the latent feature vectors.

Anyway, I'll get back to you once I have a better idea of what I want to accomplish.

Thanks
Chris

On Mar 2, 2011, at 12:47 AM, Sean Owen wrote:

> You could suggest a patch that would refactor this all such that you
> could read the intermediate result, and/or restore the intermediate
> result later. I'm sure there are easy hacky ways of doing it but I bet
> there is an elegant design that would get this done too.


Re: SVD Factorization persistence

Posted by Sean Owen <sr...@gmail.com>.
You could suggest a patch that would refactor this all such that you
could read the intermediate result, and/or restore the intermediate
result later. I'm sure there are easy hacky ways of doing it but I bet
there is an elegant design that would get this done too.
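
As a rough illustration of the "easy hacky" end of that spectrum, the sketch below writes two dense feature matrices to a binary file and reads them back. It is plain Java and assumes the factorization boils down to a user-feature and an item-feature double[][] with rectangular rows; FactorStore is a hypothetical helper, not an existing Mahout class, and a real patch would also need to persist the mapping from user and item IDs to matrix rows.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Mahout): saves and restores two dense,
// rectangular feature matrices so the SVD need not be recomputed per test run.
final class FactorStore {

  // Writes userFeatures followed by itemFeatures to `file`.
  static void save(Path file, double[][] userFeatures, double[][] itemFeatures)
      throws IOException {
    try (DataOutputStream out = new DataOutputStream(
        new BufferedOutputStream(Files.newOutputStream(file)))) {
      writeMatrix(out, userFeatures);
      writeMatrix(out, itemFeatures);
    }
  }

  // Returns {userFeatures, itemFeatures}, in the order they were written.
  static double[][][] load(Path file) throws IOException {
    try (DataInputStream in = new DataInputStream(
        new BufferedInputStream(Files.newInputStream(file)))) {
      double[][] userFeatures = readMatrix(in);
      double[][] itemFeatures = readMatrix(in);
      return new double[][][] {userFeatures, itemFeatures};
    }
  }

  // Layout: row count, column count, then values in row-major order.
  private static void writeMatrix(DataOutputStream out, double[][] m) throws IOException {
    out.writeInt(m.length);
    out.writeInt(m.length == 0 ? 0 : m[0].length);
    for (double[] row : m) {
      for (double value : row) {
        out.writeDouble(value);
      }
    }
  }

  private static double[][] readMatrix(DataInputStream in) throws IOException {
    int rows = in.readInt();
    int cols = in.readInt();
    double[][] m = new double[rows][cols];
    for (int r = 0; r < rows; r++) {
      for (int c = 0; c < cols; c++) {
        m[r][c] = in.readDouble();
      }
    }
    return m;
  }
}
```

Calling FactorStore.save(path, userFeatures, itemFeatures) once after training and FactorStore.load(path) at the start of each test run would skip the recomputation; a cleaner design would more likely put read and write methods on the factorization type itself.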
