Posted to dev@mahout.apache.org by Gokhan Capan <gk...@gmail.com> on 2012/10/12 09:04:17 UTC

Re: SGD Based Recommender Contribution Proposal

On Sun, Sep 9, 2012 at 10:27 PM, Ted Dunning <te...@gmail.com> wrote:

> Great.
>
> If the update has a huge impact on existing code, can you break it into
> manageable pieces?
>
> If it is just an addition, having a big blob of stuff is probably fine.


The code can now be integrated into Mahout, and works with Mahout's existing
Recommender interface. It does not modify any existing code, except for a
couple of additional lines in driver.class.props, which define a few
command-line utilities I find useful when experimenting with a recommender.

By the way, I found a few minor bugs and updated the patch.

Have you had a chance to look at it?


Secondly, I would like to bump the thread to trigger a discussion on this.
Sean raised some concerns about the patch (available as a comment on the
JIRA page).

Quoting Sean's comment:
"I imagine this is all great work. As I commented off-list, it is a big
enough and even different enough beast that it feels like it should be a
separate project. The Mahout code base is already uneven and sprawling and
I think this would exacerbate that – and not generate much "synergy" worth
the effort of integration."

I understand all of these, and want to provide a general response to
possibly clarify some of the points Sean made.

Basically, it adds an online version of existing Mahout recommendation
capabilities. Learning an MF-based recommender with Alternating Least Squares
already exists in Mahout, and this is the SGD-based version. The
different-targets approach is just a set of wrappers on those linear models
(the same idea as in Generalized Linear Models). Adding side info is
optional, and may be beneficial when there is a cold-start issue.
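
As a rough illustration of what "the SGD-based version" of matrix
factorization means, here is a minimal sketch. It is not the code from the
patch: the class name OnlineMfSketch, its parameters, and the choice of
squared-error loss with L2 regularization are all assumptions made for this
example. Each incoming rating triggers one small gradient step on the
factors of that user and that item, which is what makes the learner online.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch of online matrix factorization; not the patch's classes. */
class OnlineMfSketch {
    private final int rank;             // number of latent factors (assumed)
    private final double learningRate;  // SGD step size (assumed)
    private final double lambda;        // L2 regularization weight (assumed)
    private final Random random;
    private final Map<Long, double[]> userFactors = new HashMap<>();
    private final Map<Long, double[]> itemFactors = new HashMap<>();

    OnlineMfSketch(int rank, double learningRate, double lambda, long seed) {
        this.rank = rank;
        this.learningRate = learningRate;
        this.lambda = lambda;
        this.random = new Random(seed);
    }

    /** Returns the factor vector for an id, creating a small random one if unseen. */
    private double[] factorsFor(Map<Long, double[]> table, long id) {
        return table.computeIfAbsent(id, k -> {
            double[] v = new double[rank];
            for (int f = 0; f < rank; f++) {
                v[f] = 0.1 * random.nextGaussian();
            }
            return v;
        });
    }

    /** Predicted rating: dot product of the user and item factor vectors. */
    double predict(long userId, long itemId) {
        double[] p = factorsFor(userFactors, userId);
        double[] q = factorsFor(itemFactors, itemId);
        double dot = 0.0;
        for (int f = 0; f < rank; f++) {
            dot += p[f] * q[f];
        }
        return dot;
    }

    /** One SGD step on the regularized squared error of a single rating. */
    void update(long userId, long itemId, double rating) {
        double[] p = factorsFor(userFactors, userId);
        double[] q = factorsFor(itemFactors, itemId);
        double err = rating - predict(userId, itemId);
        for (int f = 0; f < rank; f++) {
            double pf = p[f];
            double qf = q[f];
            p[f] += learningRate * (err * qf - lambda * pf);
            q[f] += learningRate * (err * pf - lambda * qf);
        }
    }

    public static void main(String[] args) {
        OnlineMfSketch model = new OnlineMfSketch(2, 0.05, 0.01, 42L);
        // Replay a tiny, made-up rating stream several times.
        for (int pass = 0; pass < 200; pass++) {
            model.update(1L, 10L, 5.0);
            model.update(1L, 20L, 1.0);
            model.update(2L, 10L, 4.0);
        }
        System.out.println("predicted rating for (1, 10): " + model.predict(1L, 10L));
    }
}
```

In contrast to ALS, which re-solves for all user or item factors in
alternating batch passes, a learner of this shape can absorb a new rating
with a single call to update.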

Additionally, the OnlineFactorizationRecommender extends the
AbstractRecommender, and the FactorizationAwareDataModel is a Mahout
DataModel composed with a base DataModel that is capable of adding new
ratings.
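
To illustrate the composition idea in general terms, here is a small
self-contained sketch. It deliberately does not use Mahout's DataModel or
the classes from the patch; RatingStore, RatingListener, and
LearningAwareRatingStore are hypothetical names invented for this example.
The wrapper delegates storage to a base store and forwards every new rating
to a listener, which is where an online learner could be updated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a data model composed with a base model. */
class CompositionSketch {

    /** Stand-in for the small part of a data model used here (assumed). */
    interface RatingStore {
        void setRating(long userId, long itemId, float value);
        Float getRating(long userId, long itemId);
    }

    /** Callback for a component that learns from each new rating (assumed). */
    interface RatingListener {
        void onRating(long userId, long itemId, float value);
    }

    /** Simple in-memory base store. */
    static class InMemoryRatingStore implements RatingStore {
        private final Map<String, Float> ratings = new HashMap<>();

        @Override
        public void setRating(long userId, long itemId, float value) {
            ratings.put(userId + ":" + itemId, value);
        }

        @Override
        public Float getRating(long userId, long itemId) {
            return ratings.get(userId + ":" + itemId);
        }
    }

    /** Wraps a base store: stores the rating, then notifies the listener. */
    static class LearningAwareRatingStore implements RatingStore {
        private final RatingStore base;
        private final RatingListener listener;

        LearningAwareRatingStore(RatingStore base, RatingListener listener) {
            this.base = base;
            this.listener = listener;
        }

        @Override
        public void setRating(long userId, long itemId, float value) {
            base.setRating(userId, itemId, value);
            listener.onRating(userId, itemId, value);
        }

        @Override
        public Float getRating(long userId, long itemId) {
            return base.getRating(userId, itemId);
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        RatingStore store = new LearningAwareRatingStore(
                new InMemoryRatingStore(),
                (u, i, v) -> seen.add(u + ":" + i + "=" + v));
        store.setRating(1L, 10L, 4.5f);
        System.out.println(store.getRating(1L, 10L) + " " + seen);
    }
}
```

Because the wrapper implements the same interface as the store it wraps,
code that only reads ratings does not need to know that learning happens on
write.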

Besides all this, I remember the initiative Ted started following Menon
and Elkan's 'Dyadic Prediction Using a Latent Feature Log-Linear Model'
paper. At first I intended to improve Ted's initial implementation, then I
started a separate implementation to keep the code integrable with Taste
from the very beginning. What I mean is, those approaches are really similar.

The code is already integrated, and could be one of the many recommender
options available to a user. Finally, I volunteer to keep the code integrated
and working, improve it based on suggestions, and provide documentation on
usage and details.

The reason I am offering to contribute to Mahout rather than starting a
separate project is that I am familiar with the Mahout library, the code
already depends on Mahout, and the goal for the project is to be used by
people. Mahout already attracts plenty of users and developers, which
means the code would be used by more people, and with reviews it may be
fixed and improved faster.

Regards

>
> On Sun, Sep 9, 2012 at 7:01 AM, Gokhan Capan <gk...@gmail.com> wrote:
>
> > On Fri, Sep 7, 2012 at 12:48 AM, Ted Dunning <te...@gmail.com>
> > wrote:
> >
> > > This sounds pretty exciting.  Beyond that, it is hard to say much.
> > >
> > > Can you say a bit more about how you would see introducing the code
> > > into Mahout?
> > >
> >
> > Ted, I've forked apache/mahout at github, and I will merge the library
> > into mahout. I believe in a week I will be able to add documentation and
> > mahout jobs for experiments and start submitting patches to JIRA.
> >
>
-- 
Gokhan

Re: SGD Based Recommender Contribution Proposal

Posted by Ted Dunning <te...@gmail.com>.
This actually sounds pretty good, at least in advance of actually looking
at the code.

On Fri, Oct 12, 2012 at 2:15 AM, Sean Owen <sr...@gmail.com> wrote:

> (Bug fixes are good, if you can separate those out, those should not
> be controversial.)
>
> I am still skeptical of new blobs of code, as these have been a big
> problem for the project. If it's clean, follows existing code and
> APIs, and you are willing to actively support it, it may be
> un-harmful.
>
> I am not going to be able to work on Mahout in any significant way any
> more, so would not feel right committing this myself, especially if
> the expectation was that I would be able to help modify and support
> these changes. But, leave it to other committers to take that up. It
> sounds like you have made a lot of effort to weave it in nicely.
>
>
> > Now it is integrable to Mahout, and can work with Mahout's existing
> > Recommender interface. It does not modify any existing code, except a
> > couple of additional lines in driver.class.props, which define a few
> > commandline utilities I find useful while experimenting a recommender.
>

Re: SGD Based Recommender Contribution Proposal

Posted by Sean Owen <sr...@gmail.com>.
(Bug fixes are good, if you can separate those out, those should not
be controversial.)

I am still skeptical of new blobs of code, as these have been a big
problem for the project. If it's clean, follows existing code and
APIs, and you are willing to actively support it, it may be
un-harmful.

I am not going to be able to work on Mahout in any significant way any
more, so would not feel right committing this myself, especially if
the expectation was that I would be able to help modify and support
these changes. But, leave it to other committers to take that up. It
sounds like you have made a lot of effort to weave it in nicely.


> Now it is integrable to Mahout, and can work with Mahout's existing
> Recommender interface. It does not modify any existing code, except a couple
> of additional lines in driver.class.props, which define a few commandline
> utilities I find useful while experimenting a recommender.