Posted to user@mahout.apache.org by ARROYO MANCEBO David <da...@altran.com> on 2015/01/15 14:11:53 UTC

Own recommender

Hi folks,
How can I start building my own recommender system in Apache Mahout with my own algorithm? I need a custom UserSimilarity. Maybe a class like PearsonCorrelationSimilarity that implements UserSimilarity?

Thanks
Regards :)

Re: Own recommender

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hi Juanjo,

> On 21.01.2015, at 11:20, Juanjo Ramos <jj...@gmail.com> wrote:
> 
> Hi Manuel,
> Thanks for the update.
> 
> I'm using Mahout in a simple Java application myself. Following Ted's
> comment a few posts back, I was just concerned about the performance.

So if you have more than around 300,000 preferences for around 10,000 users, it will take a few seconds to generate recommendations for users with a lot of preferences.

Normally what you do is just downsample the preferences of a given user to make it faster.

If I understand LLR correctly, it is a smart way of doing this downsampling.
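
A minimal Java sketch of the per-user downsampling idea, assuming a user's preferences are available as a plain list of item IDs. `PreferenceSampler`, `downsample`, and the cap are hypothetical names chosen for illustration and are not part of Mahout; the sketch simply keeps a uniform random subset when a user has more preferences than the cap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not a Mahout class): caps the number of preferences kept per user. */
final class PreferenceSampler {

    private PreferenceSampler() {}

    /**
     * Returns at most maxPrefs item IDs drawn uniformly at random from itemIds.
     * The input list is left unmodified.
     */
    static List<Long> downsample(List<Long> itemIds, int maxPrefs, Random random) {
        if (maxPrefs < 0) {
            throw new IllegalArgumentException("maxPrefs must be non-negative");
        }
        if (itemIds.size() <= maxPrefs) {
            // Nothing to cut: return a copy so callers can modify the result freely.
            return new ArrayList<>(itemIds);
        }
        List<Long> shuffled = new ArrayList<>(itemIds);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, maxPrefs));
    }
}
```

Uniform sampling is only the simplest choice; an LLR-based cut would keep the most informative entries rather than random ones.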

Another option would be a model-based approach such as matrix factorization:

https://mahout.apache.org/users/recommender/matrix-factorization.html

The following class contains an example:
https://github.com/ManuelB/facebook-recommender-demo/blob/master/src/main/java/de/apaxo/bedcon/AnimalFoodRecommender.java
> 
> Is performance the only concern when using Taste or the algorithm's
> implementation has also been improved in the current implementations
> accessible via CLI.
> 
> Thanks.

/Manuel

-- 
Manuel Blechschmidt
Twitter: http://twitter.com/Manuel_B


Re: Own recommender

Posted by Pat Ferrel <pa...@occamsmachete.com>.
spark-itemsimilarity uses SimilarityAnalysis.cooccurrence, which does the majority of the work. Examples of how to use it and handle I/O are in the CLI driver mahout/spark/src/…/drivers/ItemSimilarityDriver.scala. All of this is available as a library. The nature of Spark makes creating map-reduce and other parallel ops almost transparent to the programmer, and therefore the code is easier to use as a lib. Calling Scala from Java is a bit tricky, since Scala renames some things that don’t exist in Java so that Java can access them. Calling Java from Scala is very easy. The two are mixed in the Mahout codebase.

On Jan 21, 2015, at 7:49 AM, Ted Dunning <te...@gmail.com> wrote:

Juanjo,

Using the Taste components, it will be almost impossible to get really high
performance.  For that, using the itemsimilarity program to feed a search
index is the best alternative.

The Scala version of the itemsimilarity program could be called fairly
easily as a library.  The older map-reduce version is not easily used as
a library.


On Wed, Jan 21, 2015 at 2:20 AM, Juanjo Ramos <jj...@gmail.com> wrote:

> Hi Manuel,
> Thanks for the update.
> 
> I'm using Mahout in a simple Java application myself. Following Ted's
> comment a few posts back, I was just concerned about the performance.
> 
> Is performance the only concern when using Taste or the algorithm's
> implementation has also been improved in the current implementations
> accessible via CLI.
> 
> Thanks.
> 
> On Wed, Jan 21, 2015 at 10:14 AM, Manuel Blechschmidt <
> Manuel.Blechschmidt@gmx.de> wrote:
> 
>> Hi Juan,
>> 
>>> On 21.01.2015, at 11:05, Juanjo Ramos <jj...@gmail.com> wrote:
>>> 
>>> Thanks Pat for the resources.
>>> 
>>> Please correct me if I'm wrong but all Mahout's latest tools are
> command
>>> line tools only, is that correct?
>> 
>> Yes, this is kind of correct. All tools are command line based. There was
>> some development for an interactive console similar to R
>> 
>> https://issues.apache.org/jira/browse/MAHOUT-1489
>>> I was wondering if there is a library
>>> with the latest implementation that can be used in a Java or Scala
>> project?
>> 
>> The following project uses Mahout in a full blown simple Java EE
>> application:
>> 
>> https://github.com/ManuelB/facebook-recommender-demo
>>> 
>>> Best.
>> 
>> /Manuel
>> 
>> --
>> Manuel Blechschmidt
>> Twitter: http://twitter.com/Manuel_B
>> 
>> 
> 


Re: Own recommender

Posted by Ted Dunning <te...@gmail.com>.
Juanjo,

Using the Taste components, it will be almost impossible to get really high
performance.  For that, using the itemsimilarity program to feed a search
index is the best alternative.

The Scala version of the itemsimilarity program could be called fairly
easily as a library.  The older map-reduce version is not easily used as
a library.


On Wed, Jan 21, 2015 at 2:20 AM, Juanjo Ramos <jj...@gmail.com> wrote:

> Hi Manuel,
> Thanks for the update.
>
> I'm using Mahout in a simple Java application myself. Following Ted's
> comment a few posts back, I was just concerned about the performance.
>
> Is performance the only concern when using Taste or the algorithm's
> implementation has also been improved in the current implementations
> accessible via CLI.
>
> Thanks.
>
> On Wed, Jan 21, 2015 at 10:14 AM, Manuel Blechschmidt <
> Manuel.Blechschmidt@gmx.de> wrote:
>
> > Hi Juan,
> >
> > > On 21.01.2015, at 11:05, Juanjo Ramos <jj...@gmail.com> wrote:
> > >
> > > Thanks Pat for the resources.
> > >
> > > Please correct me if I'm wrong but all Mahout's latest tools are
> command
> > > line tools only, is that correct?
> >
> > Yes, this is kind of correct. All tools are command line based. There was
> > some development for an interactive console similar to R
> >
> > https://issues.apache.org/jira/browse/MAHOUT-1489
> > > I was wondering if there is a library
> > > with the latest implementation that can be used in a Java or Scala
> > project?
> >
> > The following project uses Mahout in a full blown simple Java EE
> > application:
> >
> > https://github.com/ManuelB/facebook-recommender-demo
> > >
> > > Best.
> >
> > /Manuel
> >
> > --
> > Manuel Blechschmidt
> > Twitter: http://twitter.com/Manuel_B
> >
> >
>

Re: Own recommender

Posted by Juanjo Ramos <jj...@gmail.com>.
Hi Manuel,
Thanks for the update.

I'm using Mahout in a simple Java application myself. Following Ted's
comment a few posts back, I was just concerned about the performance.

Is performance the only concern when using Taste, or have the algorithm
implementations also been improved in the current versions accessible
via the CLI?

Thanks.

On Wed, Jan 21, 2015 at 10:14 AM, Manuel Blechschmidt <
Manuel.Blechschmidt@gmx.de> wrote:

> Hi Juan,
>
> > On 21.01.2015, at 11:05, Juanjo Ramos <jj...@gmail.com> wrote:
> >
> > Thanks Pat for the resources.
> >
> > Please correct me if I'm wrong but all Mahout's latest tools are command
> > line tools only, is that correct?
>
> Yes, this is kind of correct. All tools are command line based. There was
> some development for an interactive console similar to R
>
> https://issues.apache.org/jira/browse/MAHOUT-1489
> > I was wondering if there is a library
> > with the latest implementation that can be used in a Java or Scala
> project?
>
> The following project uses Mahout in a full blown simple Java EE
> application:
>
> https://github.com/ManuelB/facebook-recommender-demo
> >
> > Best.
>
> /Manuel
>
> --
> Manuel Blechschmidt
> Twitter: http://twitter.com/Manuel_B
>
>

Re: Own recommender

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hi Juan,

> On 21.01.2015, at 11:05, Juanjo Ramos <jj...@gmail.com> wrote:
> 
> Thanks Pat for the resources.
> 
> Please correct me if I'm wrong but all Mahout's latest tools are command
> line tools only, is that correct?

Yes, this is kind of correct. All tools are command-line based. There was some development of an interactive console similar to R:

https://issues.apache.org/jira/browse/MAHOUT-1489
> I was wondering if there is a library
> with the latest implementation that can be used in a Java or Scala project?

The following project uses Mahout in a complete but simple Java EE application:

https://github.com/ManuelB/facebook-recommender-demo
> 
> Best.

/Manuel

-- 
Manuel Blechschmidt
Twitter: http://twitter.com/Manuel_B


Re: Own recommender

Posted by Juanjo Ramos <jj...@gmail.com>.
Thanks Pat for the resources.

Please correct me if I'm wrong, but all of Mahout's latest tools are
command-line tools only, is that correct? I was wondering whether there is
a library with the latest implementation that can be used in a Java or
Scala project.

Best.

On Mon, Jan 19, 2015 at 9:51 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> I guess I’ll put a page on the mahout site. For now some references:
>
> small free book here, which talks about the general idea:
> https://www.mapr.com/practical-machine-learning
> preso, which talks about mixing actions or other indicators:
> http://occamsmachete.com/ml/2014/10/07/creating-a-unified-recommender-with-mahout-and-a-search-engine/
> two blog posts:
> http://occamsmachete.com/ml/2014/08/11/mahout-on-spark-whats-new-in-recommenders/
> http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/
> mahout docs:
> http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
>
>
> On Jan 19, 2015, at 3:02 AM, Juanjo Ramos <jj...@gmail.com> wrote:
>
> Hi Pat,
> Do you know if there is any tutorial for the Scala recommender code?
> Mahout's site keeps pointing here:
> http://mahout.apache.org/users/recommender/userbased-5-minutes.html
>
> Thanks.
>
> On Sat, Jan 17, 2015 at 4:24 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> > The newest recommender code runs on the new Scala R-like DSL. It is
> > cooccurrence based and supports only LLR. LLR is used to downsample
> > cooccurrences comparing all pairs of items. I’ve done fairly careful
> > offline testing of all the similarity methods of Mahout’s hadoop and
> > in-memory recommenders and LLR was a clear winner.
> >
> > However if you have something new you want to try, look at the Scala
> > SimilarityAnalysis class. For runtime efficiency it first calculates
> > cooccurrences by performing [AA’] then calculating LLR on elements by row
> > and downsampling in one step. You could look at some other similarity
> > method for downsampling there.
> >
> > On Jan 16, 2015, at 12:44 AM, ARROYO MANCEBO David <
> > david.arroyo@altran.com> wrote:
> >
> > Any idea, Ted? :)
> >
> > -----Original Message-----
> > From: Ted Dunning [mailto:ted.dunning@gmail.com]
> > Sent: Thursday, January 15, 2015 20:05
> > To: user@mahout.apache.org
> > Subject: Re: Own recommender
> >
> > The old Taste code is not the state of the art.  User-based recommenders
> > built on that will be slow.
> >
> >
> >
> > On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos <jj...@gmail.com> wrote:
> >
> >> Hi David,
> >> You implement your custom algorithm and create your own class that
> >> implements the UserSimilarity interface.
> >>
> >> When you then instantiate your User-Based recommender, just pass your
> >> custom class for the UserSimilarity parameter.
> >>
> >> Best.
> >>
> >> On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David <
> >> david.arroyo@altran.com> wrote:
> >>
> >>> Hi folks,
> >>> How I can start to build my own recommender system in apache mahout
> >>> with my personal algorithm? I need a custom UserSimilarity. Maybe a
> >>> subclass from UserSimilarity like PearsonCorrelationSimilarity?
> >>>
> >>> Thanks
> >>> Regards :)
> >>>
> >>
> >
> >
>
>

Re: Own recommender

Posted by Pat Ferrel <pa...@occamsmachete.com>.
I guess I’ll put a page on the mahout site. For now some references:

small free book here, which talks about the general idea: https://www.mapr.com/practical-machine-learning
preso, which talks about mixing actions or other indicators: http://occamsmachete.com/ml/2014/10/07/creating-a-unified-recommender-with-mahout-and-a-search-engine/ 
two blog posts: http://occamsmachete.com/ml/2014/08/11/mahout-on-spark-whats-new-in-recommenders/ and http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/
mahout docs: http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html


On Jan 19, 2015, at 3:02 AM, Juanjo Ramos <jj...@gmail.com> wrote:

Hi Pat,
Do you know if there is any tutorial for the Scala recommender code?
Mahout's site keeps pointing here:
http://mahout.apache.org/users/recommender/userbased-5-minutes.html

Thanks.

On Sat, Jan 17, 2015 at 4:24 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> The newest recommender code runs on the new Scala R-like DSL. It is
> cooccurrence based and supports only LLR. LLR is used to downsample
> cooccurrences comparing all pairs of items. I’ve done fairly careful
> offline testing of all the similarity methods of Mahout’s hadoop and
> in-memory recommenders and LLR was a clear winner.
> 
> However if you have something new you want to try, look at the Scala
> SimilarityAnalysis class. For runtime efficiency it first calculates
> cooccurrences by performing [AA’] then calculating LLR on elements by row
> and downsampling in one step. You could look at some other similarity
> method for downsampling there.
> 
> On Jan 16, 2015, at 12:44 AM, ARROYO MANCEBO David <
> david.arroyo@altran.com> wrote:
> 
> Any idea, Ted? :)
> 
> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Thursday, January 15, 2015 20:05
> To: user@mahout.apache.org
> Subject: Re: Own recommender
> 
> The old Taste code is not the state of the art.  User-based recommenders
> built on that will be slow.
> 
> 
> 
> On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos <jj...@gmail.com> wrote:
> 
>> Hi David,
>> You implement your custom algorithm and create your own class that
>> implements the UserSimilarity interface.
>> 
>> When you then instantiate your User-Based recommender, just pass your
>> custom class for the UserSimilarity parameter.
>> 
>> Best.
>> 
>> On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David <
>> david.arroyo@altran.com> wrote:
>> 
>>> Hi folks,
>>> How I can start to build my own recommender system in apache mahout
>>> with my personal algorithm? I need a custom UserSimilarity. Maybe a
>>> subclass from UserSimilarity like PearsonCorrelationSimilarity?
>>> 
>>> Thanks
>>> Regards :)
>>> 
>> 
> 
> 


Re: Own recommender

Posted by Juanjo Ramos <jj...@gmail.com>.
Hi Pat,
Do you know if there is any tutorial for the Scala recommender code?
Mahout's site keeps pointing here:
http://mahout.apache.org/users/recommender/userbased-5-minutes.html

Thanks.

On Sat, Jan 17, 2015 at 4:24 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> The newest recommender code runs on the new Scala R-like DSL. It is
> cooccurrence based and supports only LLR. LLR is used to downsample
> cooccurrences comparing all pairs of items. I’ve done fairly careful
> offline testing of all the similarity methods of Mahout’s hadoop and
> in-memory recommenders and LLR was a clear winner.
>
> However if you have something new you want to try, look at the Scala
> SimilarityAnalysis class. For runtime efficiency it first calculates
> cooccurrences by performing [AA’] then calculating LLR on elements by row
> and downsampling in one step. You could look at some other similarity
> method for downsampling there.
>
> On Jan 16, 2015, at 12:44 AM, ARROYO MANCEBO David <
> david.arroyo@altran.com> wrote:
>
> Any idea, Ted? :)
>
> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Thursday, January 15, 2015 20:05
> To: user@mahout.apache.org
> Subject: Re: Own recommender
>
> The old Taste code is not the state of the art.  User-based recommenders
> built on that will be slow.
>
>
>
> On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos <jj...@gmail.com> wrote:
>
> > Hi David,
> > You implement your custom algorithm and create your own class that
> > implements the UserSimilarity interface.
> >
> > When you then instantiate your User-Based recommender, just pass your
> > custom class for the UserSimilarity parameter.
> >
> > Best.
> >
> > On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David <
> > david.arroyo@altran.com> wrote:
> >
> >> Hi folks,
> >> How I can start to build my own recommender system in apache mahout
> >> with my personal algorithm? I need a custom UserSimilarity. Maybe a
> >> subclass from UserSimilarity like PearsonCorrelationSimilarity?
> >>
> >> Thanks
> >> Regards :)
> >>
> >
>
>

Re: Own recommender

Posted by Pat Ferrel <pa...@occamsmachete.com>.
The newest recommender code runs on the new Scala R-like DSL. It is cooccurrence-based and supports only LLR. LLR is used to downsample cooccurrences when comparing all pairs of items. I’ve done fairly careful offline testing of all the similarity methods of Mahout’s Hadoop and in-memory recommenders, and LLR was a clear winner.

However, if you have something new you want to try, look at the Scala SimilarityAnalysis class. For runtime efficiency it first calculates cooccurrences by performing [A’A], then calculates LLR on the elements row by row and downsamples in one step. You could try some other similarity method for downsampling there.
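
To make the cooccurrence-then-LLR idea concrete, here is a small in-memory Java sketch. It is not the Mahout SimilarityAnalysis code: `LlrCooccurrence`, `llr`, and `indicators` are hypothetical names, the input is assumed to be one set of item IDs per user, and the score is the standard log-likelihood ratio (G²) statistic on a 2x2 contingency table. For each item it counts how often every other item appears in the same user history, scores each pair with LLR, and keeps only the top-scoring items per row.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory sketch: item cooccurrence counts, LLR scoring, per-row top-K cut. */
final class LlrCooccurrence {

    private LlrCooccurrence() {}

    private static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    /** Unnormalized Shannon entropy of a list of counts. */
    private static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    /** G^2 statistic for a 2x2 table: k11 = both, k12 = A only, k21 = B only, k22 = neither. */
    static double llr(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy));
    }

    /** For each item, returns up to topK other items ranked by LLR (ties broken by item ID). */
    static Map<String, List<String>> indicators(List<Set<String>> histories, int topK) {
        Map<String, Long> itemCounts = new HashMap<>();
        Map<String, Map<String, Long>> pairCounts = new HashMap<>();
        for (Set<String> history : histories) {
            for (String a : history) {
                itemCounts.merge(a, 1L, Long::sum);
                for (String b : history) {
                    if (!a.equals(b)) {
                        pairCounts.computeIfAbsent(a, k -> new HashMap<>()).merge(b, 1L, Long::sum);
                    }
                }
            }
        }
        long users = histories.size();
        Map<String, List<String>> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Long>> row : pairCounts.entrySet()) {
            long countA = itemCounts.get(row.getKey());
            Map<String, Double> scores = new HashMap<>();
            for (Map.Entry<String, Long> cell : row.getValue().entrySet()) {
                long k11 = cell.getValue();
                long countB = itemCounts.get(cell.getKey());
                scores.put(cell.getKey(),
                        llr(k11, countA - k11, countB - k11, users - countA - countB + k11));
            }
            List<String> ranked = new ArrayList<>(scores.keySet());
            ranked.sort(Comparator.comparingDouble((String item) -> scores.get(item))
                    .reversed()
                    .thenComparing(Comparator.<String>naturalOrder()));
            result.put(row.getKey(), new ArrayList<>(ranked.subList(0, Math.min(topK, ranked.size()))));
        }
        return result;
    }
}
```

Swapping in a different similarity would mean replacing the `llr` call with another scoring function over the same four counts. Note that the LLR score measures the strength of an association, not its direction.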

On Jan 16, 2015, at 12:44 AM, ARROYO MANCEBO David <da...@altran.com> wrote:

Any idea, Ted? :)

-----Original Message-----
From: Ted Dunning [mailto:ted.dunning@gmail.com]
Sent: Thursday, January 15, 2015 20:05
To: user@mahout.apache.org
Subject: Re: Own recommender

The old Taste code is not the state of the art.  User-based recommenders built on that will be slow.



On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos <jj...@gmail.com> wrote:

> Hi David,
> You implement your custom algorithm and create your own class that 
> implements the UserSimilarity interface.
> 
> When you then instantiate your User-Based recommender, just pass your 
> custom class for the UserSimilarity parameter.
> 
> Best.
> 
> On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David < 
> david.arroyo@altran.com> wrote:
> 
>> Hi folks,
>> How I can start to build my own recommender system in apache mahout 
>> with my personal algorithm? I need a custom UserSimilarity. Maybe a 
>> subclass from UserSimilarity like PearsonCorrelationSimilarity?
>> 
>> Thanks
>> Regards :)
>> 
> 


RE: Own recommender

Posted by ARROYO MANCEBO David <da...@altran.com>.
Any idea, Ted? :)

-----Original Message-----
From: Ted Dunning [mailto:ted.dunning@gmail.com]
Sent: Thursday, January 15, 2015 20:05
To: user@mahout.apache.org
Subject: Re: Own recommender

The old Taste code is not the state of the art.  User-based recommenders built on that will be slow.



On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos <jj...@gmail.com> wrote:

> Hi David,
> You implement your custom algorithm and create your own class that 
> implements the UserSimilarity interface.
>
> When you then instantiate your User-Based recommender, just pass your 
> custom class for the UserSimilarity parameter.
>
> Best.
>
> On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David < 
> david.arroyo@altran.com> wrote:
>
> > Hi folks,
> > How I can start to build my own recommender system in apache mahout 
> > with my personal algorithm? I need a custom UserSimilarity. Maybe a 
> > subclass from UserSimilarity like PearsonCorrelationSimilarity?
> >
> > Thanks
> > Regards :)
> >
>

Re: Own recommender

Posted by Ted Dunning <te...@gmail.com>.
The old Taste code is not the state of the art.  User-based recommenders
built on that will be slow.



On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos <jj...@gmail.com> wrote:

> Hi David,
> You implement your custom algorithm and create your own class that
> implements the UserSimilarity interface.
>
> When you then instantiate your User-Based recommender, just pass your
> custom class for the UserSimilarity parameter.
>
> Best.
>
> On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David <
> david.arroyo@altran.com> wrote:
>
> > Hi folks,
> > How I can start to build my own recommender system in apache mahout with
> > my personal algorithm? I need a custom UserSimilarity. Maybe a subclass
> > from UserSimilarity like PearsonCorrelationSimilarity?
> >
> > Thanks
> > Regards :)
> >
>

Re: Own recommender

Posted by Juanjo Ramos <jj...@gmail.com>.
Hi David,
You implement your custom algorithm and create your own class that
implements the UserSimilarity interface.

When you then instantiate your User-Based recommender, just pass your
custom class for the UserSimilarity parameter.
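
As a rough illustration of the "custom algorithm" part, here is a stand-alone Java sketch of a similarity function with no Mahout dependency. `JaccardUserSimilarity` and its in-memory map of user IDs to item IDs are assumptions made for this example; Jaccard overlap is just a placeholder for whatever algorithm you have in mind. To plug it into Mahout, the same logic would go into the similarity method of a class implementing the UserSimilarity interface; check the Javadoc of your Mahout version for the full set of methods that interface requires.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-alone similarity (not a Mahout class): Jaccard overlap of two users' items. */
final class JaccardUserSimilarity {

    private final Map<Long, Set<Long>> itemsByUser;

    JaccardUserSimilarity(Map<Long, Set<Long>> itemsByUser) {
        this.itemsByUser = itemsByUser;
    }

    /**
     * Returns intersection size divided by union size, in [0, 1].
     * Returns Double.NaN when either user is unknown or both have no items
     * (a choice made for this sketch to signal "similarity unknown").
     */
    double userSimilarity(long userID1, long userID2) {
        Set<Long> a = itemsByUser.get(userID1);
        Set<Long> b = itemsByUser.get(userID2);
        if (a == null || b == null || (a.isEmpty() && b.isEmpty())) {
            return Double.NaN;
        }
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }
}
```

For example, users who share two of four distinct items in total get a similarity of 0.5.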

Best.

On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David <
david.arroyo@altran.com> wrote:

> Hi folks,
> How I can start to build my own recommender system in apache mahout with
> my personal algorithm? I need a custom UserSimilarity. Maybe a subclass
> from UserSimilarity like PearsonCorrelationSimilarity?
>
> Thanks
> Regards :)
>