
Posted to dev@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2008/11/10 14:21:12 UTC

Taste ?'s

Hi,

I'm integrating the build and demo into the main workflow, and have a  
couple of questions about Taste.  See https://issues.apache.org/jira/browse/MAHOUT-94

1.  What should we do about the Axis jars required for the AxisServlet  
stuff in the example WAR?  For now, I've commented out the dependency  
and the rest seems to work fine.  Is there a pointer to where these  
live?

2. Is there a way to get the top rated items in the demo without a  
user?  I think that would be useful.  For pretty much every user I've  
entered in the demo, it recommends "Song of Freedom", and I'm curious  
as to why.  One thing that would be cool is an "explain" method like  
Lucene has for explaining search results.

-Grant

Re: Taste ?'s

Posted by Grant Ingersoll <gs...@apache.org>.

On Nov 10, 2008, at 12:05 PM, Sean Owen wrote:

>
>
> Fine by me, I had always distributed it myself. I pulled it out here by
> request I think in order to reduce the number of dependencies.
>
> I would rather not remove working, plausibly useful functionality just to
> avoid a dependency - web services are definitely in that category. At worst
> just make it a dependency you have to add, which is what I did here (same
> for EJB though not sure anyone actually uses that). Obviously I have no
> issue going the other way and including Axis.

OK, I agree on keeping it.  I can see Mahout growing to have a webapp
layer that isn't just for Taste, so maybe, eventually, we will want to
separate out the webapp part from the "core".

>
>
> On explanation - I like the idea. I am struggling to think of any
> commonality between explanations across all algorithms. That may argue
> against a generic Explanation object but not against specific
> implementations tailored for particular algorithms.
>
> I am then trying to imagine what the explanation is like. For user-based
> recommenders it is really a neighborhood of users that explains it. Do you
> return that? In an object that tells you stuff like how much a given item is
> liked by the group?
>
> I like the idea... when I dig in I am having trouble thinking of just what
> an explanation looks like. I can imagine reporting interesting figures (e.g.
> most popular items in neighborhood) that don't really explain things.
> Perhaps it is good to think of real-world examples like Amazon's
> 'recommended because' function. That I already expose - in the algorithm for
> which it makes any sense that is. For instance there would be no such notion
> of this in slope-one.
>
> An interesting item to mull over indeed. It is a way a recommender system
> could add real insight and value.


I think you might be going a little too generic in your thinking.  The
only real difference between the actual recommendation and the explain  
of that recommendation is that the explain provides the details of why  
the recommendation was given.  For instance, and I'm just making this  
up, it might say something like:
Item A had a score of 2.5, which is the product of
	Recommender Weight Factor: 0.5 times
	# of Users recommending item: 10 times
	Avg. Recommendation Score: 0.5

In Lucene, an explain looks like:
0.34713835 = (MATCH) sum of:
   0.29951444 = weight(text:"i pod" in 0), product of:
     0.54212046 = queryWeight(text:"i pod"), product of:
       5.8931932 = idf(text: i=27 pod=25)
       0.091990955 = queryNorm
     0.5524869 = fieldWeight(text:"i pod" in 0), product of:
       1.0 = tf(phraseFreq=1.0)
       5.8931932 = idf(text: i=27 pod=25)
       0.09375 = fieldNorm(field=text, doc=0)
   0.04762391 = (MATCH) sum of:
     0.04762391 = (MATCH) weight(text:gb in 0), product of:
       0.21617201 = queryWeight(text:gb), product of:
         2.3499267 = idf(docFreq=48, numDocs=33)
         0.091990955 = queryNorm
       0.22030562 = (MATCH) fieldWeight(text:gb in 0), product of:
         1.0 = tf(termFreq(text:gb)=1)
         2.3499267 = idf(docFreq=48, numDocs=33)
         0.09375 = fieldNorm(field=text, doc=0)


(FYI: I got this by running the Solr example and using the query: http://localhost:8983/solr/select?indent=on&version=2.2&q=iPod+AND+GB&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&debugQuery=on&explainOther=&hl.fl=)

Do note, I don't even know if this makes sense in CF, but I suspect
it does, otherwise we wouldn't be having this conversation. :-)
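
The made-up "Item A" breakdown above could be represented as a small tree of values and descriptions, similar in shape to the Lucene output. The following is only a sketch under assumptions: the `Explanation` class, its `productOf` factory, and the description strings are hypothetical and are not an existing Taste API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical explanation node, loosely modeled on the shape of Lucene's
// explain output. Nothing here exists in Taste.
final class Explanation {
    private final double value;
    private final String description;
    private final List<Explanation> details = new ArrayList<>();

    Explanation(double value, String description) {
        this.value = value;
        this.description = description;
    }

    /** Builds a node whose value is the product of its factors' values. */
    static Explanation productOf(String description, Explanation... factors) {
        double product = 1.0;
        for (Explanation factor : factors) {
            product *= factor.value;
        }
        Explanation node = new Explanation(product, description + ", product of:");
        for (Explanation factor : factors) {
            node.details.add(factor);
        }
        return node;
    }

    double getValue() {
        return value;
    }

    /** Renders the tree with two spaces of indentation per level. */
    @Override
    public String toString() {
        StringBuilder out = new StringBuilder();
        append(out, 0);
        return out.toString();
    }

    private void append(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(value).append(" = ").append(description).append('\n');
        for (Explanation detail : details) {
            detail.append(out, depth + 1);
        }
    }
}

class ExplanationDemo {
    public static void main(String[] args) {
        // The three factors from the made-up example: 0.5 * 10 * 0.5 = 2.5.
        Explanation e = Explanation.productOf("score(Item A)",
                new Explanation(0.5, "recommender weight factor"),
                new Explanation(10.0, "number of users recommending item"),
                new Explanation(0.5, "average recommendation score"));
        System.out.print(e);
    }
}
```

Each recommender could then build whatever tree fits its algorithm, while callers only need to print or walk the nodes.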


>
>
> On 10 Nov 2008, 1:54 PM, "Grant Ingersoll" <gs...@apache.org> wrote:
>
> On Nov 10, 2008, at 8:32 AM, Sean Owen wrote:
> > The web service bit won't work without Axis but ye...
> Yep, it does.  Axis is ASF, so we can just put them in where they need to
> be.  Either that, or let's not support Web Services.
>
> > You hit on an interesting feature of the slope-one recommender used in
> > the demo. It does h...
> Right, the same is true for Lucene.  We have an abstract "explain" method
> that gets implemented by the various scoring pieces.   So, for Taste, we
> could probably add it to the Recommender class, something like:
>
> // Given a source item, explain why the recommended Item was chosen.
> abstract Explanation explain(Item, Item, ???);
> and
> // Given a source user, explain why the item was recommended for the
> // target user.
> abstract Explanation explain(User, User, Item, ???);
>
>> On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll  
>> <gs...@apache.org>
> wrote: >> >> Hi, >> >> ...

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: Taste ?'s

Posted by Sean Owen <sr...@gmail.com>.

Fine by me, I had always distributed it myself. I pulled it out here by 
request I think in order to reduce the number of dependencies. 

I would rather not remove working, plausibly useful functionality just to 
avoid a dependency - web services are definitely in that category. At worst 
just make it a dependency you have to add, which is what I did here (same 
for EJB though not sure anyone actually uses that). Obviously I have no 
issue going the other way and including Axis. 

On explanation - I like the idea. I am struggling to think of any 
commonality between explanations across all algorithms. That may argue 
against a generic Explanation object but not against specific 
implementations tailored for particular algorithms.

I am then trying to imagine what the explanation is like. For user-based 
recommenders it is really a neighborhood of users that explains it. Do you 
return that? In an object that tells you stuff like how much a given item is 
liked by the group?
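
One figure such an object could report for a user-based recommender is how much the neighborhood likes the recommended item. A minimal sketch, assuming ratings are available as a plain map from user ID to item ratings; the class name, method name, and data layout are hypothetical and not part of Taste:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the Taste API.
class NeighborhoodSummary {

    /**
     * Mean rating of itemId among the neighbors who rated it,
     * or NaN if no neighbor rated it.
     */
    static double meanNeighborRating(List<String> neighborhood,
                                     Map<String, Map<String, Double>> ratingsByUser,
                                     String itemId) {
        double sum = 0.0;
        int count = 0;
        for (String userId : neighborhood) {
            Map<String, Double> ratings = ratingsByUser.get(userId);
            Double rating = (ratings == null) ? null : ratings.get(itemId);
            if (rating != null) {
                sum += rating;
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> ratingsByUser = new HashMap<>();
        Map<String, Double> u1 = new HashMap<>();
        u1.put("itemA", 4.0);
        Map<String, Double> u2 = new HashMap<>();
        u2.put("itemA", 5.0);
        Map<String, Double> u3 = new HashMap<>();
        u3.put("itemB", 1.0);
        ratingsByUser.put("u1", u1);
        ratingsByUser.put("u2", u2);
        ratingsByUser.put("u3", u3);

        // u3 did not rate itemA, so only u1 and u2 contribute.
        List<String> neighborhood = Arrays.asList("u1", "u2", "u3");
        System.out.println(meanNeighborRating(neighborhood, ratingsByUser, "itemA"));
    }
}
```

A figure like this describes the neighborhood but, as noted in the next paragraph, it may not by itself explain why the item was recommended.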

I like the idea... when I dig in I am having trouble thinking of just what 
an explanation looks like. I can imagine reporting interesting figures (e.g. 
most popular items in neighborhood) that don't really explain things. 
Perhaps it is good to think of real-world examples like Amazon's 
'recommended because' function. That I already expose - in the algorithm for 
which it makes any sense that is. For instance there would be no such notion 
of this in slope-one. 

An interesting item to mull over indeed. It is a way a recommender system 
could add real insight and value. 

On 10 Nov 2008, 1:54 PM, "Grant Ingersoll" <gs...@apache.org> wrote:

On Nov 10, 2008, at 8:32 AM, Sean Owen wrote:
> The web service bit won't work without Axis but ye...
Yep, it does.  Axis is ASF, so we can just put them in where they need to
be.  Either that, or let's not support Web Services.

> You hit on an interesting feature of the slope-one recommender used in
> the demo. It does h...
Right, the same is true for Lucene.  We have an abstract "explain" method
that gets implemented by the various scoring pieces.   So, for Taste, we
could probably add it to the Recommender class, something like:

// Given a source item, explain why the recommended Item was chosen.
abstract Explanation explain(Item, Item, ???);
and
// Given a source user, explain why the item was recommended for the target user.
abstract Explanation explain(User, User, Item, ???);

> On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll <gs...@apache.org> wrote:
>> Hi,
>>
>> ...

Re: Taste ?'s

Posted by Grant Ingersoll <gs...@apache.org>.

On Nov 10, 2008, at 8:32 AM, Sean Owen wrote:

> The web service bit won't work without Axis but yes I think the rest
> does. The Axis .jars should be added into lib/axis as I recall. I
> believe the build file prompts you to do this? At least my build file
> does. I don't have it in front of me.

Yep, it does.  Axis is ASF, so we can just put them in where they need  
to be.  Either that, or let's not support Web Services.

>
>
> You hit on an interesting feature of the slope-one recommender used in
> the demo. It does have this odd tendency to produce similar
> recommendations for relatively dense data sets. I haven't really
> thought it through or discussed with the creator of this algorithm but
> I should.
>
> Top-rated items independent of user -- yes this could be added fairly
> easily. It is not really part of a recommender API but something
> computable from the data. It... may or may not really explain the
> recommendations in general. It *shouldn't*. In the case of slope-one
> it sort of does but that is not so desirable.
>
> The explanation is kind of algorithm-specific. For example for
> user-based recommenders the explanation relies on the user
> neighborhoods that were computed. For slope-one it kind of depends on
> item-item rating differences over the whole data set. I am not so sure
> yet how to expose that cleanly and consistently. The API methods
> needed are all there, but nothing more is done to help someone craft an
> explanation of some kind.
>

Right, the same is true for Lucene.  We have an abstract "explain"  
method that gets implemented by the various scoring pieces.   So, for  
Taste, we could probably add it to the Recommender class, something  
like:

// Given a source item, explain why the recommended Item was chosen.
abstract Explanation explain(Item, Item, ???);
and
// Given a source user, explain why the item was recommended for the target user.
abstract Explanation explain(User, User, Item, ???);



> On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll <gs...@apache.org> wrote:
>> Hi,
>>
>> I'm integrating the build and demo into the main workflow, and have a couple
>> of questions about Taste.  See
>> https://issues.apache.org/jira/browse/MAHOUT-94
>>
>> 1.  What should we do about the Axis jars required for the AxisServlet stuff
>> in the example WAR?  For now, I've commented out the dependency and the rest
>> seems to work fine.  Is there a pointer to where these live?
>>
>> 2. Is there a way to get the top rated items in the demo without a user?  I
>> think that would be useful.  For pretty much every user I've entered in the
>> demo, it recommends "Song of Freedom", and I'm curious as to why.  One thing
>> that would be cool is an "explain" method like Lucene has for explaining
>> search results.
>>
>> -Grant
>>

--------------------------
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com


Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: Taste ?'s

Posted by Sean Owen <sr...@gmail.com>.

The web service bit won't work without Axis but yes I think the rest
does. The Axis .jars should be added into lib/axis as I recall. I
believe the build file prompts you to do this? At least my build file
does. I don't have it in front of me.

You hit on an interesting feature of the slope-one recommender used in
the demo. It does have this odd tendency to produce similar
recommendations for relatively dense data sets. I haven't really
thought it through or discussed with the creator of this algorithm but
I should.

Top-rated items independent of user -- yes this could be added fairly
easily. It is not really part of a recommender API but something
computable from the data. It... may or may not really explain the
recommendations in general. It *shouldn't*. In the case of slope-one
it sort of does but that is not so desirable.
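
Since top-rated items are computable from the data alone, a rough sketch is possible without touching any recommender. The code below assumes ratings are available as simple (user, item, value) records and ranks items by mean rating; the `Rating` class and `topRated` method are hypothetical names, not Taste API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the Taste API.
class TopRatedItems {

    /** One (user, item, rating) preference; a stand-in for the real data model. */
    static final class Rating {
        final String userId;
        final String itemId;
        final double value;

        Rating(String userId, String itemId, double value) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    /** Returns up to howMany item IDs ordered by descending mean rating. */
    static List<String> topRated(List<Rating> ratings, int howMany) {
        // Accumulate {sum, count} per item, keeping first-seen order.
        Map<String, double[]> sumAndCount = new LinkedHashMap<>();
        for (Rating r : ratings) {
            double[] acc = sumAndCount.computeIfAbsent(r.itemId, k -> new double[2]);
            acc[0] += r.value;
            acc[1] += 1.0;
        }
        Map<String, Double> means = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        List<String> items = new ArrayList<>(means.keySet());
        // Highest mean first; the sort is stable, so ties keep first-seen order.
        items.sort((a, b) -> Double.compare(means.get(b), means.get(a)));
        int limit = Math.max(0, Math.min(howMany, items.size()));
        return new ArrayList<>(items.subList(0, limit));
    }
}
```

A plain mean favors items with very few ratings, so a real version would probably want a minimum rating count or some damping; that choice is left open here.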

The explanation is kind of algorithm-specific. For example for
user-based recommenders the explanation relies on the user
neighborhoods that were computed. For slope-one it kind of depends on
item-item rating differences over the whole data set. I am not so sure
yet how to expose that cleanly and consistently. The API methods
needed are all there, but nothing more is done to help someone craft an
explanation of some kind.
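
To make the slope-one case concrete, here is a rough sketch of how item-item rating differences could double as an explanation. It assumes the plain unweighted form of slope-one and a simple map-based data layout; the class and method names are hypothetical, and this is not the Taste implementation, which may weight the differences.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of unweighted slope-one; not the Taste implementation.
class SlopeOneSketch {

    /**
     * Average of (rating of target - rating of other) over users who rated
     * both items, or NaN if no user rated both.
     */
    static double averageDiff(Map<String, Map<String, Double>> ratingsByUser,
                              String target, String other) {
        double sum = 0.0;
        int count = 0;
        for (Map<String, Double> ratings : ratingsByUser.values()) {
            Double t = ratings.get(target);
            Double o = ratings.get(other);
            if (t != null && o != null) {
                sum += t - o;
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    /**
     * For each item the user has rated, the estimate that item alone gives
     * for the target: the user's rating plus the average difference.
     */
    static Map<String, Double> contributions(Map<String, Map<String, Double>> ratingsByUser,
                                             String userId, String target) {
        Map<String, Double> result = new LinkedHashMap<>();
        Map<String, Double> own = ratingsByUser.get(userId);
        if (own == null) {
            return result;
        }
        for (Map.Entry<String, Double> e : own.entrySet()) {
            if (e.getKey().equals(target)) {
                continue;
            }
            double diff = averageDiff(ratingsByUser, target, e.getKey());
            if (!Double.isNaN(diff)) {
                result.put(e.getKey(), e.getValue() + diff);
            }
        }
        return result;
    }

    /** Unweighted estimate: the mean of the contributions, or NaN if none. */
    static double estimate(Map<String, Double> contributions) {
        if (contributions.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double c : contributions.values()) {
            sum += c;
        }
        return sum / contributions.size();
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> data = new HashMap<>();
        Map<String, Double> john = new LinkedHashMap<>();
        john.put("A", 5.0);
        john.put("B", 3.0);
        john.put("C", 2.0);
        Map<String, Double> mark = new LinkedHashMap<>();
        mark.put("A", 3.0);
        mark.put("B", 4.0);
        Map<String, Double> lucy = new LinkedHashMap<>();
        lucy.put("B", 2.0);
        lucy.put("C", 5.0);
        data.put("john", john);
        data.put("mark", mark);
        data.put("lucy", lucy);

        // Why would item A be estimated this way for lucy?
        Map<String, Double> why = contributions(data, "lucy", "A");
        System.out.println(why + " -> " + estimate(why));
    }
}
```

With this toy data, lucy's estimate for A is the mean of two contributions, one from B (2.0 + 0.5) and one from C (5.0 + 3.0). The per-item contributions are the closest thing to an explanation here, and because the average differences are computed over the whole data set they are the same for every user, which may be part of why dense data produces similar recommendations.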

On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Hi,
>
> I'm integrating the build and demo into the main workflow, and have a couple
> of questions about Taste.  See
> https://issues.apache.org/jira/browse/MAHOUT-94
>
> 1.  What should we do about the Axis jars required for the AxisServlet stuff
> in the example WAR?  For now, I've commented out the dependency and the rest
> seems to work fine.  Is there a pointer to where these live?
>
> 2. Is there a way to get the top rated items in the demo without a user?  I
> think that would be useful.  For pretty much every user I've entered in the
> demo, it recommends "Song of Freedom", and I'm curious as to why.  One thing
> that would be cool is an "explain" method like Lucene has for explaining
> search results.
>
> -Grant
>