Posted to dev@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2008/11/10 14:21:12 UTC
Taste ?'s
Hi,
I'm integrating the build and demo into the main workflow, and have a
couple of questions about Taste. See https://issues.apache.org/jira/browse/MAHOUT-94
1. What should we do about the Axis jars required for the AxisServlet
stuff in the example WAR? For now, I've commented out the dependency
and the rest seems to work fine. Is there a pointer to where these
live?
2. Is there a way to get the top rated items in the demo without a
user? I think that would be useful. For pretty much every user I've
entered in the demo, it recommends "Song of Freedom", and I'm curious
as to why. One thing that would be cool is an "explain" method like
Lucene has for explaining search results.
-Grant
Re: Taste ?'s
Posted by Grant Ingersoll <gs...@apache.org>.
On Nov 10, 2008, at 12:05 PM, Sean Owen wrote:
>
>
> Fine by me, I had always distributed it myself. I pulled it out here by
> request I think in order to reduce the number of dependencies.
>
> I would rather not remove working, plausibly useful functionality just to
> avoid a dependency - web services are definitely in that category. At worst
> just make it a dependency you have to add, which is what I did here (same
> for EJB though not sure anyone actually uses that). Obviously I have no
> issue going the other way and including Axis.
OK, I agree on keeping it. I can see Mahout growing to have a webapp
layer that isn't just for taste, so maybe, eventually, we will want to
separate out the webapp part from the "core"
>
>
> On explanation - I like the idea. I am struggling to think of any
> commonality between explanations across all algorithms. That may argue
> against a generic Explanation object but not against specific
> implementations tailored for particular algorithms.
>
> I am then trying to imagine what the explanation is like. For user-based
> recommenders it is really a neighborhood of users that explains it. Do you
> return that? In an object that tells you stuff like how much a given item is
> liked by the group?
>
> I like the idea... when I dig in I am having trouble thinking of just what
> an explanation looks like. I can imagine reporting interesting figures (e.g.
> most popular items in neighborhood) that don't really explain things.
> Perhaps it is good to think of real-world examples like amazon's
> 'recommended because' function. That I already expose - in the algorithm for
> which it makes any sense that is. For instance there would be no such notion
> of this in slope-one.
>
> An interesting item to mull over indeed. It is a way a recommender system
> could add real insight and value.
I think you might be going a little too generic in your thinking. The
only real difference between the actual recommendation and the explain
of that recommendation is that the explain provides the details of why
the recommendation was given. For instance, and I'm just making this
up, it might say something like:
Item A had a score of 2.5, which is the product of
  Recommender Weight Factor: 0.5 times
  # of Users recommending item: 10 times
  Avg. Recommendation Score: 0.5
In Lucene, an explain looks like:
0.34713835 = (MATCH) sum of:
  0.29951444 = weight(text:"i pod" in 0), product of:
    0.54212046 = queryWeight(text:"i pod"), product of:
      5.8931932 = idf(text: i=27 pod=25)
      0.091990955 = queryNorm
    0.5524869 = fieldWeight(text:"i pod" in 0), product of:
      1.0 = tf(phraseFreq=1.0)
      5.8931932 = idf(text: i=27 pod=25)
      0.09375 = fieldNorm(field=text, doc=0)
  0.04762391 = (MATCH) sum of:
    0.04762391 = (MATCH) weight(text:gb in 0), product of:
      0.21617201 = queryWeight(text:gb), product of:
        2.3499267 = idf(docFreq=48, numDocs=33)
        0.091990955 = queryNorm
      0.22030562 = (MATCH) fieldWeight(text:gb in 0), product of:
        1.0 = tf(termFreq(text:gb)=1)
        2.3499267 = idf(docFreq=48, numDocs=33)
        0.09375 = fieldNorm(field=text, doc=0)
(FYI: I got this by running the Solr example and using the query: http://localhost:8983/solr/select?indent=on&version=2.2&q=iPod+AND+GB&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&debugQuery=on&explainOther=&hl.fl=)
Do note, I don't even know if this makes sense in CF, but I suspect
it does, otherwise we wouldn't be having this conversation. :-)
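To make the idea concrete, here is a minimal sketch of a nested explanation object that prints in the same indented "value = description" style as the Lucene output above. Everything in it is an assumption for illustration: the Explanation class, its productOf helper, and the factor names are hypothetical and are not part of Taste or copied from Lucene's API. The factors are the made-up ones from the Item A example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical explanation node: a value, a description, and child details.
// Not part of Taste; modeled loosely on the shape of Lucene's explain output.
final class Explanation {
    private final double value;
    private final String description;
    private final List<Explanation> details = new ArrayList<>();

    Explanation(double value, String description) {
        this.value = value;
        this.description = description;
    }

    double getValue() {
        return value;
    }

    Explanation addDetail(Explanation detail) {
        details.add(detail);
        return this;
    }

    // Builds a node whose value is the product of the given factors.
    static Explanation productOf(String description, Explanation... factors) {
        double product = 1.0;
        for (Explanation factor : factors) {
            product *= factor.value;
        }
        Explanation node = new Explanation(product, description + ", product of:");
        for (Explanation factor : factors) {
            node.addDetail(factor);
        }
        return node;
    }

    @Override
    public String toString() {
        return render(0);
    }

    // Renders this node and its details, indenting two spaces per level.
    private String render(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(value).append(" = ").append(description).append('\n');
        for (Explanation detail : details) {
            sb.append(detail.render(depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Explanation e = Explanation.productOf("score(Item A)",
                new Explanation(0.5, "recommender weight factor"),
                new Explanation(10.0, "# of users recommending item"),
                new Explanation(0.5, "avg. recommendation score"));
        System.out.print(e);
    }
}
```

A tree like this stays generic (any recommender can fill in its own descriptions), while what goes into the nodes remains algorithm-specific.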
>
>
> On 10 Nov 2008, 1:54 PM, "Grant Ingersoll" <gs...@apache.org>
> wrote:
>
> On Nov 10, 2008, at 8:32 AM, Sean Owen wrote: > The web service bit won't
> work without Axis but ye...
> Yep, it does. Axis is ASF, so we can just put them in where they need to
> be. Either that, or let's not support Web Services.
>
>>>> You hit on an interesting feature of the slope-one recommender
>>>> used in
>> the demo. It does h...
> Right, the same is true for Lucene. We have an abstract "explain" method
> that gets implemented by the various scoring pieces. So, for Taste, we
> could probably add it to the Recommender class, something like:
>
> // Given a source item, explain why the recommended Item was chosen.
> abstract Explanation explain(Item, Item, ???);
> and
> // Given a source user, explain why the item was recommended for the target user.
> abstract Explanation explain(User, User, Item, ???);
>
>> On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll
>> <gs...@apache.org>
> wrote: >> >> Hi, >> >> ...
> --------------------------
> Grant Ingersoll
> Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
> http://www.lucenebootcamp.com
>
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
Re: Taste ?'s
Posted by Sean Owen <sr...@gmail.com>.
Fine by me, I had always distributed it myself. I pulled it out here by
request I think in order to reduce the number of dependencies.
I would rather not remove working, plausibly useful functionality just to
avoid a dependency - web services are definitely in that category. At worst
just make it a dependency you have to add, which is what I did here (same
for EJB though not sure anyone actually uses that). Obviously I have no
issue going the other way and including Axis.
On explanation - I like the idea. I am struggling to think of any
commonality between explanations across all algorithms. That may argue
against a generic Explanation object but not against specific
implementations tailored for particular algorithms.
I am then trying to imagine what the explanation is like. For user-based
recommenders it is really a neighborhood of users that explains it. Do you
return that? In an object that tells you stuff like how much a given item is
liked by the group?
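One possible shape for such an object, as a rough sketch only: it summarizes how many users in a neighborhood rated an item and what their average rating was. The NeighborhoodExplanation class, its fields, and the nested-map representation of ratings (user id to item id to rating) are all assumptions made for illustration; none of this is existing Taste API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical summary of how a user neighborhood rated one item.
final class NeighborhoodExplanation {
    final String itemId;
    final int neighborhoodSize;
    final int neighborsWhoRated;
    final double averageRating; // NaN when no neighbor rated the item

    private NeighborhoodExplanation(String itemId, int neighborhoodSize,
                                    int neighborsWhoRated, double averageRating) {
        this.itemId = itemId;
        this.neighborhoodSize = neighborhoodSize;
        this.neighborsWhoRated = neighborsWhoRated;
        this.averageRating = averageRating;
    }

    // neighborRatings maps neighbor user id -> (item id -> rating).
    static NeighborhoodExplanation forItem(String itemId,
                                           Map<String, Map<String, Double>> neighborRatings) {
        double sum = 0.0;
        int count = 0;
        for (Map<String, Double> prefs : neighborRatings.values()) {
            Double rating = prefs.get(itemId);
            if (rating != null) {
                sum += rating;
                count++;
            }
        }
        double average = count == 0 ? Double.NaN : sum / count;
        return new NeighborhoodExplanation(itemId, neighborRatings.size(), count, average);
    }

    @Override
    public String toString() {
        return "Item " + itemId + ": rated by " + neighborsWhoRated + " of "
                + neighborhoodSize + " neighbors, average rating " + averageRating;
    }

    // Small made-up neighborhood used for the demonstration below.
    static Map<String, Map<String, Double>> sampleNeighborhood() {
        Map<String, Map<String, Double>> neighbors = new LinkedHashMap<>();
        Map<String, Double> alice = new LinkedHashMap<>();
        alice.put("A", 4.0);
        alice.put("B", 2.0);
        Map<String, Double> bob = new LinkedHashMap<>();
        bob.put("A", 5.0);
        Map<String, Double> carol = new LinkedHashMap<>();
        carol.put("B", 3.0);
        neighbors.put("alice", alice);
        neighbors.put("bob", bob);
        neighbors.put("carol", carol);
        return neighbors;
    }

    public static void main(String[] args) {
        System.out.println(forItem("A", sampleNeighborhood()));
    }
}
```

Whether figures like these count as an explanation or are merely interesting statistics is exactly the open question here.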
I like the idea... when I dig in I am having trouble thinking of just what
an explanation looks like. I can imagine reporting interesting figures (e.g.
most popular items in neighborhood) that don't really explain things.
Perhaps it is good to think of real-world examples like amazon's
'recommended because' function. That I already expose - in the algorithm for
which it makes any sense that is. For instance there would be no such notion
of this in slope-one.
An interesting item to mull over indeed. It is a way a recommender system
could add real insight and value.
On 10 Nov 2008, 1:54 PM, "Grant Ingersoll" <gs...@apache.org> wrote:
On Nov 10, 2008, at 8:32 AM, Sean Owen wrote: > The web service bit won't
work without Axis but ye...
Yep, it does. Axis is ASF, so we can just put them in where they need to
be. Either that, or let's not support Web Services.
> > > You hit on an interesting feature of the slope-one recommender used in
> the demo. It does h...
Right, the same is true for Lucene. We have an abstract "explain" method
that gets implemented by the various scoring pieces. So, for Taste, we
could probably add it to the Recommender class, something like:
// Given a source item, explain why the recommended Item was chosen.
abstract Explanation explain(Item, Item, ???);
and
// Given a source user, explain why the item was recommended for the target user.
abstract Explanation explain(User, User, Item, ???);
> On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll <gs...@apache.org>
wrote: >> >> Hi, >> >> ...
Re: Taste ?'s
Posted by Grant Ingersoll <gs...@apache.org>.
On Nov 10, 2008, at 8:32 AM, Sean Owen wrote:
> The web service bit won't work without Axis but yes I think the rest
> does. The Axis .jars should be added into lib/axis as I recall. I
> believe the build file prompts you to do this? At least my build file
> does. I don't have it in front of me.
Yep, it does. Axis is ASF, so we can just put them in where they need
to be. Either that, or let's not support Web Services.
>
>
> You hit on an interesting feature of the slope-one recommender used in
> the demo. It does have this odd tendency to produce similar
> recommendations for relatively dense data sets. I haven't really
> thought it through or discussed with the creator of this algorithm but
> I should.
>
> Top-rated items independent of user -- yes this could be added fairly
> easily. It is not really part of a recommender API but something
> computable from the data. It... may or may not really explain the
> recommendations in general. It *shouldn't*. In the case of slope-one
> it sort of does but that is not so desirable.
>
> The explanation is kind of algorithm-specific. For example for
> user-based recommenders the explanation relies on the user
> neighborhoods that were computed. For slope-one it kind of depends on
> item-item rating differences over the whole data set. I am not so sure
> yet how to expose that cleanly and consistently. The API methods
> needed are all there, but nothing more is done to help someone craft an
> explanation of some kind.
>
Right, the same is true for Lucene. We have an abstract "explain"
method that gets implemented by the various scoring pieces. So, for
Taste, we could probably add it to the Recommender class, something
like:
// Given a source item, explain why the recommended Item was chosen.
abstract Explanation explain(Item, Item, ???);
and
// Given a source user, explain why the item was recommended for the target user.
abstract Explanation explain(User, User, Item, ???);
> On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll
> <gs...@apache.org> wrote:
>> Hi,
>>
>> I'm integrating the build and demo into the main workflow, and have a couple
>> of questions about Taste. See
>> https://issues.apache.org/jira/browse/MAHOUT-94
>>
>> 1. What should we do about the Axis jars required for the AxisServlet stuff
>> in the example WAR? For now, I've commented out the dependency and the rest
>> seems to work fine. Is there a pointer to where these live?
>>
>> 2. Is there a way to get the top rated items in the demo without a user? I
>> think that would be useful. For pretty much every user I've entered in the
>> demo, it recommends "Song of Freedom", and I'm curious as to why. One thing
>> that would be cool is an "explain" method like Lucene has for explaining
>> search results.
>>
>> -Grant
>>
Re: Taste ?'s
Posted by Sean Owen <sr...@gmail.com>.
The web service bit won't work without Axis but yes I think the rest
does. The Axis .jars should be added into lib/axis as I recall. I
believe the build file prompts you to do this? At least my build file
does. I don't have it in front of me.
You hit on an interesting feature of the slope-one recommender used in
the demo. It does have this odd tendency to produce similar
recommendations for relatively dense data sets. I haven't really
thought it through or discussed with the creator of this algorithm but
I should.
Top-rated items independent of user -- yes this could be added fairly
easily. It is not really part of a recommender API but something
computable from the data. It... may or may not really explain the
recommendations in general. It *shouldn't*. In the case of slope-one
it sort of does but that is not so desirable.
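As a sketch of what "computable from the data" could mean here, the following ranks items by their average rating across all users. It is an illustration under assumptions, not Taste code: the TopRatedItems and Rating classes are hypothetical, and the minRatings cutoff is an added guard so that an item with a single enthusiastic rating does not automatically top the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: top-rated items computed directly from rating data.
final class TopRatedItems {

    // One (user, item, rating) preference.
    static final class Rating {
        final String userId;
        final String itemId;
        final double value;

        Rating(String userId, String itemId, double value) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    // Returns up to howMany item ids, highest average rating first.
    // Items with fewer than minRatings ratings are ignored.
    static List<String> topRated(List<Rating> ratings, int howMany, int minRatings) {
        Map<String, double[]> sumAndCount = new HashMap<>();
        for (Rating r : ratings) {
            double[] sc = sumAndCount.computeIfAbsent(r.itemId, k -> new double[2]);
            sc[0] += r.value;
            sc[1] += 1;
        }
        Map<String, Double> average = new HashMap<>();
        List<String> items = new ArrayList<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            double[] sc = e.getValue();
            if (sc[1] >= minRatings) {
                average.put(e.getKey(), sc[0] / sc[1]);
                items.add(e.getKey());
            }
        }
        items.sort((a, b) -> {
            int byAverage = Double.compare(average.get(b), average.get(a)); // highest first
            return byAverage != 0 ? byAverage : a.compareTo(b);             // ties by item id
        });
        return new ArrayList<>(items.subList(0, Math.min(howMany, items.size())));
    }

    // Made-up ratings: A averages 4.5, B averages 4.0, C has a single 5.0.
    static List<Rating> sampleRatings() {
        return Arrays.asList(
                new Rating("u1", "A", 5.0), new Rating("u2", "A", 4.0),
                new Rating("u1", "B", 3.0), new Rating("u2", "B", 5.0),
                new Rating("u3", "B", 4.0), new Rating("u1", "C", 5.0));
    }

    public static void main(String[] args) {
        System.out.println(topRated(sampleRatings(), 2, 2));
    }
}
```

Because this looks only at the data and never at a particular user, it sits naturally outside the recommender API itself.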
The explanation is kind of algorithm-specific. For example for
user-based recommenders the explanation relies on the user
neighborhoods that were computed. For slope-one it kind of depends on
item-item rating differences over the whole data set. I am not so sure
yet how to expose that cleanly and consistently. The API methods
needed are all there, but nothing more is done to help someone craft an
explanation of some kind.
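For the slope-one case, a rough sketch of how the item-item rating differences could be surfaced: each item the user has rated contributes "your rating plus the average difference to the target item", weighted by how many users rated both. This is the textbook weighted slope-one scheme written from scratch; it is an assumption that it resembles what the demo does, and the SlopeOneExplain class and its methods are hypothetical names, not Taste API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: weighted slope-one estimate with one "why" line
// per item that contributed to it.
final class SlopeOneExplain {

    // Returns {average of (target - other), number of users who rated both}.
    static double[] averageDiff(Map<String, Map<String, Double>> data,
                                String target, String other) {
        double sum = 0.0;
        int count = 0;
        for (Map<String, Double> prefs : data.values()) {
            Double t = prefs.get(target);
            Double o = prefs.get(other);
            if (t != null && o != null) {
                sum += t - o;
                count++;
            }
        }
        return new double[] { count == 0 ? 0.0 : sum / count, count };
    }

    // Estimates userId's rating for target and appends the contributions to why.
    // Returns NaN if the user is unknown or nothing contributes.
    static double estimate(Map<String, Map<String, Double>> data,
                           String userId, String target, StringBuilder why) {
        Map<String, Double> userPrefs = data.get(userId);
        if (userPrefs == null) {
            return Double.NaN;
        }
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (Map.Entry<String, Double> pref : userPrefs.entrySet()) {
            if (pref.getKey().equals(target)) {
                continue;
            }
            double[] diff = averageDiff(data, target, pref.getKey());
            if (diff[1] == 0) {
                continue; // no user rated both items
            }
            weightedSum += (pref.getValue() + diff[0]) * diff[1];
            totalWeight += diff[1];
            why.append("rated ").append(pref.getKey()).append(' ').append(pref.getValue())
               .append("; average difference (").append(target).append(" - ")
               .append(pref.getKey()).append(") = ").append(diff[0])
               .append(" over ").append((int) diff[1]).append(" user(s)\n");
        }
        return totalWeight == 0 ? Double.NaN : weightedSum / totalWeight;
    }

    // Made-up data set: user id -> (item id -> rating).
    static Map<String, Map<String, Double>> sampleData() {
        Map<String, Map<String, Double>> data = new LinkedHashMap<>();
        data.put("john", prefs("A", 5.0, "B", 3.0, "C", 2.0));
        data.put("mark", prefs("A", 3.0, "B", 4.0));
        data.put("lucy", prefs("B", 2.0, "C", 5.0));
        return data;
    }

    private static Map<String, Double> prefs(Object... itemsAndRatings) {
        Map<String, Double> m = new LinkedHashMap<>();
        for (int i = 0; i < itemsAndRatings.length; i += 2) {
            m.put((String) itemsAndRatings[i], (Double) itemsAndRatings[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        StringBuilder why = new StringBuilder();
        double estimate = estimate(sampleData(), "lucy", "A", why);
        System.out.println("estimate for lucy on A: " + estimate); // roughly 4.33
        System.out.print(why);
    }
}
```

The differences are computed over the whole data set rather than per user, which may go some way toward explaining why dense data tends to yield similar recommendations for everyone.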
On Mon, Nov 10, 2008 at 1:21 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Hi,
>
> I'm integrating the build and demo into the main workflow, and have a couple
> of questions about Taste. See
> https://issues.apache.org/jira/browse/MAHOUT-94
>
> 1. What should we do about the Axis jars required for the AxisServlet stuff
> in the example WAR? For now, I've commented out the dependency and the rest
> seems to work fine. Is there a pointer to where these live?
>
> 2. Is there a way to get the top rated items in the demo without a user? I
> think that would be useful. For pretty much every user I've entered in the
> demo, it recommends "Song of Freedom", and I'm curious as to why. One thing
> that would be cool is an "explain" method like Lucene has for explaining
> search results.
>
> -Grant
>