Posted to user@mahout.apache.org by Claudia Grieco <gr...@crmpa.unisa.it> on 2009/07/23 19:41:18 UTC

Using Item Based recommenders as content based recommenders

Hi guys,

I know that Mahout supports only collaborative filtering, but let's see if
this approach makes sense:

Item-based recommenders can be initialized with pre-computed item-item
similarities, right? What if those item similarities are computed using
a content-based technique (for example, off the top of my head, cosine
similarity between the texts of two documents, computed using Lucene)? Am I
missing something?

Thanks for your help

Claudia


Re: R: Using Item Based recommenders as content based recommenders

Posted by Sean Owen <sr...@gmail.com>.
Yep I fixed that one a little while ago. :)

On Fri, Jul 24, 2009 at 8:22 AM, Claudia Grieco <gr...@crmpa.unisa.it> wrote:
> Thanks.
> Do you know of any fast way to obtain document-document similarity in Lucene and write it to a text Mahout can read?
>
> Ah, while playing a bit with item based recommenders I found a bug in MySQLJDBCDataModel
> // getNumPreferenceForItemsSQL
> "SELECT COUNT(1) FROM " + preferenceTable + " tp1 INNER JOIN " + preferenceColumn + " tp2 " +
>          "ON (tp1." + userIDColumn + "=tp2." + userIDColumn + ") " +
>          "WHERE tp1." + itemIDColumn + "=? and tp2." + itemIDColumn + "=?");
>
>
> It should be
> "SELECT COUNT(1) FROM " + preferenceTable + " tp1 INNER JOIN " + preferenceTable....
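
For reference, the fix quoted above can be sketched as a small standalone method. This is only an illustration, not the actual MySQLJDBCDataModel source: it assumes the rest of the query stays exactly as quoted in the bug report, and the class name, method name, and the table and column names used in main are hypothetical.

```java
// Sketch only: rebuilds the corrected query text described in the thread.
// Class and method names are hypothetical, not Mahout's own.
class NumPreferenceForItemsQuery {

  // Joins the preference table to itself (aliases tp1 and tp2) on the user
  // column, so the count is the number of users with a preference for both items.
  static String build(String preferenceTable, String userIDColumn, String itemIDColumn) {
    return "SELECT COUNT(1) FROM " + preferenceTable + " tp1 INNER JOIN " + preferenceTable + " tp2 "
        + "ON (tp1." + userIDColumn + "=tp2." + userIDColumn + ") "
        + "WHERE tp1." + itemIDColumn + "=? and tp2." + itemIDColumn + "=?";
  }

  public static void main(String[] args) {
    // Hypothetical table and column names, for illustration only.
    System.out.println(build("taste_preferences", "user_id", "item_id"));
  }
}
```

The reported bug was that the second operand of the join used the preference column name where the table name belongs, so the generated SQL joined against a table that does not exist.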

R: Using Item Based recommenders as content based recommenders

Posted by Claudia Grieco <gr...@crmpa.unisa.it>.
Thanks.
Do you know of any fast way to obtain document-document similarity in Lucene and write it to a text file that Mahout can read?

Ah, while playing a bit with item-based recommenders I found a bug in MySQLJDBCDataModel:
// getNumPreferenceForItemsSQL
"SELECT COUNT(1) FROM " + preferenceTable + " tp1 INNER JOIN " + preferenceColumn + " tp2 " +
          "ON (tp1." + userIDColumn + "=tp2." + userIDColumn + ") " +
          "WHERE tp1." + itemIDColumn + "=? and tp2." + itemIDColumn + "=?");
  

It should be
"SELECT COUNT(1) FROM " + preferenceTable + " tp1 INNER JOIN " + preferenceTable....


-----Original Message-----
From: Sean Owen [mailto:srowen@gmail.com]
Sent: Thursday, July 23, 2009 20:51
To: mahout-user@lucene.apache.org
Subject: Re: Using Item Based recommenders as content based recommenders

Yes, this is entirely reasonable.

You could also proceed by pre-computing user-user similarities and
using a user-based recommender. While it seems very much the same, I
wouldn't do that actually. Pre-computing kind of relies on the
assumption that similarities won't change much as you learn more, and
that is far more true of docs than users.




Re: Using Item Based recommenders as content based recommenders

Posted by Sean Owen <sr...@gmail.com>.
Yes, this is entirely reasonable.

You could also proceed by pre-computing user-user similarities and
using a user-based recommender. While it seems very much the same, I
wouldn't do that actually. Pre-computing kind of relies on the
assumption that similarities won't change much as you learn more, and
that is far more true of docs than users.
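
As a rough illustration of the approach confirmed above, here is a minimal sketch of pre-computing content-based item-item similarities. It makes several assumptions: plain Java term counting stands in for Lucene, the output is a simple `itemID1,itemID2,similarity` line per pair (the thread does not say which text format Mahout expects), and every class and method name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: content-based item-item similarities from raw text.
class ContentSimilaritySketch {

  // Crude stand-in for a Lucene analyzer: lowercase, split on non-letters, count terms.
  static Map<String, Integer> termFrequencies(String text) {
    Map<String, Integer> tf = new HashMap<>();
    for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
      if (!token.isEmpty()) {
        tf.merge(token, 1, Integer::sum);
      }
    }
    return tf;
  }

  // Euclidean length of a term-count vector.
  static double norm(Map<String, Integer> v) {
    double sum = 0.0;
    for (int count : v.values()) {
      sum += (double) count * count;
    }
    return Math.sqrt(sum);
  }

  // Cosine similarity: dot product divided by the product of the vector lengths.
  static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
    double dot = 0.0;
    for (Map.Entry<String, Integer> e : a.entrySet()) {
      Integer other = b.get(e.getKey());
      if (other != null) {
        dot += (double) e.getValue() * other;
      }
    }
    double denom = norm(a) * norm(b);
    return denom == 0.0 ? 0.0 : dot / denom;
  }

  // One "itemID1,itemID2,similarity" line per unordered pair of documents.
  static List<String> similarityLines(Map<Long, String> docs) {
    List<Long> ids = new ArrayList<>(docs.keySet());
    Collections.sort(ids);
    List<String> lines = new ArrayList<>();
    for (int i = 0; i < ids.size(); i++) {
      for (int j = i + 1; j < ids.size(); j++) {
        double sim = cosine(termFrequencies(docs.get(ids.get(i))),
                            termFrequencies(docs.get(ids.get(j))));
        lines.add(ids.get(i) + "," + ids.get(j) + "," + sim);
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    Map<Long, String> docs = new HashMap<>();
    docs.put(1L, "Apache Mahout recommender");
    docs.put(2L, "Apache Lucene search");
    docs.put(3L, "cooking pasta");
    similarityLines(docs).forEach(System.out::println);
  }
}
```

Comparing every pair is quadratic in the number of documents, which is tolerable here only because, as noted above, document similarities change little over time and can be computed once in advance.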

On Thu, Jul 23, 2009 at 6:41 PM, Claudia Grieco <gr...@crmpa.unisa.it> wrote:
> Hi guys,
>
> I know that Mahout supports only collaborative filtering, but let's see if
> this approach makes sense:
>
> Item-based recommenders can be initialized with pre-computed item-item
> similarities, right? And what if those item similarities are computed using
> a Content Based technique (example off the top of my head, Cosine distance
> between the text of two documents computed using Lucene)? Am I missing
> something?
>
> Thanks for your help
>
> Claudia
>
>

Re: Using Item Based recommenders as content based recommenders

Posted by Ted Dunning <te...@gmail.com>.
This can work just fine.

Basically, you are precomputing the results of a similar document search.
Naive implementation of similar document search can be relatively slow, but
it can be sped up dramatically by simply using a filter to pull the most
interesting terms out of a document before doing the search.

Once you do that, then it doesn't make a lot of sense to do the searches
ahead of time, especially if you cache the results of the real-time search.
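
The two ideas above (filter a document down to its most interesting terms, then cache real-time search results) might be sketched as follows. This is an assumption-laden illustration: "interesting" is taken to mean a high TF-IDF weight, which the message does not specify, the cache is a plain LRU map, and all names are hypothetical. The actual search step is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: pick the highest-weighted terms of a document and
// keep recent similar-document results in a small LRU cache.
class InterestingTermsSketch {

  // Assumed notion of "interesting": term frequency times log(numDocs / docFreq).
  static double weight(String term, Map<String, Integer> tf,
                       Map<String, Integer> docFreq, int numDocs) {
    int df = docFreq.getOrDefault(term, 1);
    return tf.get(term) * Math.log((double) numDocs / df);
  }

  // Returns up to n terms, highest weight first; ties are broken alphabetically.
  static List<String> topTerms(Map<String, Integer> tf,
                               Map<String, Integer> docFreq, int numDocs, int n) {
    List<String> terms = new ArrayList<>(tf.keySet());
    terms.sort(Comparator
        .comparingDouble((String t) -> -weight(t, tf, docFreq, numDocs))
        .thenComparing(Comparator.<String>naturalOrder()));
    return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
  }

  // Access-ordered LinkedHashMap that evicts the least recently used entry
  // once more than maxEntries are stored.
  static <K, V> Map<K, V> lruCache(final int maxEntries) {
    return new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
      }
    };
  }
}
```

A query built from only the top few terms touches far fewer postings than one built from the whole document, which is where the speed-up described above would come from; the cache then avoids repeating the search for popular documents.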

On Thu, Jul 23, 2009 at 10:41 AM, Claudia Grieco <gr...@crmpa.unisa.it> wrote:

> Item-based recommenders can be initialized with pre-computed item-item
> similarities, right? And what if those item similarities are computed using
> a Content Based technique (example off the top of my head, Cosine distance
> between the text of two documents computed using Lucene)? Am I missing
> something?
>



-- 
Ted Dunning, CTO
DeepDyve