Posted to user@mahout.apache.org by Phoenix Bai <ba...@gmail.com> on 2013/04/01 13:18:41 UTC

Re: Regarding ItemBased Recommendation Results

Raju,

like Sebastian said, it is probably due to the default sampling restriction of
the Hadoop-based implementation:

    maxPrefsPerUserInItemSimilarity: max number of preferences to consider
    per user in the item similarity computation phase, users with more
    preferences will be sampled down (default: 1000)

You could check your data to see whether many users have more than 1000
preferences.
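
A rough way to run that check, as a sketch only: it assumes UserData.csv
has one preference per line in the form userID,itemID,value, and the class
and method names below are made up for illustration (they are not part of
Mahout). It counts preferences per user and reports how many users exceed
the limit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Mahout class.
class PrefCountCheck {

    // Counts preferences per user; assumes the user ID is the first CSV field.
    static Map<String, Integer> prefsPerUser(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String userId = trimmed.split(",")[0];
            counts.merge(userId, 1, Integer::sum);
        }
        return counts;
    }

    // Number of users with strictly more than `limit` preferences.
    static long usersOverLimit(Map<String, Integer> counts, int limit) {
        return counts.values().stream().filter(c -> c > limit).count();
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "UserData.csv";
        Map<String, Integer> counts = prefsPerUser(Files.readAllLines(Paths.get(path)));
        // 1000 is the default quoted above for maxPrefsPerUserInItemSimilarity.
        System.out.println("users: " + counts.size()
                + ", users over 1000 prefs: " + usersOverLimit(counts, 1000));
    }
}
```

If the count is zero, sampling on this option is unlikely to explain the
difference and the cause is probably elsewhere.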


On Fri, Mar 29, 2013 at 12:53 AM, ch raju <ch...@gmail.com> wrote:

>   yeah, recommendations are completely different, out of 10 only one
> suggestion got matched..
> which neighborhoods are you asking about ? I am new to this, didn't
> understand..
>
> Thanks & regards,
> Raju
>
> On Thu, Mar 28, 2013 at 8:25 PM, Koobas <ko...@gmail.com> wrote:
>
> > Are the suggestions completely different, or somewhat different?
> > What about the neighborhoods?
> >
> >
> > On Thu, Mar 28, 2013 at 10:09 AM, ch raju <ch...@gmail.com> wrote:
> >
> > > Hi all,
> > >   I am working on mahout-0.7 recommendations and ran the following
> > > command from the command line:
> > > ./bin/mahout recommenditembased --input UserData.csv --output output/ --similarityClassname SIMILARITY_PEARSON_CORRELATION --numRecommendations 10
> > > I got the recommendations for every user.
> > > I deployed the Mahout integration war on localhost and requested the URL
> > > http://localhost:8080/mahout-integration-0.7/RecommenderServlet?userID=47639&howMany=10
> > > and got results, but when I compare the recommendations from the two
> > > runs for user 47639, the results are completely different.
> > >
> > >  I am using GenericItemBasedRecommender with Pearson correlation
> > > similarity and the same input file. I would like to know why I am getting
> > > different results. Both are item-based, right? Which recommender is
> > > recommenditembased using?
> > >
> > > --
> > > Thanks & Regards,
> > > Raju Chinthala
> > >
> >
>
>
>
> --
> Thanks & Regards,
> Raju Chinthala
>

Re: Regarding ItemBased Recommendation Results

Posted by Sebastian Schelter <ss...@googlemail.com>.
It could also be due to the way in which the Pearson correlation is
calculated in both implementations.

The distributed implementation centers all item vectors, scales them to
unit length and computes dot products afterwards.

The single machine implementation centers only based on the common
interactions between two item vectors AFAIK, which means it discards
users that have not interacted with both items.

I think both are valid approaches. Unfortunately, the second one is not
possible in the current distributed implementation.
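
To make the difference concrete, here is a small sketch of the two
computations as described above. It is not Mahout code: the class and
method names are invented, NaN is used to mark "no preference", and the
sample ratings are made up. The first method centers on the users common
to both items; the second centers each item vector on all of its own
ratings, scales it to unit length, and takes the dot product.

```java
// Hypothetical illustration, not a Mahout class. NaN means "no preference".
class PearsonVariants {

    // Pearson correlation using only users who rated both items.
    static double pearsonOnCommonUsers(double[] a, double[] b) {
        int n = 0;
        double sumA = 0, sumB = 0;
        for (int i = 0; i < a.length; i++) {
            if (!Double.isNaN(a[i]) && !Double.isNaN(b[i])) {
                n++;
                sumA += a[i];
                sumB += b[i];
            }
        }
        if (n == 0) {
            return Double.NaN; // no overlap, similarity undefined
        }
        double meanA = sumA / n, meanB = sumB / n;
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            if (!Double.isNaN(a[i]) && !Double.isNaN(b[i])) {
                double da = a[i] - meanA, db = b[i] - meanB;
                dot += da * db;
                normA += da * da;
                normB += db * db;
            }
        }
        return dot / Math.sqrt(normA * normB);
    }

    // Centers a vector on the mean of all its present entries, then scales
    // it to unit length. Missing entries become 0 and stay 0.
    static double[] centerAndScale(double[] v) {
        int n = 0;
        double sum = 0;
        for (double x : v) {
            if (!Double.isNaN(x)) {
                n++;
                sum += x;
            }
        }
        double mean = n == 0 ? 0 : sum / n;
        double[] out = new double[v.length];
        double squared = 0;
        for (int i = 0; i < v.length; i++) {
            out[i] = Double.isNaN(v[i]) ? 0 : v[i] - mean;
            squared += out[i] * out[i];
        }
        double norm = Math.sqrt(squared);
        if (norm > 0) {
            for (int i = 0; i < out.length; i++) {
                out[i] /= norm;
            }
        }
        return out;
    }

    // Dot product of the independently centered, unit-length item vectors.
    static double pearsonOnCenteredVectors(double[] a, double[] b) {
        double[] ca = centerAndScale(a), cb = centerAndScale(b);
        double dot = 0;
        for (int i = 0; i < ca.length; i++) {
            dot += ca[i] * cb[i];
        }
        return dot;
    }

    public static void main(String[] args) {
        // Made-up ratings from four users; only users 1 and 3 rated both items.
        double[] itemA = {5, 3, 4, Double.NaN};
        double[] itemB = {4, Double.NaN, 5, 1};
        System.out.println("common users only: " + pearsonOnCommonUsers(itemA, itemB));
        System.out.println("centered vectors:  " + pearsonOnCenteredVectors(itemA, itemB));
    }
}
```

On this toy data the two variants even disagree in sign, because the
ratings from users who rated only one of the items shift the mean and the
length of each vector in the second variant. With differences like that
in the item similarities, different top-10 lists are not surprising.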
