Posted to user@mahout.apache.org by han henry <hu...@gmail.com> on 2011/01/14 07:13:06 UTC

Mahout 0.4 seems recommend user's existed items to user.

Hi all,

I have an issue: Mahout 0.4 seems to recommend items that a user already
has.

As I remember, Mahout used to skip a user's existing items when
recommending items to that user.

But I have not found the logic for skipping existing items in Mahout 0.4.

Can anyone confirm this, or let me know where I can find the logic for
skipping existing items?

Best Regards,

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by han henry <hu...@gmail.com>.
I see what you mean. It's an easy and efficient way.

Thanks,

2011/1/14 Sean Owen <sr...@gmail.com>

> ItemFilterAsVectorAndPrefsReducer does #3.
>
> You can always post-process the recommendations however you like and
> ignore whatever items you want.
>
> On Fri, Jan 14, 2011 at 10:19 AM, han henry <hu...@gmail.com> wrote:
> > Hi,Sean and sebastian
> >
> > We have two type preference .
> >
> > 1)  ,Preferences that user does not want to see them ,we store those
> > preference in filterFile.
> > 2)  ,All preferences (include those in the #1) ,also those data can use
> to
> > calculate similarity.
> >
> > We can not recommend those items to user
> >
> > #1, Invalid items or expired items .we store those items in itemSFile.
> > #2, User Non-interested items ,we store those user ,item pairs in
> filterFile
> > .
> > #3, User existed items (user already has those item in preferences ).
> >
> >  ItemFilterAsVectorAndPrefsReducer seems can make  those items been
> skiped
> > in last step.
> >
> > so we do #1 and #2 in the last step
> > (AggregateAndRecommendReducer.java<
> http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java
> >),
> > but I have not found logic to skip #3.
> >
> > Am I right ?
> >
> > Best Regards,
> >
> > 2011/1/14 han henry <hu...@gmail.com>
> >
> >> Thank you Sean and sebastian :)
> >>
> >> 2011/1/14 Sean Owen <sr...@gmail.com>
> >>
> >> Look at ItemFilterAsVectorAndPrefsReducer. This does what you are
> looking
> >>> for.
> >>>
> >>> On Fri, Jan 14, 2011 at 9:17 AM, han henry <hu...@gmail.com>
> wrote:
> >>> > Hi,Sebastian
> >>> >
> >>> > Because my data is on the production ,it 's very large .so sorry that
> I
> >>> can
> >>> > not give you input data.
> >>> >
> >>> > But we can try to review the code .
> >>> >
> >>> > The initial version cooccurence arithmetic has logic to skip user's
> >>> existed
> >>> > items.
> >>> >
> >>> > Best Regards,
> >>>
> >>
> >>
> >
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by Sean Owen <sr...@gmail.com>.
ItemFilterAsVectorAndPrefsReducer does #3.

You can always post-process the recommendations however you like and
ignore whatever items you want.
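
A minimal sketch of that kind of post-processing, in plain Java. The class
and method names (PostFilter, dropExcluded) are hypothetical and not part of
Mahout; it assumes the recommendations have already been read into a list of
item IDs and that the items to ignore are available as a set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PostFilter {

    // Hypothetical helper (not a Mahout API): returns the recommended item
    // IDs in their original order, minus any ID found in the exclusion set.
    public static List<Long> dropExcluded(List<Long> recommended, Set<Long> excluded) {
        List<Long> kept = new ArrayList<>();
        for (Long itemId : recommended) {
            if (!excluded.contains(itemId)) {
                kept.add(itemId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative IDs only: four recommended items, two of which the
        // user already has or does not want to see.
        List<Long> recommended = Arrays.asList(101L, 102L, 103L, 104L);
        Set<Long> excluded = new HashSet<>(Arrays.asList(102L, 104L));
        System.out.println(dropExcluded(recommended, excluded)); // prints [101, 103]
    }
}
```

The exclusion set can hold any mix of invalid items, items the user is not
interested in, and items the user already has, so one pass covers all three
cases.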

On Fri, Jan 14, 2011 at 10:19 AM, han henry <hu...@gmail.com> wrote:
> Hi,Sean and sebastian
>
> We have two type preference .
>
> 1)  ,Preferences that user does not want to see them ,we store those
> preference in filterFile.
> 2)  ,All preferences (include those in the #1) ,also those data can use to
> calculate similarity.
>
> We can not recommend those items to user
>
> #1, Invalid items or expired items .we store those items in itemSFile.
> #2, User Non-interested items ,we store those user ,item pairs in filterFile
> .
> #3, User existed items (user already has those item in preferences ).
>
>  ItemFilterAsVectorAndPrefsReducer seems can make  those items been skiped
> in last step.
>
> so we do #1 and #2 in the last step
> (AggregateAndRecommendReducer.java<http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>),
> but I have not found logic to skip #3.
>
> Am I right ?
>
> Best Regards,
>
> 2011/1/14 han henry <hu...@gmail.com>
>
>> Thank you Sean and sebastian :)
>>
>> 2011/1/14 Sean Owen <sr...@gmail.com>
>>
>> Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking
>>> for.
>>>
>>> On Fri, Jan 14, 2011 at 9:17 AM, han henry <hu...@gmail.com> wrote:
>>> > Hi,Sebastian
>>> >
>>> > Because my data is on the production ,it 's very large .so sorry that I
>>> can
>>> > not give you input data.
>>> >
>>> > But we can try to review the code .
>>> >
>>> > The initial version cooccurence arithmetic has logic to skip user's
>>> existed
>>> > items.
>>> >
>>> > Best Regards,
>>>
>>
>>
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by Sebastian Schelter <ss...@apache.org>.
It's true that already preferred items might be looked at in
AggregateAndRecommendReducer, but the prediction for them will always be
NaN, so they will be filtered out.

--sebastian

On 17.01.2011 16:57, han henry wrote:
> Hi,Sebastian,
>
> I have viewed the code today.
>
> Assume that the output of job partialMultiply as following:
>
> context.write(key, vectorAndPrefs);
>
> ItemA -->(([itemB,0.9],[itemC,0.1]),({user1,user2)),({10,1}))
> ItemB--> (([itemA,0.9]),{user1,user2),(5,1)).
>
> It meas that user1 has existed item itemA and ItemB,it also may
> recommend user1 with itemA or ItemB.
>
> Am I right ?
>
> Best Regards,
>
> --Henry Han
>
>
> 2011/1/14 Sebastian Schelter <ssc@apache.org <ma...@apache.org>>
>
>     Hi Han,
>
>     It's hard to see from the sources how the users' already preferred
>     items (#3) are excluded from the final results but it's definitely done.
>
>     I'll walk you through the code:
>
>     In SimilarityMatrixRowWrapperMapper.map() we map all similar items
>     for each item as a vector, notice that the similarity value of each
>     item to itself is set to NaN here.
>
>     When AggregateAndRecommender computes the final recommendations, it
>     receives a PrefAndSimilarityColumnWritable for each item preferred
>     by the user. Those similarity vectors and preference values are used
>     to compute the weighted sum that gives the prediction value for each
>     item to recommend.
>
>     For each item that has already been preferred by the user we can be
>     sure that there is the NaN value from above added to its sum which
>     makes it NaN too. Finally all NaN predictions are explicitly
>     filtered in AggregateAndRecommendReducer.writeRecommendedItems().
>
>
>     --sebastian
>
>
>
>
>
>
>     On 14.01.2011 11:19, han henry wrote:
>
>         Hi,Sean and sebastian
>
>         We have two type preference .
>
>         1)  ,Preferences that user does not want to see them ,we store those
>         preference in filterFile.
>         2)  ,All preferences (include those in the #1) ,also those data
>         can use to
>         calculate similarity.
>
>         We can not recommend those items to user
>
>         #1, Invalid items or expired items .we store those items in
>         itemSFile.
>         #2, User Non-interested items ,we store those user ,item pairs
>         in filterFile
>         .
>         #3, User existed items (user already has those item in
>         preferences ).
>
>           ItemFilterAsVectorAndPrefsReducer seems can make  those items
>         been skiped
>         in last step.
>
>         so we do #1 and #2 in the last step
>         (AggregateAndRecommendReducer.java<http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>),
>
>         but I have not found logic to skip #3.
>
>         Am I right ?
>
>         Best Regards,
>
>         2011/1/14 han henry<huiwen.han@gmail.com
>         <ma...@gmail.com>>
>
>             Thank you Sean and sebastian :)
>
>             2011/1/14 Sean Owen<srowen@gmail.com <ma...@gmail.com>>
>
>             Look at ItemFilterAsVectorAndPrefsReducer. This does what
>             you are looking
>
>                 for.
>
>                 On Fri, Jan 14, 2011 at 9:17 AM, han
>                 henry<huiwen.han@gmail.com
>                 <ma...@gmail.com>>  wrote:
>
>                     Hi,Sebastian
>
>                     Because my data is on the production ,it 's very
>                     large .so sorry that I
>
>                 can
>
>                     not give you input data.
>
>                     But we can try to review the code .
>
>                     The initial version cooccurence arithmetic has logic
>                     to skip user's
>
>                 existed
>
>                     items.
>
>                     Best Regards,
>
>
>
>


Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by han henry <hu...@gmail.com>.
Hi Sebastian,

I looked through the code today.

Assume that the partialMultiply job produces the following output:

context.write(key, vectorAndPrefs);

ItemA --> (([itemB,0.9],[itemC,0.1]),({user1,user2}),({10,1}))
ItemB --> (([itemA,0.9]),({user1,user2}),({5,1}))

This means that user1 already has both itemA and itemB, yet the job may
still recommend itemA or itemB to user1.

Am I right?

Best Regards,

--Henry Han


2011/1/14 Sebastian Schelter <ss...@apache.org>

> Hi Han,
>
> It's hard to see from the sources how the users' already preferred items
> (#3) are excluded from the final results but it's definitely done.
>
> I'll walk you through the code:
>
> In SimilarityMatrixRowWrapperMapper.map() we map all similar items for each
> item as a vector, notice that the similarity value of each item to itself is
> set to NaN here.
>
> When AggregateAndRecommender computes the final recommendations, it
> receives a PrefAndSimilarityColumnWritable for each item preferred by the
> user. Those similarity vectors and preference values are used to compute the
> weighted sum that gives the prediction value for each item to recommend.
>
> For each item that has already been preferred by the user we can be sure
> that there is the NaN value from above added to its sum which makes it NaN
> too. Finally all NaN predictions are explicitly filtered in
> AggregateAndRecommendReducer.writeRecommendedItems().
>
>
> --sebastian
>
>
>
>
>
>
> On 14.01.2011 11:19, han henry wrote:
>
>> Hi,Sean and sebastian
>>
>> We have two type preference .
>>
>> 1)  ,Preferences that user does not want to see them ,we store those
>> preference in filterFile.
>> 2)  ,All preferences (include those in the #1) ,also those data can use to
>> calculate similarity.
>>
>> We can not recommend those items to user
>>
>> #1, Invalid items or expired items .we store those items in itemSFile.
>> #2, User Non-interested items ,we store those user ,item pairs in
>> filterFile
>> .
>> #3, User existed items (user already has those item in preferences ).
>>
>>  ItemFilterAsVectorAndPrefsReducer seems can make  those items been skiped
>> in last step.
>>
>> so we do #1 and #2 in the last step
>> (AggregateAndRecommendReducer.java<
>> http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java
>> >),
>>
>> but I have not found logic to skip #3.
>>
>> Am I right ?
>>
>> Best Regards,
>>
>> 2011/1/14 han henry<hu...@gmail.com>
>>
>>  Thank you Sean and sebastian :)
>>>
>>> 2011/1/14 Sean Owen<sr...@gmail.com>
>>>
>>> Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking
>>>
>>>> for.
>>>>
>>>> On Fri, Jan 14, 2011 at 9:17 AM, han henry<hu...@gmail.com>
>>>>  wrote:
>>>>
>>>>> Hi,Sebastian
>>>>>
>>>>> Because my data is on the production ,it 's very large .so sorry that I
>>>>>
>>>> can
>>>>
>>>>> not give you input data.
>>>>>
>>>>> But we can try to review the code .
>>>>>
>>>>> The initial version cooccurence arithmetic has logic to skip user's
>>>>>
>>>> existed
>>>>
>>>>> items.
>>>>>
>>>>> Best Regards,
>>>>>
>>>>
>>>
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by Sebastian Schelter <ss...@apache.org>.
Hi Han,

It's hard to see from the sources how the users' already preferred items 
(#3) are excluded from the final results but it's definitely done.

I'll walk you through the code:

In SimilarityMatrixRowWrapperMapper.map() we map all similar items for
each item as a vector. Notice that the similarity value of each item to
itself is set to NaN here.

When AggregateAndRecommendReducer computes the final recommendations, it
receives a PrefAndSimilarityColumnWritable for each item preferred by
the user. Those similarity vectors and preference values are used to
compute the weighted sum that gives the prediction value for each item
to recommend.

For each item that has already been preferred by the user, we can be sure
that the NaN value from above is added to its sum, which makes the sum
NaN too. Finally, all NaN predictions are explicitly filtered in
AggregateAndRecommendReducer.writeRecommendedItems().
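
A minimal sketch of the mechanism described above, in plain Java. This is
not Mahout's code: the class and method names (NaNFilterSketch, predict,
dropNaN) are hypothetical, the numbers are illustrative, and it assumes the
prediction is a similarity-weighted average of the user's preference values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NaNFilterSketch {

    // Similarity-weighted average of preference values. If any similarity
    // is NaN, both sums become NaN and so does the result.
    public static double predict(double[] similarities, double[] prefs) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            numerator += similarities[i] * prefs[i];
            denominator += similarities[i];
        }
        return numerator / denominator;
    }

    // Keeps only the predictions that are real numbers, in insertion order.
    public static Map<String, Double> dropNaN(Map<String, Double> predictions) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : predictions.entrySet()) {
            if (!Double.isNaN(e.getValue())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed scenario: the user prefers itemA (10) and itemB (5).
        Map<String, Double> predictions = new LinkedHashMap<>();
        // Candidate itemA: its self-similarity is NaN, sim(itemB, itemA) = 0.9.
        predictions.put("itemA", predict(new double[] {Double.NaN, 0.9},
                                         new double[] {10.0, 5.0}));
        // Candidate itemC: only itemA is similar to it, with similarity 0.1.
        predictions.put("itemC", predict(new double[] {0.1}, new double[] {10.0}));
        // itemA is dropped because its prediction is NaN; itemC remains.
        System.out.println(dropNaN(predictions).keySet());
    }
}
```

Because NaN propagates through addition and multiplication, a single NaN
self-similarity term is enough to mark an already preferred item, and no
separate lookup of the user's existing items is needed.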


--sebastian





On 14.01.2011 11:19, han henry wrote:
> Hi,Sean and sebastian
>
> We have two type preference .
>
> 1)  ,Preferences that user does not want to see them ,we store those
> preference in filterFile.
> 2)  ,All preferences (include those in the #1) ,also those data can use to
> calculate similarity.
>
> We can not recommend those items to user
>
> #1, Invalid items or expired items .we store those items in itemSFile.
> #2, User Non-interested items ,we store those user ,item pairs in filterFile
> .
> #3, User existed items (user already has those item in preferences ).
>
>   ItemFilterAsVectorAndPrefsReducer seems can make  those items been skiped
> in last step.
>
> so we do #1 and #2 in the last step
> (AggregateAndRecommendReducer.java<http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>),
> but I have not found logic to skip #3.
>
> Am I right ?
>
> Best Regards,
>
> 2011/1/14 han henry<hu...@gmail.com>
>
>> Thank you Sean and sebastian :)
>>
>> 2011/1/14 Sean Owen<sr...@gmail.com>
>>
>> Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking
>>> for.
>>>
>>> On Fri, Jan 14, 2011 at 9:17 AM, han henry<hu...@gmail.com>  wrote:
>>>> Hi,Sebastian
>>>>
>>>> Because my data is on the production ,it 's very large .so sorry that I
>>> can
>>>> not give you input data.
>>>>
>>>> But we can try to review the code .
>>>>
>>>> The initial version cooccurence arithmetic has logic to skip user's
>>> existed
>>>> items.
>>>>
>>>> Best Regards,
>>


Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by han henry <hu...@gmail.com>.
Hi Sean and Sebastian,

We have two types of preferences:

1) Preferences for items that the user does not want to see; we store
these in filterFile.
2) All preferences (including those in 1), which can also be used to
calculate similarity.

There are three kinds of items that we must not recommend to a user:

#1 Invalid or expired items; we store these in itemSFile.
#2 Items the user is not interested in; we store these (user, item) pairs
in filterFile.
#3 Items the user already has (items already in the user's preferences).

ItemFilterAsVectorAndPrefsReducer seems to make sure those items are
skipped in the last step.

So #1 and #2 are handled in the last step
(AggregateAndRecommendReducer.java<http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>),
but I have not found the logic to skip #3.

Am I right?

Best Regards,

2011/1/14 han henry <hu...@gmail.com>

> Thank you Sean and sebastian :)
>
> 2011/1/14 Sean Owen <sr...@gmail.com>
>
> Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking
>> for.
>>
>> On Fri, Jan 14, 2011 at 9:17 AM, han henry <hu...@gmail.com> wrote:
>> > Hi,Sebastian
>> >
>> > Because my data is on the production ,it 's very large .so sorry that I
>> can
>> > not give you input data.
>> >
>> > But we can try to review the code .
>> >
>> > The initial version cooccurence arithmetic has logic to skip user's
>> existed
>> > items.
>> >
>> > Best Regards,
>>
>
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by han henry <hu...@gmail.com>.
Thank you, Sean and Sebastian :)

2011/1/14 Sean Owen <sr...@gmail.com>

> Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking
> for.
>
> On Fri, Jan 14, 2011 at 9:17 AM, han henry <hu...@gmail.com> wrote:
> > Hi,Sebastian
> >
> > Because my data is on the production ,it 's very large .so sorry that I
> can
> > not give you input data.
> >
> > But we can try to review the code .
> >
> > The initial version cooccurence arithmetic has logic to skip user's
> existed
> > items.
> >
> > Best Regards,
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by Sean Owen <sr...@gmail.com>.
Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking for.

On Fri, Jan 14, 2011 at 9:17 AM, han henry <hu...@gmail.com> wrote:
> Hi,Sebastian
>
> Because my data is on the production ,it 's very large .so sorry that I can
> not give you input data.
>
> But we can try to review the code .
>
> The initial version cooccurence arithmetic has logic to skip user's existed
> items.
>
> Best Regards,

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by han henry <hu...@gmail.com>.
Hi Sebastian,

My data is in production and it is very large, so I am sorry that I cannot
give you the input data.

But we can try to review the code.

The initial version of the co-occurrence algorithm had logic to skip a
user's existing items.

Best Regards,

2011/1/14 Sebastian Schelter <ss...@apache.org>

> Hi Han,
>
> I extended the unit test in
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest.testCompleteJob()
> to explicitly check for that behavior and everything seems fine.
>
> Can you provide some input data where you see this happening?
>
>
> --sebastian
>
>
>
> On 14.01.2011 09:43, han henry wrote:
>
>> Hi, Sebastian,
>>
>> I mean this one:
>>
>>
>> http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java
>>
>>
>> 2011/1/14 Sebastian Schelter <ssc@apache.org <ma...@apache.org>>
>>
>>
>>    Hi,
>>
>>    which recommender are you talking about? The distributed recommender
>>    does not do this, I checked it and will include a test for that into
>>    our unit tests.
>>
>>    --sebastian
>>
>>
>>
>>    On 14.01.2011 07:13, han henry wrote:
>>
>>        Hi,All
>>
>>        Now I have a issue ,Mahout 0.4 seems recommend user's existed
>>        items to user.
>>
>>        I remembered that Mahout has skips those user's existed items
>>        when recommend
>>        items to user.
>>
>>        But I have not found the logic for skipping the  existed items
>>        in Mahout
>>        0.4.
>>
>>        Can anyone confirm that or let me know where can find the logic
>>        for skipping
>>        existed items ?
>>
>>        Best Regards,
>>
>>
>>
>>
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by Sebastian Schelter <ss...@apache.org>.
Hi Han,

I extended the unit test in 
org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest.testCompleteJob() 
to explicitly check for that behavior and everything seems fine.

Can you provide some input data where you see this happening?


--sebastian


On 14.01.2011 09:43, han henry wrote:
> Hi, Sebastian,
>
> I mean this one:
>
> http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java
>
>
> 2011/1/14 Sebastian Schelter <ssc@apache.org <ma...@apache.org>>
>
>     Hi,
>
>     which recommender are you talking about? The distributed recommender
>     does not do this, I checked it and will include a test for that into
>     our unit tests.
>
>     --sebastian
>
>
>
>     On 14.01.2011 07:13, han henry wrote:
>
>         Hi,All
>
>         Now I have a issue ,Mahout 0.4 seems recommend user's existed
>         items to user.
>
>         I remembered that Mahout has skips those user's existed items
>         when recommend
>         items to user.
>
>         But I have not found the logic for skipping the  existed items
>         in Mahout
>         0.4.
>
>         Can anyone confirm that or let me know where can find the logic
>         for skipping
>         existed items ?
>
>         Best Regards,
>
>
>


Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by han henry <hu...@gmail.com>.
Hi, Sebastian,

I mean this one:

http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java


2011/1/14 Sebastian Schelter <ss...@apache.org>

> Hi,
>
> which recommender are you talking about? The distributed recommender does
> not do this, I checked it and will include a test for that into our unit
> tests.
>
> --sebastian
>
>
>
> On 14.01.2011 07:13, han henry wrote:
>
>> Hi,All
>>
>> Now I have a issue ,Mahout 0.4 seems recommend user's existed items to
>> user.
>>
>> I remembered that Mahout has skips those user's existed items when
>> recommend
>> items to user.
>>
>> But I have not found the logic for skipping the  existed items in Mahout
>> 0.4.
>>
>> Can anyone confirm that or let me know where can find the logic for
>> skipping
>> existed items ?
>>
>> Best Regards,
>>
>>
>

Re: Mahout 0.4 seems recommend user's existed items to user.

Posted by Sebastian Schelter <ss...@apache.org>.
Hi,

Which recommender are you talking about? The distributed recommender
does not do this. I checked it and will add a test for that to our
unit tests.

--sebastian


On 14.01.2011 07:13, han henry wrote:
> Hi,All
>
> Now I have a issue ,Mahout 0.4 seems recommend user's existed items to user.
>
> I remembered that Mahout has skips those user's existed items when recommend
> items to user.
>
> But I have not found the logic for skipping the  existed items in Mahout
> 0.4.
>
> Can anyone confirm that or let me know where can find the logic for skipping
> existed items ?
>
> Best Regards,
>