Posted to dev@mahout.apache.org by "lariven (JIRA)" <ji...@apache.org> on 2015/06/14 10:05:01 UTC

[jira] [Issue Comment Deleted] (MAHOUT-1739) maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correct

     [ https://issues.apache.org/jira/browse/MAHOUT-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lariven updated MAHOUT-1739:
----------------------------
    Comment: was deleted

(was: 1. From a usage point of view, it makes sense that for each input item the 10 most similar items are output. But the triangular matrix cannot support a "Who Buys X Also Buys Y" recommendation, because it does not contain every item as a key. So what is the intended use of this job?

2. This job takes its input from RowSimilarityJob, whose maxSimilaritiesPerRow param makes sense and really does what we want. It is strange that these two params take the same value but behave differently.
)

> maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correct
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-1739
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1739
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.10.0
>            Reporter: lariven
>              Labels: easyfix, patch
>         Attachments: fix_maxSimilarItemsPerItem_incorrect_behave.patch
>
>
> The similar items that ItemSimilarityJob outputs for each target item may exceed the number set in the maxSimilarItemsPerItem parameter. The following code in ItemSimilarityJob.java, at about line 200, may be the cause:
>         if (itemID < otherItemID) {
>           ctx.write(new EntityEntityWritable(itemID, otherItemID), new DoubleWritable(similarItem.getSimilarity()));
>         } else {
>           ctx.write(new EntityEntityWritable(otherItemID, itemID), new DoubleWritable(similarItem.getSimilarity()));
>         }
> I don't know why itemID needs to be swapped with otherItemID, but I think a single line is enough:
>           ctx.write(new EntityEntityWritable(itemID, otherItemID), new DoubleWritable(similarItem.getSimilarity()));
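
The effect described in the report can be reproduced with a small stand-alone sketch. This is a toy model, not Mahout code: the class ItemPairSwapDemo, its methods, and the sample data are all hypothetical, and plain maps stand in for the mapper output. It assumes each item arrives with its top 2 similar items (a per-item limit of 2) and compares emitting pairs with the smaller ID first against emitting them as (itemID, otherItemID).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of the pair-emitting step; not the actual ItemSimilarityJob code.
class ItemPairSwapDemo {

    // Hypothetical input: each item with its top-2 most similar items.
    static Map<Long, long[]> sampleRows() {
        Map<Long, long[]> rows = new LinkedHashMap<>();
        rows.put(1L, new long[] {2L, 3L});
        rows.put(2L, new long[] {1L, 4L});
        rows.put(3L, new long[] {1L, 4L});
        rows.put(4L, new long[] {1L, 2L});
        return rows;
    }

    // Groups emitted pairs by their first ID. With swap == true the smaller
    // ID is always written first, mirroring the if/else quoted in the report.
    static Map<Long, Set<Long>> emit(Map<Long, long[]> rows, boolean swap) {
        Map<Long, Set<Long>> byKey = new TreeMap<>();
        for (Map.Entry<Long, long[]> row : rows.entrySet()) {
            long itemID = row.getKey();
            for (long otherItemID : row.getValue()) {
                long first = itemID;
                long second = otherItemID;
                if (swap && itemID > otherItemID) {
                    first = otherItemID;
                    second = itemID;
                }
                byKey.computeIfAbsent(first, k -> new TreeSet<>()).add(second);
            }
        }
        return byKey;
    }

    // Largest number of similar items listed under any single key.
    static int maxPerKey(Map<Long, Set<Long>> byKey) {
        int max = 0;
        for (Set<Long> others : byKey.values()) {
            max = Math.max(max, others.size());
        }
        return max;
    }

    public static void main(String[] args) {
        Map<Long, long[]> rows = sampleRows();
        // prints: swapped:   {1=[2, 3, 4], 2=[4], 3=[4]}
        System.out.println("swapped:   " + emit(rows, true));
        // prints: unswapped: {1=[2, 3], 2=[1, 4], 3=[1, 4], 4=[1, 2]}
        System.out.println("unswapped: " + emit(rows, false));
    }
}
```

In this toy data, the swapped variant lists three items under item 1 even though the limit is 2, and item 4 never appears as a key. The unswapped variant keeps every item as a key with exactly 2 entries, at the cost of writing some pairs in both directions.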



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)