Posted to dev@mahout.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/04/05 08:07:33 UTC

[jira] [Commented] (MAHOUT-1622) MultithreadedBatchItemSimilarities outputs incorrect number of similarities.

    [ https://issues.apache.org/jira/browse/MAHOUT-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396096#comment-14396096 ] 

ASF GitHub Bot commented on MAHOUT-1622:
----------------------------------------

GitHub user avati opened a pull request:

    https://github.com/apache/mahout/pull/106

    MAHOUT-1622: MultithreadedBatchItemSimilarities output fix

    Rebased the batchSimilarities.patch attached to MAHOUT-1622 and resolved conflicts. Tests pass locally; ready to merge.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/avati/mahout MAHOUT-1622

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #106
    
----
commit 5e15c62df3406cbe69465eee42fbddb0c439bb8a
Author: Anand Avati <av...@redhat.com>
Date:   2015-04-05T05:59:09Z

    MAHOUT-1622: MultithreadedBatchItemSimilarities output fix
    
    In some cases the Output class in MultithreadedBatchItemSimilarities does
    not output all of the similarity pairs that it should. The number of
    active workers can drop to zero while the while loop is still running, in
    which case the remaining similarities from the finished workers are never
    flushed to the output. This happens because the while loop is conditioned
    only on whether there are active workers. An easy fix is to also check
    that the results structure is not empty, so that the loop exits only when
    the number of active workers is 0 and the result set is empty.
    
    On-behalf-of: Jesse Daniels <je...@gmail.com>
    Signed-off-by: Anand Avati <av...@redhat.com>

----


> MultithreadedBatchItemSimilarities outputs incorrect number of similarities.
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-1622
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1622
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.9
>            Reporter: Jesse Daniels
>            Assignee: Anand Avati
>            Priority: Minor
>              Labels: legacy
>             Fix For: 0.10.0
>
>         Attachments: batchSimilarities.patch
>
>
> In some cases the Output class in MultithreadedBatchItemSimilarities does not output all of the similarity pairs that it should. The number of active workers can drop to zero while the while loop is still running, in which case the remaining similarities from the finished workers are never flushed to the output. This happens because the while loop is conditioned only on whether there are active workers. An easy fix is to also check that the results structure is not empty, so that the loop exits only when the number of active workers is 0 and the result set is empty.
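
The loop condition described above can be illustrated with a small, self-contained sketch. This is not the Mahout source: the class DrainLoopSketch, the methods drain and run, the String pairs, and the queue type are all hypothetical stand-ins chosen for illustration. The sketch models the case where every worker has already finished (active count is 0) while batches are still waiting in the results structure, and compares the workers-only condition with the condition that also checks for remaining results.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; names and types are assumptions, not Mahout's API.
class DrainLoopSketch {

  // Stand-in for the output loop. Returns the number of similarity pairs "written".
  // With alsoCheckResults == false the loop depends only on the active worker count;
  // with alsoCheckResults == true it also keeps running while results remain queued.
  static int drain(BlockingQueue<List<String>> results,
                   AtomicInteger activeWorkers,
                   boolean alsoCheckResults) {
    int written = 0;
    while (activeWorkers.get() != 0 || (alsoCheckResults && !results.isEmpty())) {
      try {
        // Wait briefly for a batch; null means nothing arrived within the timeout.
        List<String> batch = results.poll(10, TimeUnit.MILLISECONDS);
        if (batch != null) {
          written += batch.size();
        }
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop draining.
        Thread.currentThread().interrupt();
        break;
      }
    }
    return written;
  }

  // Builds the problematic state: all workers are done, but two batches
  // (three pairs in total) are still sitting in the results queue.
  static int run(boolean alsoCheckResults) {
    BlockingQueue<List<String>> results = new LinkedBlockingQueue<>();
    results.add(Arrays.asList("item1~item2", "item1~item3"));
    results.add(Arrays.asList("item2~item3"));
    AtomicInteger activeWorkers = new AtomicInteger(0);
    return drain(results, activeWorkers, alsoCheckResults);
  }

  public static void main(String[] args) {
    System.out.println("workers-only condition wrote: " + run(false)); // prints 0
    System.out.println("workers-or-results condition wrote: " + run(true)); // prints 3
  }
}
```

In this sketch the workers-only condition exits immediately and writes nothing, while the combined condition keeps polling until the queue is empty. The real class may differ in its data structures and timeouts; only the shape of the loop condition is taken from the description.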



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)