Posted to dev@mahout.apache.org by "Sebastian Schelter (JIRA)" <ji...@apache.org> on 2011/03/30 16:15:05 UTC

[jira] [Created] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity
---------------------------------------------------------------------------------------------

                 Key: MAHOUT-642
                 URL: https://issues.apache.org/jira/browse/MAHOUT-642
             Project: Mahout
          Issue Type: Bug
            Reporter: Sebastian Schelter
            Assignee: Sebastian Schelter


org.apache.mahout.math.stats.LogLikelihood.logLikelihoodRatio expects the following counts:

* A and B together (k_11)
* B without A (k_12)
* A without B (k_21)
* Neither A nor B (k_22)

It seems to me that org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity and org.apache.mahout.math.hadoop.similarity.vector.DistributedLoglikelihoodVectorSimilarity pass the counts k_12 and k_21 in the wrong order (B without A should come before A without B).
 
Can someone confirm that?
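
To make the expected argument order concrete, here is a minimal sketch of how the four counts could be derived from two sets of observations. It is not taken from the Mahout sources: the class name ContingencyCountsSketch, the method counts, and the sample data are hypothetical and only illustrate the k_11, k_12, k_21, k_22 order listed above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Mahout: builds the 2x2 contingency counts
// in the order k11, k12, k21, k22 described in the issue text.
class ContingencyCountsSketch {

    // a     = observations in which event A occurred
    // b     = observations in which event B occurred
    // total = total number of observations
    static long[] counts(Set<String> a, Set<String> b, long total) {
        Set<String> both = new HashSet<>(a);
        both.retainAll(b);                            // A and B together
        long k11 = both.size();
        long k12 = b.size() - k11;                    // B without A
        long k21 = a.size() - k11;                    // A without B
        long k22 = total - a.size() - b.size() + k11; // neither A nor B
        return new long[] {k11, k12, k21, k22};
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("u1", "u2", "u3"));
        Set<String> b = new HashSet<>(Arrays.asList("u2", "u3", "u4", "u5"));
        // prints [2, 2, 1, 5]
        System.out.println(Arrays.toString(counts(a, b, 10)));
    }
}
```

With this order, the second count is the one that belongs to B alone and the third is the one that belongs to A alone.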


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013270#comment-13013270 ] 

Sean Owen commented on MAHOUT-642:
----------------------------------

Yes, it's symmetric in that sense: it just depends on what you're calling event A and event B. But yes, feel free to change things so that they are at least conceptually consistent.



[jira] [Updated] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-642:
--------------------------------------

    Priority: Minor  (was: Major)



[jira] [Updated] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-642:
--------------------------------------

    Priority: Trivial  (was: Minor)



[jira] [Updated] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-642:
--------------------------------------

    Affects Version/s: 0.5



[jira] [Work started] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on MAHOUT-642 started by Sebastian Schelter.



[jira] [Commented] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012978#comment-13012978 ] 

Sebastian Schelter commented on MAHOUT-642:
-------------------------------------------

I looked at the formula again: the order shouldn't matter for the final score, but it is inconsistent with the documentation and Ted's blog entry, so we should change it.
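
The symmetry can be checked numerically. The sketch below is an illustration only, not the Mahout implementation: it uses the standard entropy-based form of the log-likelihood ratio for a 2x2 table, and the class name LlrSymmetrySketch, the method names, and the sample counts are assumptions made for this example. Swapping k_12 and k_21 transposes the table, which exchanges the row and column entropies and leaves their sum unchanged.

```java
// Illustrative sketch (not Mahout code): log-likelihood ratio of a 2x2 table,
// used to show that swapping k12 and k21 does not change the score.
class LlrSymmetrySketch {

    // x * ln(x), with the convention 0 * ln(0) = 0
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a set of counts: N ln N - sum(k ln k)
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long k : counts) {
            sum += k;
            parts += xLogX(k);
        }
        return xLogX(sum) - parts;
    }

    // G^2 = 2 * (rowEntropy + columnEntropy - matrixEntropy),
    // clamped at zero to absorb floating-point round-off.
    static double llr(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    public static void main(String[] args) {
        double documented = llr(10, 3, 7, 80); // k12 and k21 in the documented order
        double swapped = llr(10, 7, 3, 80);    // k12 and k21 exchanged
        System.out.println(documented);
        System.out.println(swapped);
        // The two scores agree up to floating-point round-off.
        System.out.println(Math.abs(documented - swapped) < 1e-9);
    }
}
```

So the change is about matching the documented argument order, not about correcting the computed similarity values.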



[jira] [Updated] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-642:
--------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.5
           Status: Resolved  (was: Patch Available)



[jira] [Updated] (MAHOUT-642) Wrong parameter order in LoglikelihoodSimilarity and DistributedLoglikelihoodVectorSimilarity

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-642:
--------------------------------------

    Status: Patch Available  (was: In Progress)

