You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@hivemall.apache.org by myui <gi...@git.apache.org> on 2017/09/15 01:32:54 UTC

[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...

GitHub user myui opened a pull request:

    https://github.com/apache/incubator-hivemall/pull/115

    [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in BinaryResponseMeasure (nDCG, MRR, AP)

    ## What changes were proposed in this pull request?
    
    Fixed bug in the BinaryResponse measure such as nDCG, MRR, (M)AP.
    
    ## What type of PR is it?
    
    Bug Fix
    
    ## What is the Jira issue?
    
    https://issues.apache.org/jira/browse/HIVEMALL-124
    
    ## How was this patch tested?
    
    unit tests, manual tests
    
    ## Checklist
    
    (Please remove this section if not needed; check `x` for YES, blank for NO)
    
    - [x] Did you apply source code formatter, i.e., `mvn formatter:format`, for your commit?
    - [ ] Did you run manual test on Hive/Spark CLI on the modified functions?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/myui/incubator-hivemall HIVEMALL-124

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hivemall/pull/115.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #115
    
----
commit bef36ed9dd4bcd11bd46c156d85e8b2006d3811d
Author: Makoto Yui <my...@apache.org>
Date:   2017-09-15T01:28:23Z

    Fixed nDCG, MRR, AP bugs and applied refactoring

----


---

[GitHub] incubator-hivemall pull request #115: [HIVEMALL-124][BUGFIX] Fixed bugs in B...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-hivemall/pull/115


---

[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...

Posted by myui <gi...@git.apache.org>.

Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/115
  
    @takuti could you review this PR?


---

[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...

Posted by coveralls <gi...@git.apache.org>.

Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/115
  
    
    [![Coverage Status](https://coveralls.io/builds/13284227/badge)](https://coveralls.io/builds/13284227)
    
    Coverage decreased (-0.05%) to 40.926% when pulling **cd5cb4ad8ec7954e3e60cf04aa05a209f1d34557 on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.



---

[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...

Posted by myui <gi...@git.apache.org>.

Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/115
  
    Ideally, it should be `ndcg(array(1,3), array(1)) < ndcg(array(1), array(1))` but correct code return 1.0 for both cases.


---

[GitHub] incubator-hivemall issue #115: [HIVEMALL-124][BUGFIX] Fixed bugs in BinaryRe...

Posted by coveralls <gi...@git.apache.org>.

Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/115
  
    
    [![Coverage Status](https://coveralls.io/builds/13285636/badge)](https://coveralls.io/builds/13285636)
    
    Coverage decreased (-0.04%) to 40.929% when pulling **d70c072f97e882ce36f950798182aefa086efb53 on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.



---

[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...

Posted by takuti <gi...@git.apache.org>.

Github user takuti commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/115#discussion_r139082280
  
    --- Diff: core/src/main/java/hivemall/evaluation/BinaryResponsesMeasures.java ---
    @@ -120,48 +148,65 @@ public static int countTruePositive(final List<?> rankedList, final List<?> grou
         }
     
         /**
    -     * Computes Mean Reciprocal Rank (MRR)
    +     * Computes Reciprocal Rank
          *
          * @param rankedList a list of ranked item IDs (first item is highest-ranked)
          * @param groundTruth a collection of positive/correct item IDs
          * @param recommendSize top-`recommendSize` items in `rankedList` are recommended
    -     * @return MRR
    +     * @return Reciprocal Rank
    +     * @link https://en.wikipedia.org/wiki/Mean_reciprocal_rank
          */
    -    public static double MRR(@Nonnull final List<?> rankedList, @Nonnull final List<?> groundTruth,
    -            @Nonnull final int recommendSize) {
    -        for (int i = 0, n = recommendSize; i < n; i++) {
    +    public static double ReciprocalRank(@Nonnull final List<?> rankedList,
    +            @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
    +        Preconditions.checkArgument(recommendSize > 0);
    +
    +        final int k = Math.min(rankedList.size(), recommendSize);
    +        for (int i = 0; i < k; i++) {
                 Object item_id = rankedList.get(i);
                 if (groundTruth.contains(item_id)) {
    -                return 1.0 / (i + 1.0);
    +                return 1.d / (i + 1);
                 }
             }
     
    -        return 0.0;
    +        return 0.d;
         }
     
         /**
    -     * Computes Mean Average Precision (MAP)
    +     * Computes Average Precision (AP)
          *
          * @param rankedList a list of ranked item IDs (first item is highest-ranked)
          * @param groundTruth a collection of positive/correct item IDs
          * @param recommendSize top-`recommendSize` items in `rankedList` are recommended
    -     * @return MAP
    +     * @return AveragePrecision
          */
    -    public static double MAP(@Nonnull final List<?> rankedList, @Nonnull final List<?> groundTruth,
    -            @Nonnull final int recommendSize) {
    +    public static double AveragePrecision(@Nonnull final List<?> rankedList,
    +            @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
    +        Preconditions.checkArgument(recommendSize > 0);
    +
    +        if (groundTruth.isEmpty()) {
    +            if (rankedList.isEmpty()) {
    +                return 1.d;
    +            }
    +            return 0.d;
    +        }
    +
             int nTruePositive = 0;
    -        double sumPrecision = 0.0;
    +        double sumPrecision = 0.d;
     
             // accumulate precision@1 to @recommendSize
    -        for (int i = 0, n = recommendSize; i < n; i++) {
    +        final int k = Math.min(rankedList.size(), recommendSize);
    +        for (int i = 0; i < k; i++) {
                 Object item_id = rankedList.get(i);
                 if (groundTruth.contains(item_id)) {
                     nTruePositive++;
    -                sumPrecision += nTruePositive / (i + 1.0);
    +                sumPrecision += nTruePositive / (i + 1.d);
                 }
             }
     
    -        return sumPrecision / groundTruth.size();
    +        if (nTruePositive == 0) {
    +            return 0.d;
    +        }
    +        return sumPrecision / nTruePositive;
    --- End diff --
    
    👍 


---

[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...

Posted by takuti <gi...@git.apache.org>.

Github user takuti commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/115#discussion_r139072140
  
    --- Diff: core/src/main/java/hivemall/evaluation/BinaryResponsesMeasures.java ---
    @@ -79,8 +91,15 @@ public static double IDCG(final int n) {
          * @return Precision
          */
         public static double Precision(@Nonnull final List<?> rankedList,
    -            @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
    -        return (double) countTruePositive(rankedList, groundTruth, recommendSize) / recommendSize;
    +            @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
    +        if (rankedList.isEmpty()) {
    +            if (groundTruth.isEmpty()) {
    +                return 1.d;
    +            }
    +            return 0.d;
    +        }
    +        final int k = Math.min(rankedList.size(), recommendSize);
    --- End diff --
    
    Computing `Math.min(rankedList.size(), recommendSize)` twice at here and inside of `TruePositives` looks redundant and a little bit confusing. For the sake of simplicity, how about directly writing what `TruePositives` does both in `Precision` and `Recall`, and remove `TruePositives`? Even though the same for-loop appears twice, I personally feel it's much more easier to understand.


---

[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...

Posted by myui <gi...@git.apache.org>.

Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/115#discussion_r139088453
  
    --- Diff: core/src/main/java/hivemall/evaluation/BinaryResponsesMeasures.java ---
    @@ -79,8 +91,15 @@ public static double IDCG(final int n) {
          * @return Precision
          */
         public static double Precision(@Nonnull final List<?> rankedList,
    -            @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
    -        return (double) countTruePositive(rankedList, groundTruth, recommendSize) / recommendSize;
    +            @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
    +        if (rankedList.isEmpty()) {
    +            if (groundTruth.isEmpty()) {
    +                return 1.d;
    +            }
    +            return 0.d;
    +        }
    +        final int k = Math.min(rankedList.size(), recommendSize);
    --- End diff --
    
    Fixed in https://github.com/apache/incubator-hivemall/pull/115/commits/759b088370fb9e6b8d17c46dd9d71f7c1980a78a
    
    Recall is as it is.


---

[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...

Posted by coveralls <gi...@git.apache.org>.

Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/115
  
    
    [![Coverage Status](https://coveralls.io/builds/13281389/badge)](https://coveralls.io/builds/13281389)
    
    Coverage decreased (-0.05%) to 40.928% when pulling **bef36ed9dd4bcd11bd46c156d85e8b2006d3811d on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.



---

[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...

Posted by coveralls <gi...@git.apache.org>.

Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/115
  
    
    [![Coverage Status](https://coveralls.io/builds/13281688/badge)](https://coveralls.io/builds/13281688)
    
    Coverage decreased (-0.05%) to 40.926% when pulling **eb0cb36ba3cef9a850eea0ec25265924df26a659 on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.



---