You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hivemall.apache.org by myui <gi...@git.apache.org> on 2017/09/15 01:32:54 UTC
[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...
GitHub user myui opened a pull request:
https://github.com/apache/incubator-hivemall/pull/115
[WIP][HIVEMALL-124][BUGFIX] Fixed bugs in BinaryResponseMeasure (nDCG, MRR, AP)
## What changes were proposed in this pull request?
Fixed bug in the BinaryResponse measure such as nDCG, MRR, (M)AP.
## What type of PR is it?
Bug Fix
## What is the Jira issue?
https://issues.apache.org/jira/browse/HIVEMALL-124
## How was this patch tested?
unit tests, manual tests
## Checklist
(Please remove this section if not needed; check `x` for YES, blank for NO)
- [x] Did you apply source code formatter, i.e., `mvn formatter:format`, for your commit?
- [ ] Did you run manual test on Hive/Spark CLI on the modified functions?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/myui/incubator-hivemall HIVEMALL-124
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-hivemall/pull/115.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #115
----
commit bef36ed9dd4bcd11bd46c156d85e8b2006d3811d
Author: Makoto Yui <my...@apache.org>
Date: 2017-09-15T01:28:23Z
Fixed nDCG, MRR, AP bugs and applied refactoring
----
---
[GitHub] incubator-hivemall pull request #115: [HIVEMALL-124][BUGFIX] Fixed bugs in B...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/incubator-hivemall/pull/115
---
[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...
Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:
https://github.com/apache/incubator-hivemall/pull/115
@takuti could you review this PR?
---
[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...
Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:
https://github.com/apache/incubator-hivemall/pull/115
[![Coverage Status](https://coveralls.io/builds/13284227/badge)](https://coveralls.io/builds/13284227)
Coverage decreased (-0.05%) to 40.926% when pulling **cd5cb4ad8ec7954e3e60cf04aa05a209f1d34557 on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.
---
[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...
Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:
https://github.com/apache/incubator-hivemall/pull/115
Ideally, it should be `ndcg(array(1,3), array(1)) < ndcg(array(1), array(1))` but correct code return 1.0 for both cases.
---
[GitHub] incubator-hivemall issue #115: [HIVEMALL-124][BUGFIX] Fixed bugs in BinaryRe...
Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:
https://github.com/apache/incubator-hivemall/pull/115
[![Coverage Status](https://coveralls.io/builds/13285636/badge)](https://coveralls.io/builds/13285636)
Coverage decreased (-0.04%) to 40.929% when pulling **d70c072f97e882ce36f950798182aefa086efb53 on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.
---
[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...
Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on a diff in the pull request:
https://github.com/apache/incubator-hivemall/pull/115#discussion_r139082280
--- Diff: core/src/main/java/hivemall/evaluation/BinaryResponsesMeasures.java ---
@@ -120,48 +148,65 @@ public static int countTruePositive(final List<?> rankedList, final List<?> grou
}
/**
- * Computes Mean Reciprocal Rank (MRR)
+ * Computes Reciprocal Rank
*
* @param rankedList a list of ranked item IDs (first item is highest-ranked)
* @param groundTruth a collection of positive/correct item IDs
* @param recommendSize top-`recommendSize` items in `rankedList` are recommended
- * @return MRR
+ * @return Reciprocal Rank
+ * @link https://en.wikipedia.org/wiki/Mean_reciprocal_rank
*/
- public static double MRR(@Nonnull final List<?> rankedList, @Nonnull final List<?> groundTruth,
- @Nonnull final int recommendSize) {
- for (int i = 0, n = recommendSize; i < n; i++) {
+ public static double ReciprocalRank(@Nonnull final List<?> rankedList,
+ @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
+ Preconditions.checkArgument(recommendSize > 0);
+
+ final int k = Math.min(rankedList.size(), recommendSize);
+ for (int i = 0; i < k; i++) {
Object item_id = rankedList.get(i);
if (groundTruth.contains(item_id)) {
- return 1.0 / (i + 1.0);
+ return 1.d / (i + 1);
}
}
- return 0.0;
+ return 0.d;
}
/**
- * Computes Mean Average Precision (MAP)
+ * Computes Average Precision (AP)
*
* @param rankedList a list of ranked item IDs (first item is highest-ranked)
* @param groundTruth a collection of positive/correct item IDs
* @param recommendSize top-`recommendSize` items in `rankedList` are recommended
- * @return MAP
+ * @return AveragePrecision
*/
- public static double MAP(@Nonnull final List<?> rankedList, @Nonnull final List<?> groundTruth,
- @Nonnull final int recommendSize) {
+ public static double AveragePrecision(@Nonnull final List<?> rankedList,
+ @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
+ Preconditions.checkArgument(recommendSize > 0);
+
+ if (groundTruth.isEmpty()) {
+ if (rankedList.isEmpty()) {
+ return 1.d;
+ }
+ return 0.d;
+ }
+
int nTruePositive = 0;
- double sumPrecision = 0.0;
+ double sumPrecision = 0.d;
// accumulate precision@1 to @recommendSize
- for (int i = 0, n = recommendSize; i < n; i++) {
+ final int k = Math.min(rankedList.size(), recommendSize);
+ for (int i = 0; i < k; i++) {
Object item_id = rankedList.get(i);
if (groundTruth.contains(item_id)) {
nTruePositive++;
- sumPrecision += nTruePositive / (i + 1.0);
+ sumPrecision += nTruePositive / (i + 1.d);
}
}
- return sumPrecision / groundTruth.size();
+ if (nTruePositive == 0) {
+ return 0.d;
+ }
+ return sumPrecision / nTruePositive;
--- End diff --
👍
---
[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...
Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on a diff in the pull request:
https://github.com/apache/incubator-hivemall/pull/115#discussion_r139072140
--- Diff: core/src/main/java/hivemall/evaluation/BinaryResponsesMeasures.java ---
@@ -79,8 +91,15 @@ public static double IDCG(final int n) {
* @return Precision
*/
public static double Precision(@Nonnull final List<?> rankedList,
- @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
- return (double) countTruePositive(rankedList, groundTruth, recommendSize) / recommendSize;
+ @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
+ if (rankedList.isEmpty()) {
+ if (groundTruth.isEmpty()) {
+ return 1.d;
+ }
+ return 0.d;
+ }
+ final int k = Math.min(rankedList.size(), recommendSize);
--- End diff --
Computing `Math.min(rankedList.size(), recommendSize)` twice at here and inside of `TruePositives` looks redundant and a little bit confusing. For the sake of simplicity, how about directly writing what `TruePositives` does both in `Precision` and `Recall`, and remove `TruePositives`? Even though the same for-loop appears twice, I personally feel it's much more easier to understand.
---
[GitHub] incubator-hivemall pull request #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs...
Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:
https://github.com/apache/incubator-hivemall/pull/115#discussion_r139088453
--- Diff: core/src/main/java/hivemall/evaluation/BinaryResponsesMeasures.java ---
@@ -79,8 +91,15 @@ public static double IDCG(final int n) {
* @return Precision
*/
public static double Precision(@Nonnull final List<?> rankedList,
- @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
- return (double) countTruePositive(rankedList, groundTruth, recommendSize) / recommendSize;
+ @Nonnull final List<?> groundTruth, @Nonnegative final int recommendSize) {
+ if (rankedList.isEmpty()) {
+ if (groundTruth.isEmpty()) {
+ return 1.d;
+ }
+ return 0.d;
+ }
+ final int k = Math.min(rankedList.size(), recommendSize);
--- End diff --
Fixed in https://github.com/apache/incubator-hivemall/pull/115/commits/759b088370fb9e6b8d17c46dd9d71f7c1980a78a
Recall is as it is.
---
[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...
Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:
https://github.com/apache/incubator-hivemall/pull/115
[![Coverage Status](https://coveralls.io/builds/13281389/badge)](https://coveralls.io/builds/13281389)
Coverage decreased (-0.05%) to 40.928% when pulling **bef36ed9dd4bcd11bd46c156d85e8b2006d3811d on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.
---
[GitHub] incubator-hivemall issue #115: [WIP][HIVEMALL-124][BUGFIX] Fixed bugs in Bin...
Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:
https://github.com/apache/incubator-hivemall/pull/115
[![Coverage Status](https://coveralls.io/builds/13281688/badge)](https://coveralls.io/builds/13281688)
Coverage decreased (-0.05%) to 40.926% when pulling **eb0cb36ba3cef9a850eea0ec25265924df26a659 on myui:HIVEMALL-124** into **06f2f822008de1990948ffdd81022910e987d548 on apache:master**.
---