Posted to issues@spark.apache.org by "Pablo J. Villacorta (JIRA)" <ji...@apache.org> on 2018/12/12 23:49:00 UTC
[jira] [Created] (SPARK-26351) Documented formula of precision at k does not match the actual code
Pablo J. Villacorta created SPARK-26351:
-------------------------------------------
Summary: Documented formula of precision at k does not match the actual code
Key: SPARK-26351
URL: https://issues.apache.org/jira/browse/SPARK-26351
Project: Spark
Issue Type: Bug
Components: Documentation
Affects Versions: 2.4.0
Reporter: Pablo J. Villacorta
The documented formula for *precision @ k*, which measures the quality of recommendations:
[https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#ranking-systems]
says that j goes from 0 to *min(|D|, k)*, but the code bounds the loop by the number of predictions instead:
[https://github.com/apache/spark/blob/a63e7b2a212bab94d080b00cf1c5f397800a276a/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L65]
{code:java}
val n = math.min(pred.length, k)
{code}
The notation in the Spark documentation defines
D~i~ as the set of ground truth relevant documents for user i, and
R~i~ as the set of recommended documents (i.e. predictions) given for user i.
According to the code, the documentation should say that j goes from 0 to *min(|R~i~|, k)*.
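To make the difference concrete, here is a minimal sketch (not the Spark source) that computes the per-user precision at k with each of the two bounds. The object name PrecisionAtKSketch, the method names documented and implemented, and the sample data are hypothetical and chosen only for illustration; both variants divide the hit count by k, as the formula does.

```scala
object PrecisionAtKSketch {
  // Count the relevant items among the first `limit` predictions and divide by k.
  // `take` never reads past the end of `pred`, so no extra bounds check is needed.
  private def hitsOverK(pred: Seq[Int], truth: Set[Int], k: Int, limit: Int): Double =
    pred.take(limit).count(truth.contains).toDouble / k

  // Upper bound as the documentation is read in this issue: min(|D_i|, k).
  def documented(pred: Seq[Int], truth: Set[Int], k: Int): Double =
    hitsOverK(pred, truth, k, math.min(truth.size, k))

  // Upper bound as in the quoted line of code: min(|R_i|, k).
  def implemented(pred: Seq[Int], truth: Set[Int], k: Int): Double =
    hitsOverK(pred, truth, k, math.min(pred.length, k))

  def main(args: Array[String]): Unit = {
    val pred = Seq(9, 1, 2, 3, 4) // R_i: five recommendations (made-up data)
    val truth = Set(1, 2)         // D_i: two relevant documents (made-up data)
    val k = 5
    println(documented(pred, truth, k))  // only the first 2 predictions are inspected
    println(implemented(pred, truth, k)) // all 5 predictions are inspected
  }
}
```

With this input the documented bound stops after two predictions and finds one hit (0.2), while the bound used in the code inspects all five and finds both relevant documents (0.4). The two only disagree when |D~i~| is smaller than both k and |R~i~|, or when |R~i~| is smaller than |D~i~| and k.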
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)