Posted to issues@spark.apache.org by "Pablo J. Villacorta (JIRA)" <ji...@apache.org> on 2018/12/12 23:49:00 UTC

[jira] [Created] (SPARK-26351) Documented formula of precision at k does not match the actual code

Pablo J. Villacorta created SPARK-26351:
-------------------------------------------

             Summary: Documented formula of precision at k does not match the actual code
                 Key: SPARK-26351
                 URL: https://issues.apache.org/jira/browse/SPARK-26351
             Project: Spark
          Issue Type: Bug
          Components: Documentation
    Affects Versions: 2.4.0
            Reporter: Pablo J. Villacorta


The formula for *precision @ k*, which measures the quality of recommendations, as documented at

[https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#ranking-systems]

says that j goes from 0 to *min(|D|, k)*, but the code bounds the loop by the number of predictions instead:

[https://github.com/apache/spark/blob/a63e7b2a212bab94d080b00cf1c5f397800a276a/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L65]


{code:scala}
val n = math.min(pred.length, k)
{code}


The Spark documentation defines

D~i~ as the set of ground truth relevant documents for user i, and

R~i~ as the set of recommended documents (i.e. predictions) given for user i.

Since pred holds the predictions for a user, the documentation should say that j goes from 0 to *min( |R~i~|, k )*.
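
To make the difference concrete, here is a minimal sketch in plain Scala (no Spark dependency) that computes the per-user term with each bound. The object PrecisionAtKSketch, the helper precisionAtK and its bound parameter are hypothetical names for illustration, not part of the Spark API. The sketch assumes a literal reading of the documented formula, in which only the first min(|D~i~|, k) recommendations are inspected and the count is divided by k.

{code:scala}
// Hypothetical helper for illustration only; not Spark's RankingMetrics.
object PrecisionAtKSketch {

  // pred:  ranked recommendations R_i for one user
  // lab:   ground truth relevant documents D_i for the same user
  // bound: exclusive upper limit of the summation index j
  def precisionAtK(pred: Array[Int], lab: Array[Int], k: Int, bound: Int): Double = {
    require(k > 0, "ranking position k should be positive")
    val labSet = lab.toSet
    // Count relevant items among the first `bound` predictions;
    // `take` never reads past the end of `pred`.
    val hits = pred.take(bound).count(labSet.contains)
    hits.toDouble / k
  }

  def main(args: Array[String]): Unit = {
    val k = 5
    val pred = Array(1, 2, 3, 4, 5) // |R_i| = 5
    val lab = Array(4, 5)           // |D_i| = 2

    // Bound used by the code: min(|R_i|, k) = 5
    val asCoded = precisionAtK(pred, lab, k, math.min(pred.length, k))
    // Bound stated in the documentation: min(|D_i|, k) = 2
    val asDocumented = precisionAtK(pred, lab, k, math.min(lab.length, k))

    // Prints "code bound: 0.4, documented bound: 0.0"
    println(s"code bound: $asCoded, documented bound: $asDocumented")
  }
}
{code}

In this example the two relevant documents sit at ranks 4 and 5, so a sum that stops after |D~i~| = 2 positions misses both of them, while the bound used in the code finds them.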



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org