Posted to issues@spark.apache.org by "Jerry Lam (JIRA)" <ji...@apache.org> on 2018/06/26 00:03:00 UTC

[jira] [Resolved] (SPARK-24652) Strange ALS Implementation for Implicit Feedback

     [ https://issues.apache.org/jira/browse/SPARK-24652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry Lam resolved SPARK-24652.
-------------------------------
    Resolution: Not A Problem

> Strange ALS Implementation for Implicit Feedback
> ------------------------------------------------
>
>                 Key: SPARK-24652
>                 URL: https://issues.apache.org/jira/browse/SPARK-24652
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.3.1
>            Reporter: Jerry Lam
>            Priority: Major
>
> Hi there,
> I'm evaluating the ALS implementation from Spark ML. Does Spark implement the algorithm described in "Collaborative Filtering for Implicit Feedback Datasets"? If it does, I think the implementation returns an incorrect result.
> Here is the example:
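> {code:python}
> from pyspark.ml.recommendation import ALS
> from pyspark.sql import SparkSession
>
> # Reuse the shell's session if one exists; otherwise create one.
> spark = SparkSession.builder.getOrCreate()
>
> # Rank-1 implicit-feedback ALS with no regularization.
> als = ALS(
>     maxIter=100,
>     regParam=0.0,
>     alpha=1.0,
>     nonnegative=False,
>     implicitPrefs=True,
>     rank=1)
> # Two users, two items: each user interacted only with "their own" item.
> ratings = spark.createDataFrame([(0, 0, 1), (1, 1, 1)]).toDF('user', 'item', 'rating')
> als_model = als.fit(ratings)
> reco = als_model.recommendForAllUsers(10)
> reco.show(truncate=False)
> {code}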
> The result is:
> {code:java}
> +----+---------------------------------+
> |user|recommendations                  |
> +----+---------------------------------+
> |0   |[[0, 0.6666667], [1, -0.6666667]]|
> |1   |[[1, 0.6666667], [0, -0.6666667]]|
> +----+---------------------------------+
> {code}
> I expect the result to be:
> {code:java}
> +----+---------------------+
> |user|recommendations      |
> +----+---------------------+
> |0   |[[0, 1.0], [1, -1.0]]|
> |1   |[[1, 1.0], [0, -1.0]]|
> +----+---------------------+
> {code}
> The reason I believe the score should be 1.0 for (user=1, item=1) and for (user=0, item=0) is that, according to the paper, the model should return 1.0 in these two cases when lambda is 0.0 (no regularization).
>  
> Can someone describe which implicit-feedback algorithm Spark is using? If it implements the same paper, why is the result so different? Thank you.
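
For reference, the paper's objective can be checked by hand on this two-user, two-item example. The sketch below is not Spark's code. It is a minimal pure-Python rendering of the alternating least-squares updates from "Collaborative Filtering for Implicit Feedback Datasets", assuming preference p_ui = 1 when the rating is positive (0 otherwise) and confidence c_ui = 1 + alpha * r_ui, with rank 1, alpha = 1 and lambda = 0. The helper names (solve_rank1, transpose, loss) and the starting item factors are made up for illustration.

```python
# Paper-style implicit ALS on the 2x2 example: rank 1, lambda = 0, alpha = 1.
# Assumption: p_ui = 1 if r_ui > 0 else 0, and c_ui = 1 + alpha * r_ui.
ratings = [[1.0, 0.0],
           [0.0, 1.0]]
alpha = 1.0
n = len(ratings)

pref = [[1.0 if r > 0 else 0.0 for r in row] for row in ratings]
conf = [[1.0 + alpha * r for r in row] for row in ratings]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def solve_rank1(fixed, conf_rows, pref_rows):
    """One ALS half-step. With rank 1 and lambda = 0, each normal equation
    (Y^T C Y) x = Y^T C p collapses to a ratio of two scalars."""
    out = []
    for c_row, p_row in zip(conf_rows, pref_rows):
        num = sum(c * p * f for c, p, f in zip(c_row, p_row, fixed))
        den = sum(c * f * f for c, f in zip(c_row, fixed))
        out.append(num / den)
    return out


def loss(pred):
    """Confidence-weighted squared error over ALL user-item pairs."""
    return sum(conf[u][i] * (pref[u][i] - pred[u][i]) ** 2
               for u in range(n) for i in range(n))


item = [1.0, -0.5]  # arbitrary (hypothetical) starting item factors
for _ in range(100):
    user = solve_rank1(item, conf, pref)
    item = solve_rank1(user, transpose(conf), transpose(pref))

pred = [[u * i for i in item] for u in user]
expected = [[1.0, -1.0], [-1.0, 1.0]]

print(pred)
print(loss(pred), loss(expected))
```

Under these assumptions the diagonal predictions settle at about 2/3 and the off-diagonal ones at about -2/3 (the off-diagonal sign follows the signs of the starting factors). The weighted loss of that solution is about 1.33, lower than the 2.0 obtained for the 1.0 / -1.0 matrix, because the unobserved pairs still carry confidence 1 and pull their predictions toward 0. That is consistent with the 0.6666667 values reported above and with the "Not A Problem" resolution.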


