Posted to issues@spark.apache.org by "Jerry Lam (JIRA)" <ji...@apache.org> on 2018/06/26 00:03:00 UTC
[jira] [Resolved] (SPARK-24652) Strange ALS Implementation for Implicit Feedback
[ https://issues.apache.org/jira/browse/SPARK-24652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jerry Lam resolved SPARK-24652.
-------------------------------
Resolution: Not A Problem
> Strange ALS Implementation for Implicit Feedback
> ------------------------------------------------
>
> Key: SPARK-24652
> URL: https://issues.apache.org/jira/browse/SPARK-24652
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.3.1
> Reporter: Jerry Lam
> Priority: Major
>
> Hi there,
> I'm evaluating the ALS implementation in Spark ML. Does Spark implement the algorithm described in "Collaborative Filtering for Implicit Feedback Datasets"? If it does, I think the implementation returns incorrect results.
> Here is the example:
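> {code:python}
> from pyspark.ml.recommendation import ALS
> from pyspark.sql import SparkSession
>
> # Reuse the active session (already available as `spark` in the pyspark shell).
> spark = SparkSession.builder.getOrCreate()
>
> # Rank-1 implicit-feedback model with no regularization.
> als = ALS(
>     maxIter=100,
>     regParam=0.0,
>     alpha=1.0,
>     nonnegative=False,
>     implicitPrefs=True,
>     rank=1)
> # Two users and two items, one interaction each.
> ratings = spark.createDataFrame([(0, 0, 1), (1, 1, 1)]).toDF('user', 'item', 'rating')
> als_model = als.fit(ratings)
> reco = als_model.recommendForAllUsers(10)
> reco.show(truncate=False)
> {code}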
> The result is:
> {code:java}
> +----+---------------------------------+
> |user|recommendations                  |
> +----+---------------------------------+
> |0   |[[0, 0.6666667], [1, -0.6666667]]|
> |1   |[[1, 0.6666667], [0, -0.6666667]]|
> +----+---------------------------------+
> {code}
> I expected the result to be:
> {code:java}
> +----+---------------------+
> |user|recommendations      |
> +----+---------------------+
> |0   |[[0, 1.0], [1, -1.0]]|
> |1   |[[1, 1.0], [0, -1.0]]|
> +----+---------------------+
> {code}
> I believe the score should be 1.0 for (user=0, item=0) and for (user=1, item=1) because, according to the paper, the model should return 1.0 in these two cases when lambda is 0.0 (no regularization).
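
One way to check this expectation is to evaluate the paper's cost function directly on both score tables. The sketch below assumes the objective from "Collaborative Filtering for Implicit Feedback Datasets": the sum over all user/item pairs of c_ui * (p_ui - score_ui)^2, with confidence c_ui = 1 + alpha * r_ui and preference p_ui = 1 when r_ui > 0 and 0 otherwise (the regularization term vanishes at lambda = 0). The names implicit_cost, RATINGS, reported and expected are hypothetical, chosen only for this illustration, and the code is not taken from Spark.

```python
# Minimal sketch, assuming the implicit-feedback objective from the paper:
#   cost = sum over all (u, i) of c_ui * (p_ui - score_ui)^2
# with c_ui = 1 + alpha * r_ui and p_ui = 1 if r_ui > 0 else 0.
# All names are hypothetical; this is not Spark's implementation.

ALPHA = 1.0
# Dense view of the two ratings in the example: r_00 = 1 and r_11 = 1.
RATINGS = [[1.0, 0.0],
           [0.0, 1.0]]


def implicit_cost(scores, ratings=RATINGS, alpha=ALPHA):
    """Confidence-weighted squared error over every user/item pair."""
    total = 0.0
    for score_row, rating_row in zip(scores, ratings):
        for score, rating in zip(score_row, rating_row):
            preference = 1.0 if rating > 0 else 0.0
            confidence = 1.0 + alpha * rating
            total += confidence * (preference - score) ** 2
    return total


# Scores reported by Spark and scores expected in the report (rows = users).
reported = [[2 / 3, -2 / 3], [-2 / 3, 2 / 3]]
expected = [[1.0, -1.0], [-1.0, 1.0]]

cost_reported = implicit_cost(reported)
cost_expected = implicit_cost(expected)
print(cost_reported, cost_expected)
```

Under these assumptions the reported scores cost 4/3 and the expected scores cost 2, because the unobserved pairs (0, 1) and (1, 0) also enter the sum with preference 0 and confidence 1. A rank-1 model cannot score the diagonal at 1.0 while keeping the off-diagonal at 0, so the lower-cost compromise lands on +/-2/3.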
>
> Can someone describe which implicit-feedback algorithm Spark implements? If it implements the same paper, why is the result so different? Thank you.
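
To see what the paper's algorithm itself produces on this input, here is a rough pure-Python sketch of its alternating updates, specialised to rank 1. It assumes the closed-form update from the paper, x_u = (Y^T C^u Y + lambda I)^-1 Y^T C^u p(u), which for rank 1 reduces to a ratio of two scalars. The function names, the dense 2x2 layout and the starting point [1.0, -0.5] are all hypothetical choices for illustration; Spark's own solver and initialization are not reproduced here.

```python
# Rank-1 sketch of the alternating least-squares updates from the paper,
# run on the 2x2 example. Hypothetical names; not Spark's implementation.

ALPHA = 1.0
REG = 0.0  # lambda = 0.0, as in the report
R = [[1.0, 0.0],
     [0.0, 1.0]]  # r_ui, rows = users, columns = items


def solve_side(fixed, rating_rows):
    """Solve each row's weighted least-squares problem for one scalar factor.

    For rank 1, x = sum(c * p * f) / (sum(c * f * f) + lambda), where f runs
    over the fixed factors on the other side.
    """
    solved = []
    for row in rating_rows:
        numerator = 0.0
        denominator = REG
        for factor, rating in zip(fixed, row):
            confidence = 1.0 + ALPHA * rating
            preference = 1.0 if rating > 0 else 0.0
            numerator += confidence * preference * factor
            denominator += confidence * factor * factor
        solved.append(numerator / denominator)
    return solved


def run_als(item_init, iterations=100):
    """Alternate between user and item updates, starting from item_init."""
    items = list(item_init)
    rating_cols = [list(col) for col in zip(*R)]  # item-major view of R
    users = []
    for _ in range(iterations):
        users = solve_side(items, R)
        items = solve_side(users, rating_cols)
    return users, items


users, items = run_als([1.0, -0.5])  # arbitrary start with opposite signs
scores = [[x * y for y in items] for x in users]
for user_id, row in enumerate(scores):
    print(user_id, [round(s, 7) for s in row])
```

With this starting point the scores settle at about 0.6666667 on the diagonal and -0.6666667 off it, the same values as the table reported above. The sign of the off-diagonal scores follows the signs of the initial factors, since the cost only depends on their squares.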
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)