Posted to issues@spark.apache.org by "Shuo Xiang (JIRA)" <ji...@apache.org> on 2014/06/10 00:23:02 UTC

[jira] [Created] (SPARK-2085) Apply user-specific regularization instead of uniform regularization in Alternating Least Squares (ALS)

Shuo Xiang created SPARK-2085:
---------------------------------

             Summary: Apply user-specific regularization instead of uniform regularization in Alternating Least Squares (ALS)
                 Key: SPARK-2085
                 URL: https://issues.apache.org/jira/browse/SPARK-2085
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.0.0
            Reporter: Shuo Xiang
            Priority: Minor


The current implementation of ALS takes a single regularization parameter and applies it to both the user factors and the product factors. This kind of regularization can be less effective when the number of users is significantly larger than the number of products (or vice versa). For example, if we have 10M users and 1K products, the regularization on user factors will dominate the penalty term. Following the discussion in [this thread](http://apache-spark-user-list.1001560.n3.nabble.com/possible-bug-in-Spark-s-ALS-implementation-tt2567.html#a2704), the implementation in this PR will regularize each factor vector by #ratings * lambda.
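
To illustrate the proposed scaling, here is a minimal sketch of a single user-factor update in plain Python. It is not the MLlib code or the code in the PR. The function names (`solve`, `update_user_factor`) and the `weighted` flag are hypothetical and exist only for this example. It assumes the usual explicit-feedback ALS normal equations, (Y^T Y + reg * I) x = Y^T r, where Y holds the factor vectors of the products the user rated, and sets reg = #ratings * lambda as described above.

```python
# Sketch only: one ALS user-factor update with rating-count-weighted
# regularization. All names are hypothetical, not Spark MLlib's API.


def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_user_factor(product_factors, ratings, lam, weighted=True):
    """Solve (Y^T Y + reg * I) x = Y^T r for one user.

    product_factors: factor vectors of the products this user rated
    ratings:         the matching ratings, in the same order
    reg:             lam * len(ratings) if weighted, else lam (uniform)
    """
    rank = len(product_factors[0])
    reg = lam * len(ratings) if weighted else lam
    # Y^T Y, accumulated over the rated products only.
    ata = [[sum(y[i] * y[j] for y in product_factors) for j in range(rank)]
           for i in range(rank)]
    for i in range(rank):
        ata[i][i] += reg
    # Y^T r
    atb = [sum(y[i] * r for y, r in zip(product_factors, ratings))
           for i in range(rank)]
    return solve(ata, atb)
```

With the weighted form, a user with many ratings is penalized in proportion to the amount of data in the least-squares term, so the balance between fit and penalty does not depend on how many ratings each user or product has. The same update would apply to product factors with the roles of users and products swapped.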




--
This message was sent by Atlassian JIRA
(v6.2#6252)