
Posted to reviews@spark.apache.org by techaddict <gi...@git.apache.org> on 2014/04/30 10:07:25 UTC

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

GitHub user techaddict opened a pull request:

    https://github.com/apache/spark/pull/597

    SPARK-1668: Add implicit preference as an option to examples/MovieLensALS

    Add --implicitPrefs as a command-line option to the example app MovieLensALS under examples/

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/techaddict/spark SPARK-1668

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/597.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #597
    
----
commit e3082faddbf2e1449310e1774f43dd44f0b32683
Author: Sandeep <sa...@techaddict.me>
Date:   2014-04-30T07:23:46Z

    SPARK-1668: Add implicit preference as an option to examples/MovieLensALS
    Add --implicitPrefs as a command-line option to the example app MovieLensALS under examples/

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42456355
  
    @techaddict Thanks for running the experiments! The results definitely look better than `3.xxxxx` if we compute RMSE directly for implicit ALS. In the future, we may switch to metrics based on ranking.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12362841
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +107,12 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      splits(1)
    +      .map(x => Rating(x.user, x.product, if(x.rating >= 0) 1.0 else 0.0))
    --- End diff --
    
    `if (x.rating > 2.5)` will always be false. `> 0` is better, since we have already applied `x.rating - 2.5`.
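
To make the arithmetic concrete, here is a minimal sketch in plain Scala (no Spark). The object and function names are illustrative and do not come from the PR. It assumes integer star ratings from 1 to 5 that have already been shifted by 2.5, and compares the two cut-offs discussed above.

```scala
object ShiftedThreshold {
  // MovieLens star ratings after the "- 2.5" shift from the implicit-preference branch.
  val shifted: Seq[Double] = Seq(1.0, 2.0, 3.0, 4.0, 5.0).map(_ - 2.5)

  // Binarize a (shifted) rating with a configurable cut-off.
  def binarize(r: Double, threshold: Double): Double =
    if (r > threshold) 1.0 else 0.0

  def main(args: Array[String]): Unit = {
    // Cut-off meant for the raw scale, applied to shifted values: nothing passes.
    println(shifted.map(binarize(_, 2.5)))
    // Cut-off on the shifted scale: ratings 3, 4 and 5 pass.
    println(shifted.map(binarize(_, 0.0)))
  }
}
```

The largest shifted value is exactly 2.5, so a strict `> 2.5` test can never succeed, while `> 0` separates ratings 3/4/5 from 1/2.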



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12362954
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +107,12 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      splits(1)
    +      .map(x => Rating(x.user, x.product, if(x.rating >= 0) 1.0 else 0.0))
    --- End diff --
    
    Sorry, I thought we only applied `- 2.5` to ratings in `training`.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42275971
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42284620
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42502537
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14793/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12363364
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +107,12 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      splits(1)
    --- End diff --
    
    @mengxr Is this OK?
    ```
    /**
           * 0 means "don't know" and positive values mean "confident that the prediction should be 1".
           * Negative values mean "confident that the prediction should be 0".
           * In this case we use a weighted RMSE. The weight is the absolute value of the
           * confidence. The error is the difference between the prediction and either 1 or 0,
           * depending on whether r is positive or negative.
           */
    ```
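
The weighted RMSE described in this comment could look roughly like the sketch below, in plain Scala without Spark. The names are illustrative. Normalizing by the total weight is an assumption here, since the comment does not say how the weighted errors are averaged.

```scala
object WeightedRmse {
  // Input: (confidence, prediction) pairs, where confidence is the shifted rating.
  def weightedRmse(pairs: Seq[(Double, Double)]): Double = {
    // Entries with confidence 0 ("don't know") carry no weight and are skipped.
    val weighted = pairs.collect { case (conf, pred) if conf != 0.0 =>
      val target = if (conf > 0.0) 1.0 else 0.0 // positive -> 1, negative -> 0
      val weight = math.abs(conf)               // weight is |confidence|
      (weight, weight * (pred - target) * (pred - target))
    }
    val totalWeight = weighted.map(_._1).sum
    // Assumption: divide by the total weight before taking the square root.
    if (totalWeight == 0.0) 0.0 else math.sqrt(weighted.map(_._2).sum / totalWeight)
  }
}
```

For example, `WeightedRmse.weightedRmse(Seq((2.0, 0.5), (-2.0, 0.5)))` weights both errors of 0.5 equally and evaluates to 0.5.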



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388719
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +123,18 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      /**
    --- End diff --
    
    Ditto. Remove the last `*`.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388577
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -62,6 +63,9 @@ object MovieLensALS {
           opt[Unit]("kryo")
             .text(s"use Kryo serialization")
             .action((_, c) => c.copy(kryo = true))
    +      opt[Unit]("implicitPrefs")
    +        .text(s"use implicit preference")
    --- End diff --
    
    remove `s`



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45339833
  
    Simple RMSE is not a great metric for this model, because it treats all errors equally when the model itself does not at all. 1s are much more important than 0s. The predictions are not rating-like. See my comment above.
    
    I usually try to look at metrics that measure how good the top of the ranking is, since this is far more like what the user experiences. MAP or something like area under the curve are about as good as you can hope for, but still somewhat flawed. It's hard to eval recommenders since you have such incomplete information on what the "right" or "relevant" items are.
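
As an illustration of a ranking-based metric such as MAP, here is a minimal sketch in plain Scala. The names are illustrative. Dividing by the number of relevant items is one common convention and is an assumption here; other variants divide by the number of hits or truncate the list at a fixed length.

```scala
object MapSketch {
  // Average precision for one user.
  // ranked:   recommended items, best first.
  // relevant: held-out items the user actually liked.
  def averagePrecision(ranked: Seq[Int], relevant: Set[Int]): Double =
    if (relevant.isEmpty) 0.0
    else {
      var hits = 0
      var sum = 0.0
      for ((item, idx) <- ranked.zipWithIndex if relevant.contains(item)) {
        hits += 1
        sum += hits.toDouble / (idx + 1) // precision at this position
      }
      sum / relevant.size
    }

  // Mean of the per-user average precision.
  def meanAveragePrecision(perUser: Seq[(Seq[Int], Set[Int])]): Double =
    if (perUser.isEmpty) 0.0
    else perUser.map { case (ranked, relevant) => averagePrecision(ranked, relevant) }.sum / perUser.size
}
```

Only the positions of relevant items matter, so errors near the top of the list cost more than errors further down.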



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42405181
  
     Merged build triggered. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45647190
  
    The results depend heavily on the choice of parameters. Did you try some degree of search for the best lambda / # features? It's quite possible to make a model that can't predict anything. I have generally found ALS works fine on the MovieLens data set.
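
A parameter search of this kind can be as simple as the grid sketched below, in plain Scala. `evaluate` is a hypothetical placeholder: it is assumed to train a model for a given number of features and lambda and to return a held-out error where lower is better. It is not an MLlib function.

```scala
object GridSearch {
  // Returns (rank, lambda, score) for the best-scoring combination.
  // Both parameter lists must be non-empty, otherwise minBy throws.
  def bestParams(ranks: Seq[Int], lambdas: Seq[Double])(
      evaluate: (Int, Double) => Double): (Int, Double, Double) = {
    val results = for {
      rank <- ranks
      lambda <- lambdas
    } yield (rank, lambda, evaluate(rank, lambda))
    results.minBy(_._3)
  }
}
```

For a metric where higher is better, such as MAP, negate the score or use `maxBy` instead.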



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42273881
  
    Merged build started. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42256964
  
    MovieLens ratings are on a scale of 1-5:
    
    ~~~
    5: Must see
    4: Will enjoy
    3: It's okay
    2: Fairly bad
    1: Awful
    ~~~
    
    So we should not recommend a movie if the predicted rating is less than `3`. To map ratings to confidence scores, I would use `5 -> 2, 4 -> 1, 3 -> 0, 2 -> -1, 1 -> -2` or `5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5`. The latter mapping means unobserved entries are generally between `It's okay` and `Fairly bad`.
    
    For evaluation, the mapping should be `if (r >= 3) 1.0 else 0.0` for MovieLens ratings, and I agree with @srowen on weighted RMSE.
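
The two candidate mappings and the evaluation target from this comment can be written out as follows. This is a sketch in plain Scala with illustrative function names, assuming integer star ratings from 1 to 5.

```scala
object RatingMappings {
  // First mapping: 5 -> 2, 4 -> 1, 3 -> 0, 2 -> -1, 1 -> -2.
  def centered(rating: Double): Double = rating - 3.0

  // Second mapping: 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5.
  def halfShifted(rating: Double): Double = rating - 2.5

  // Evaluation target on the raw scale: 3 and above counts as a positive preference.
  def toTarget(rating: Double): Double = if (rating >= 3.0) 1.0 else 0.0
}
```

With the second mapping no observed rating maps to exactly 0, and for integer star ratings `toTarget(r)` agrees with testing `halfShifted(r) > 0`.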
    




[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42275972
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14708/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12363376
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +107,12 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      splits(1)
    --- End diff --
    
    Will put it on top of this line.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42273835
  
    @mengxr Is it good now?



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42409178
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14770/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42404869
  
    Merged build finished. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42264121
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42463271
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14777/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42502536
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42404871
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14769/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12362778
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,11 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble - 2.5)
    --- End diff --
    
    ```
            /**
             * MovieLens ratings are on a scale of 1-5:
             * 5: Must see
             * 4: Will enjoy
             * 3: It's okay
             * 2: Fairly bad
             * 1: Awful
             * So we should not recommend a movie if the predicted rating is less than 3.
             * To map ratings to confidence scores, we use
             * 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5. This mapping means unobserved
             * entries are generally between It's okay and Fairly bad.
             * The semantics of 0 in this expanded world of non-positive weights
             * are "the same as never having interacted at all" -- which doesn't quite fit.
             * It's possible that 0 values are ignored when constructing the sparse representation,
             * because the 0s are implicit. This would be a problem, at least, a theoretical one.
             */
    ```



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42264123
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14695/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388902
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +157,14 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], implicitPrefs: Boolean) = {
    +
    +    def evalRating(r: Double) =
    +      if (!implicitPrefs) r else if (r > 1.0) 1.0 else if (r < 0.0) 0.0 else r
    --- End diff --
    
    There are two `if` blocks in this line. Better use multiple lines for readability.
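
One possible multi-line layout of the clamp is sketched below as a standalone function. In the diff `evalRating` is a local function that closes over `implicitPrefs`; passing the flag as a parameter is a change made here only so that the sketch is self-contained.

```scala
object EvalRating {
  // Clamp predictions to [0, 1] for implicit preferences; pass them through otherwise.
  def evalRating(r: Double, implicitPrefs: Boolean): Double = {
    if (!implicitPrefs) {
      r
    } else if (r > 1.0) {
      1.0
    } else if (r < 0.0) {
      0.0
    } else {
      r
    }
  }
}
```

For the implicit case the same clamp can also be written as `math.max(math.min(r, 1.0), 0.0)`.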



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42280860
  
     Merged build triggered. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by MLnick <gi...@git.apache.org>.

Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42125746
  
    It is true that implicit prefs predict 0/1 (i.e. a "preference" matrix rather than a "rating" matrix), but the ratings are taken as confidence levels indicating preference (or in the case of negative ratings, lack of preference). So already there is an implicit mapping of 1 if r > 0, 0 if r == 0, with the actual rating being a confidence value in the case of r > 0.
    
    So keeping the ratings input as is would be a reasonable approach. Even better would be to map low ratings to zero or perhaps even negative scores, since a low rating certainly indicates a lack of preference.
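
The split of one input score into a preference and a confidence can be sketched as below, in plain Scala with illustrative names. The linear form `1 + alpha * |r|` is an assumption taken from the usual implicit-feedback ALS formulation, with `alpha` as a tuning parameter; the thread itself does not spell out the formula.

```scala
object ImplicitTargets {
  // Preference: 1 for a positive score, 0 for zero or negative scores.
  def preference(r: Double): Double = if (r > 0.0) 1.0 else 0.0

  // Assumption: confidence grows linearly with |r|, so strong dislikes also carry weight.
  def confidence(r: Double, alpha: Double): Double = 1.0 + alpha * math.abs(r)
}
```

Under this reading a negative score does not mean "unknown". It means "confidently not preferred", which is why mapping low ratings to negative values carries information.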



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45792263
  
    You mentioned trying lots of values but what did you try? What about other test metrics -- to rule out some problem in the evaluation? Maybe you can share some of how you ran the test in a gist.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by coderh <gi...@git.apache.org>.

Github user coderh commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45731942
  
    I have tried different lambda and # features, but nothing changed. To be clear: initially, the MovieLens dataset is divided into a training set (80%) and a test set (20%). The ratings are re-interpreted as `rating - 2.5`, and we take only the positives in both the training and test sets, as we want to simulate an implicit feedback case where no negative feedback exists. All the negative ratings are considered non-observed. Finally, we evaluated EPR on both the training set and the test set. It's about 49%~50% in both cases. Am I doing the right thing?
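
The preparation described in this comment (split first, then shift and keep only positives) might look like the sketch below. It uses plain Scala collections rather than RDDs, and the local `Rating` case class is a stand-in for MLlib's `Rating` so that the code runs without Spark. The names and the seed handling are illustrative.

```scala
import scala.util.Random

object PositiveOnlySplit {
  // Stand-in for org.apache.spark.mllib.recommendation.Rating.
  final case class Rating(user: Int, product: Int, rating: Double)

  // Shift by 2.5 and keep strictly positive scores; the rest count as unobserved.
  def positivesOnly(rs: Seq[Rating]): Seq[Rating] =
    rs.map(r => r.copy(rating = r.rating - 2.5)).filter(_.rating > 0.0)

  // Split roughly 80/20 first, then filter each side.
  def prepare(ratings: Seq[Rating], seed: Long): (Seq[Rating], Seq[Rating]) = {
    val rng = new Random(seed)
    val (training, test) = ratings.partition(_ => rng.nextDouble() < 0.8)
    (positivesOnly(training), positivesOnly(test))
  }
}
```

After this step every remaining entry is a positive, so a metric can only be informative if it compares those entries against items the user has not interacted with.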



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12334664
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +107,12 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      splits(1)
    +      .map(x => Rating(x.user, x.product, if(x.rating >= 0) 1.0 else 0.0))
    --- End diff --
    
    `if(x.rating >= 0)` => `if (x.rating > 2.5)`. So ratings 3/4/5 are positive and 1/2 are negative.
    
    Does this line fit the one above?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42458899
  
     Merged build triggered. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42280865
  
    Merged build started. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-41770652
  
    Merged build started. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by coderh <gi...@git.apache.org>.

Github user coderh commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45640620
  
    I recently tested the expected percentile rank evaluation method proposed in the paper on the MovieLens data set and a real-world data set. However, I got an expected rank of about 50% on both sets. According to the paper, that means implicit ALS does not actually predict anything.
    
    I am not sure if any evaluation has been done like this. 
    
    How can we make sure that implicit ALS is implemented correctly in MLlib without checking code?
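
For reference, one way to compute an expected percentile rank is sketched below in plain Scala. It assumes the usual definition: each held-out interaction contributes the percentile position of its item in that user's ranked list (0 for the top, 1 for the bottom), weighted by the interaction strength. The names and the input layout are illustrative.

```scala
object ExpectedPercentileRank {
  // observed: held-out (user, item, weight) triples with weight > 0.
  // ranked:   per user, all candidate items ordered from most to least recommended.
  def epr(observed: Seq[(Int, Int, Double)], ranked: Map[Int, Seq[Int]]): Double = {
    val terms = for {
      (user, item, weight) <- observed
      list <- ranked.get(user).toSeq
      pos = list.indexOf(item)
      if pos >= 0 // skip items missing from the user's list
    } yield {
      // Percentile rank: 0.0 at the top of the list, 1.0 at the bottom.
      val pct = if (list.size > 1) pos.toDouble / (list.size - 1) else 0.0
      (weight * pct, weight)
    }
    val totalWeight = terms.map(_._2).sum
    if (totalWeight == 0.0) Double.NaN else terms.map(_._1).sum / totalWeight
  }
}
```

Under this definition a random ordering gives about 0.5 on average and lower is better, so a value near 50% is what an uninformative ranking would produce.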
    
    



---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42404743
  
     Merged build triggered. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by coderh <gi...@git.apache.org>.

Github user coderh commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45847264
  
    Here are the values I have tried (the seed is set to 42).
    
    `in` and `out` mean in-sample (training set) and out-of-sample (test set).
    
    # #factor = 12, lambda = 1, alpha = 1
          iter 20  =>
                      MAP_in  = 0.035399855240788425
                      MAP_out = 0.007907455900941737
                      EPR_in  = 0.4902389595686534
                      EPR_out = 0.4931204751436468
    
          iter 40  =>
                      MAP_in  = 0.033210624652830374
                      MAP_out = 0.007158070987320343
                      EPR_in  = 0.4907502816419743
                      EPR_out = 0.49214166351173705
    
    # #factor = 50, alpha = 1, iter = 30
          lambda = 1 =>
                      MAP_in  = 0.029096938174350682
                      MAP_out = 0.006634856811818636
                      EPR_in  = 0.4928298931862564
                      EPR_out = 0.49328834081999423
    
          lambda = 0.001 =>
                      MAP_in  = 0.02903970778838223
                      MAP_out = 0.006569378517284138
                      EPR_in  = 0.4929466287464198
                      EPR_out = 0.49337539845412665
    
    I have not tried other metrics. As noted before, RMSE is not a very good metric here. I will give AUC and ROC a try.
    
    I have posted some code snippets here: two evaluation methods and the main method.
    https://gist.github.com/coderh/05a83be081c1f713e15b


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42498727
  
    Merged build started. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42498714
  
     Merged build triggered. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12334516
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,11 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble - 2.5)
    --- End diff --
    
    Could you summarize our discussion and put a comment here explaining why we use `- 2.5`?


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/597


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12334467
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -62,6 +63,9 @@ object MovieLensALS {
           opt[Unit]("kryo")
             .text(s"use Kryo serialization")
             .action((_, c) => c.copy(kryo = true))
    +      opt[Unit]("implicitPrefs")
    +        .text(s"use Implicit Preference")
    --- End diff --
    
    `use implicit preference`


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-41827788
  
    @techaddict Thanks for working on this JIRA. You also need to change the evaluation code. Implicit ALS predicts 0/1 instead of the original rating. So you need some mapping before computing RMSE.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12407659
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -99,7 +123,18 @@ object MovieLensALS {
     
         val splits = ratings.randomSplit(Array(0.8, 0.2))
         val training = splits(0).cache()
    -    val test = splits(1).cache()
    +    val test = if (params.implicitPrefs) {
    +      /*
    +       * 0 means "don't know" and positive values mean "confident that the prediction should be 1".
    +       * Negative values means "confident that the prediction should be 0".
    +       * We have in this case used some kind of weighted RMSE. The weight is the absolute value of
    +       * the confidence. The error is the difference between prediction and either 1 or 0,
    +       * depending on whether r is positive or negative.
    +       */
    +      splits(1).map(x => Rating(x.user, x.product, if(x.rating > 0) 1.0 else 0.0))
    --- End diff --
    
    Add a space after `if`.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42511606
  
    LGTM. Thanks!


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42463269
  
    Merged build finished. All automated tests passed.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12407647
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,27 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        /*
    +         * MovieLens ratings are on a scale of 1-5:
    +         * 5: Must see
    +         * 4: Will enjoy
    +         * 3: It's okay
    +         * 2: Fairly bad
    +         * 1: Awful
    +         * So we should not recommend a movie if the predicted rating is less than 3.
    +         * To map ratings to confidence scores, we use
    +         * 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5. This mappings means unobserved
    +         * entries are generally between It's okay and Fairly bad.
    +         * The semantics of 0 in this expanded world of non-positive weights
    +         * are "the same as never having interacted at all".
    +         * It's possible that 0 values are ignored when constructing the sparse representation,
    +         * because the 0s are implicit. This would be a problem, at least, a theoretical one.
    --- End diff --
    
    Shall we remove lines 109 and 110? MovieLens data does not have `0` ratings.
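
The rating-to-confidence mapping described in the quoted comment (5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5) can be sketched without Spark as follows. The local `Rating` case class is a stand-in for MLlib's `Rating`, `parseRating` is a hypothetical helper, and the sample lines in the usage note are made up for illustration.

```scala
object ImplicitRatingMapping {
  // Local stand-in for org.apache.spark.mllib.recommendation.Rating.
  final case class Rating(user: Int, product: Int, rating: Double)

  /**
   * Parses a "user::movie::rating::..." line. With implicitPrefs, the rating is
   * shifted by 2.5 so that ratings of 3 and above become positive confidence
   * and ratings of 2 and below become negative confidence.
   */
  def parseRating(line: String, implicitPrefs: Boolean): Rating = {
    val fields = line.split("::")
    val r = fields(2).toDouble
    Rating(fields(0).toInt, fields(1).toInt, if (implicitPrefs) r - 2.5 else r)
  }
}
```

For example, `ImplicitRatingMapping.parseRating("1::1193::5::978300760", implicitPrefs = true)` yields a confidence of 2.5, while a rating of 2 maps to -0.5. Because the ratings are integers from 1 to 5, the shifted value is never exactly 0.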


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12334771
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +135,17 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], params: Params) = {
         val predictions: RDD[Rating] = model.predict(data.map(x => (x.user, x.product)))
    -    val predictionsAndRatings = predictions.map(x => ((x.user, x.product), x.rating))
    -      .join(data.map(x => ((x.user, x.product), x.rating)))
    -      .values
    -    math.sqrt(predictionsAndRatings.map(x => (x._1 - x._2) * (x._1 - x._2)).reduce(_ + _) / n)
    +    val predictionsAndRatings = if (params.implicitPrefs) {
    +      predictions.map(x => (
    +        (x.user, x.product),
    +        if (x.rating > 1.0) 1.0 else if (x.rating < 0.0) 0.0 else x.rating
    --- End diff --
    
    This block is more readable if written as:
    
    ~~~
    val r = if (x.rating > 1.0) 1.0 else if (x.rating < 0.0) 0.0 else x.rating
    ((x.user, x.product), r)
    ~~~


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42277501
  
    @mengxr I'm a bit confused.
    ```
    val ratings = sc.textFile(params.input).map { line =>
          val fields = line.split("::")
          if (params.implicitPrefs) {
            Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble - 2.5)
          } else {
            Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
          }
        }.cache()
    ```
    ```
    val test = splits(1)
          .map(x => Rating(x.user, x.product, if (x.rating >= 0) 1.0 else 0.0))
          .cache()
    ```
    ```scala
      def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
        val predictions: RDD[Rating] = model.predict(data.map(x => (x.user, x.product)))
        val predictionsAndRatings = predictions.map(x => ((x.user, x.product), (x.rating + 2.5) / 5.0))
          .join(data.map(x => ((x.user, x.product), x.rating)))
          .values
        math.sqrt(predictionsAndRatings.map(x => (x._1 - x._2) * (x._1 - x._2)).mean())
      }
    ```


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42274037
  
    @techaddict For training, we should keep the `r - 2.5` values, which indicate confidence. For evaluation, we could use either `if (r > 2.5) 1.0 else 0.0` or the weighted RMSE suggested by @srowen. Also, we need to map the predictions to the interval `[0.0, 1.0]`.
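
As a rough illustration of the first evaluation option above, the sketch below binarizes the original ratings with `if (r > 2.5) 1.0 else 0.0`, clamps each prediction to `[0.0, 1.0]`, and computes RMSE over the resulting pairs. It works on a plain `Seq` of (prediction, original rating) pairs instead of joined RDDs, and the names `toLabel`, `clamp` and `rmse` are hypothetical, not taken from the pull request.

```scala
object ImplicitRmse {
  /** Test label: 1.0 when the original rating r is above 2.5, otherwise 0.0. */
  def toLabel(r: Double): Double = if (r > 2.5) 1.0 else 0.0

  /** Clamps a raw prediction to the interval [0.0, 1.0]. */
  def clamp(pred: Double): Double = math.max(math.min(pred, 1.0), 0.0)

  /** RMSE between clamped predictions and binarized ratings, given (prediction, rating) pairs. */
  def rmse(pairs: Seq[(Double, Double)]): Double = {
    require(pairs.nonEmpty, "need at least one (prediction, rating) pair")
    val squared = pairs.map { case (pred, r) =>
      val err = clamp(pred) - toLabel(r)
      err * err
    }
    math.sqrt(squared.sum / pairs.size)
  }
}
```

With this mapping, a prediction of 1.3 for a 5-star rating and a prediction of -0.2 for a 1-star rating both count as zero error, since the model is only asked to say "prefers" or "does not prefer".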


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12363021
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,11 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble - 2.5)
    --- End diff --
    
    The part starting from `-- which doesn't quite fit` is a little confusing. I think it is okay to end the comment at `at all".` Also we need some comments for the evaluation part.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12334839
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +135,17 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], params: Params) = {
         val predictions: RDD[Rating] = model.predict(data.map(x => (x.user, x.product)))
    -    val predictionsAndRatings = predictions.map(x => ((x.user, x.product), x.rating))
    -      .join(data.map(x => ((x.user, x.product), x.rating)))
    -      .values
    -    math.sqrt(predictionsAndRatings.map(x => (x._1 - x._2) * (x._1 - x._2)).reduce(_ + _) / n)
    +    val predictionsAndRatings = if (params.implicitPrefs) {
    +      predictions.map(x => (
    +        (x.user, x.product),
    +        if (x.rating > 1.0) 1.0 else if (x.rating < 0.0) 0.0 else x.rating
    +      )).join(data.map(x => ((x.user, x.product), x.rating)))
    --- End diff --
    
    Put `join` after `if ... else ...` because both branches use the same join.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42409173
  
    Merged build finished. All automated tests passed.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42458911
  
    Merged build started. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42125991
  
    On this note, recall there was a change a while back to handle the case of negative confidence levels. 0 still means "don't know" and positive values mean "confident that the prediction should be 1". Negative values mean "confident that the prediction should be 0".
    
    I have in this case used some kind of weighted RMSE. The weight is the absolute value of the confidence. The error is the difference between prediction and either 1 or 0, depending on whether r is positive or negative.
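
One possible reading of the weighted RMSE described above is sketched below. The weight is `|r|`, the target is 1 for positive `r` and 0 for negative `r`, and the weighted squared errors are normalized by the total weight. That normalization is an assumption, since the comment only says "some kind of weighted RMSE", and the names are hypothetical.

```scala
object WeightedRmse {
  /**
   * pairs: (prediction, confidence r). The target is 1.0 if r > 0 and 0.0 if r < 0.
   * Entries with r == 0 carry zero weight and therefore do not contribute.
   */
  def weightedRmse(pairs: Seq[(Double, Double)]): Double = {
    val totalWeight = pairs.map { case (_, r) => math.abs(r) }.sum
    require(totalWeight > 0.0, "need at least one entry with non-zero confidence")
    val weightedSquares = pairs.map { case (pred, r) =>
      val target = if (r > 0) 1.0 else 0.0
      val err = pred - target
      math.abs(r) * err * err
    }.sum
    math.sqrt(weightedSquares / totalWeight)
  }
}
```

Compared with plain RMSE against 0/1 labels, this penalizes mistakes on strongly rated items more heavily than mistakes on items whose confidence is close to 0.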


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42278530
  
    `ratings` is correct. `test` also needs to check `implicitPrefs`. If `implicitPrefs` is set, predictions should be mapped to `if (pred > 1.0) 1.0 else if (pred < 0.0) 0.0 else pred`.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42273875
  
     Merged build triggered. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42459977
  
    @mengxr Changes done :smile: Anything else, or is this good to merge?


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388904
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,27 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        /**
    +         * MovieLens ratings are on a scale of 1-5:
    +         * 5: Must see
    +         * 4: Will enjoy
    +         * 3: It's okay
    +         * 2: Fairly bad
    +         * 1: Awful
    +         * So we should not recommend a movie if the predicted rating is less than 3.
    +         * To map ratings to confidence scores, we use
    +         * 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5. This mappings means unobserved
    +         * entries are generally between It's okay and Fairly bad.
    +         * The semantics of 0 in this expanded world of non-positive weights
    +         * are "the same as never having interacted at all"
    +         * It's possible that 0 values are ignored when constructing the sparse representation,
    --- End diff --
    
    I think it's okay and won't cause a problem. Should I leave it as it is?


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12389144
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +157,14 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], implicitPrefs: Boolean) = {
    +
    +    def evalRating(r: Double) =
    --- End diff --
    
    Yes, that's better.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-41773593
  
    Merged build finished. All automated tests passed.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-41770645
  
     Merged build triggered. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42003703
  
    For implicit preferences, I would map the ratings as `{r = 0 --> 0, r > 0 --> 1}`, i.e. change
    `Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)` to 
    `Rating(fields(0).toInt, fields(1).toInt, if (fields(2).toDouble == 0) 0.0 else 1.0)` when `implicitPrefs` is `true`.
    Will this work?


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388603
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,27 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        /**
    --- End diff --
    
    This is not JavaDoc, so please remove the last `*`.


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42261805
  
     Merged build triggered. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12407485
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +157,23 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], implicitPrefs: Boolean) = {
    +
    +    def mapPredictedRating(r: Double) =
    +      if (!implicitPrefs) {
    --- End diff --
    
    Can we change it to the following:
    
    ~~~
    if (implicitPrefs) math.max(math.min(r, 1.0), 0.0) else r
    ~~~


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42261808
  
    Merged build started. 


---

[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12334735
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +135,17 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], params: Params) = {
         val predictions: RDD[Rating] = model.predict(data.map(x => (x.user, x.product)))
    -    val predictionsAndRatings = predictions.map(x => ((x.user, x.product), x.rating))
    -      .join(data.map(x => ((x.user, x.product), x.rating)))
    -      .values
    -    math.sqrt(predictionsAndRatings.map(x => (x._1 - x._2) * (x._1 - x._2)).reduce(_ + _) / n)
    +    val predictionsAndRatings = if (params.implicitPrefs) {
    +      predictions.map(x => (
    --- End diff --
    
    `(x =>` -> `{ x =>`



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by coderh <gi...@git.apache.org>.

Github user coderh commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45898782
  
    Ok, I have found the error in my metric.
    ```
    val itemFactors = model.productFeatures.collect()
    ```
    This line creates an item-factor matrix. The problem is that the item factors are not ordered by item id when they are collected, which leads to a wrong matrix; that is why the result is nonsense.
    
    Adding a `sortBy(_._1)`, like
    ```
    val itemFactors = model.productFeatures.collect().sortBy(_._1)
    ```
    gives an EPR of about 9% (in sample) and 10% (out of sample).
    
    Implicit ALS works. Thanks.
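A small sketch of why the ordering matters, using plain Scala collections instead of an RDD. The `ItemFactorOrderSketch` object, the `ItemFactor` alias, and `toOrderedRows` are hypothetical names for this illustration; the only thing taken from the comment above is the `sortBy(_._1)` step applied to collected `(itemId, factors)` pairs.

```scala
object ItemFactorOrderSketch {
  // Hypothetical stand-in for the (itemId, factors) pairs obtained from collect().
  type ItemFactor = (Int, Array[Double])

  /** Order factor rows by item id so that row i belongs to the i-th smallest item id. */
  def toOrderedRows(collected: Array[ItemFactor]): Array[Array[Double]] =
    collected.sortBy(_._1).map(_._2)
}
```

Without the sort, the rows come back in whatever order the partitions were collected, so indexing the matrix by item position silently pairs items with the wrong factors.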



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388654
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,27 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        /**
    +         * MovieLens ratings are on a scale of 1-5:
    +         * 5: Must see
    +         * 4: Will enjoy
    +         * 3: It's okay
    +         * 2: Fairly bad
    +         * 1: Awful
    +         * So we should not recommend a movie if the predicted rating is less than 3.
    +         * To map ratings to confidence scores, we use
    +         * 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5. This mappings means unobserved
    +         * entries are generally between It's okay and Fairly bad.
    +         * The semantics of 0 in this expanded world of non-positive weights
    +         * are "the same as never having interacted at all"
    --- End diff --
    
    Missing a period at the end.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42269236
  
    Can I make a tiny suggestion to map from ratings to weights with something like "rating - 2.5" instead of "rating - 3"? So that 3 becomes a small positive value like 0.5?
    
    There is an argument that even neutral ratings are weak positive interactions; to have even consumed the item to be able to rate it means you had an interest. 
    
    But more than that, the semantics of 0 in this expanded world of non-positive weights are "the same as never having interacted at all" -- which doesn't quite fit. I don't know if the intermediate sparse representations do this internally, at the moment, but it's possible that 0 values are ignored when constructing the sparse representation, because the 0s are implicit. This would be a problem, at least, a theoretical one.
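As a rough sketch of the mapping proposed above: shifting by 2.5 keeps every integer rating away from 0, so no observed rating is confused with "never interacted". The `RatingWeightSketch` object and `toWeight` name are assumptions for this illustration only.

```scala
object RatingWeightSketch {
  /** Shift a 1-5 star rating by 2.5 so that a neutral 3 becomes a small positive weight (0.5). */
  def toWeight(rating: Double): Double = rating - 2.5
}
```

This gives 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5, whereas "rating - 3" would map a rating of 3 to exactly 0.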



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12362789
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,11 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble - 2.5)
    --- End diff --
    
    Any modifications needed in the comment?



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42328612
  
    @techaddict  Could you try running this example on ml-1m: http://grouplens.org/datasets/movielens/ with `--implicitPrefs`? Try different combinations of rank and number of iterations, and see how our evaluation metric works? Thanks!



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388848
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +157,14 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], implicitPrefs: Boolean) = {
    +
    +    def evalRating(r: Double) =
    --- End diff --
    
    `eval` might mean something different. Shall we use `mapPredictedRating`?



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42404618
  
    @mengxr Here are a few results:
    ```
    implicitPref rank numIterations  lambda -> rmse
    true          10   20             1.0   -> 0.5985187619423589
    true          20   20             1.0   -> 0.5822212152847526
    true          30   20             1.0   -> 0.5780589497218527
    true          30   40             1.0   -> 0.5776665087027969
    true          30   40             0.1   -> 0.5768531690541231
    true          30   40             0.001 -> 0.5756156814748565
    ```



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by techaddict <gi...@git.apache.org>.

Github user techaddict commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42508562
  
    @mengxr done



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42405189
  
    Merged build started. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388703
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -88,7 +92,27 @@ object MovieLensALS {
     
         val ratings = sc.textFile(params.input).map { line =>
           val fields = line.split("::")
    -      Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble)
    +      if (params.implicitPrefs) {
    +        /**
    +         * MovieLens ratings are on a scale of 1-5:
    +         * 5: Must see
    +         * 4: Will enjoy
    +         * 3: It's okay
    +         * 2: Fairly bad
    +         * 1: Awful
    +         * So we should not recommend a movie if the predicted rating is less than 3.
    +         * To map ratings to confidence scores, we use
    +         * 5 -> 2.5, 4 -> 1.5, 3 -> 0.5, 2 -> -0.5, 1 -> -1.5. This mappings means unobserved
    +         * entries are generally between It's okay and Fairly bad.
    +         * The semantics of 0 in this expanded world of non-positive weights
    +         * are "the same as never having interacted at all"
    +         * It's possible that 0 values are ignored when constructing the sparse representation,
    --- End diff --
    
    This sentence may be confusing to users. Shall we hide theory from users?



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42273467
  
    +1 on @srowen 's suggestion.



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42404750
  
    Merged build started. 



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/597#discussion_r12388952
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
    @@ -121,11 +157,14 @@ object MovieLensALS {
       }
     
       /** Compute RMSE (Root Mean Squared Error). */
    -  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], n: Long) = {
    +  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], implicitPrefs: Boolean) = {
    +
    +    def evalRating(r: Double) =
    +      if (!implicitPrefs) r else if (r > 1.0) 1.0 else if (r < 0.0) 0.0 else r
    +
         val predictions: RDD[Rating] = model.predict(data.map(x => (x.user, x.product)))
    -    val predictionsAndRatings = predictions.map(x => ((x.user, x.product), x.rating))
    -      .join(data.map(x => ((x.user, x.product), x.rating)))
    -      .values
    -    math.sqrt(predictionsAndRatings.map(x => (x._1 - x._2) * (x._1 - x._2)).reduce(_ + _) / n)
    +    val predictionsAndRatings = predictions.map(x => ((x.user, x.product), evalRating(x.rating)))
    +        .join(data.map(x => ((x.user, x.product), x.rating))).values
    --- End diff --
    
    2-space indentation?
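The RMSE computation in the diff above can be sketched without Spark on an in-memory sequence of (prediction, rating) pairs. This is only an illustration under assumptions: `RmseSketch`, `rmse`, and the `mapPrediction` parameter are hypothetical names, and the pairs stand in for the joined `predictionsAndRatings` values.

```scala
object RmseSketch {
  /**
   * Root mean squared error over (prediction, observed rating) pairs.
   * mapPrediction is applied to each prediction first, e.g. a clamp for implicit preferences.
   */
  def rmse(
      pairs: Seq[(Double, Double)],
      mapPrediction: Double => Double = (r: Double) => r): Double = {
    require(pairs.nonEmpty, "need at least one (prediction, rating) pair")
    val squaredErrors = pairs.map { case (predicted, observed) =>
      val diff = mapPrediction(predicted) - observed
      diff * diff
    }
    math.sqrt(squaredErrors.sum / pairs.size)
  }
}
```

Dividing by the number of joined pairs, rather than by a separately passed count, keeps the denominator consistent with the values actually summed.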



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by coderh <gi...@git.apache.org>.

Github user coderh commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45338790
  
    Just a question on the result.
    ```
    implicitPref rank numIterations  lambda -> rmse
    true          30   40             1.0   -> 0.5776665087027969
    ```
    Here, 0.57 is the error we make when predicting a 0/1 preference. Isn't that too large?
    It means the predicted preference is off from 0/1 by about 0.57, which doesn't look good enough to me.
    Tell me if I am missing something. Also, how can I tell which RMSE value indicates a good prediction?
    
    The paper on which implicit ALS is based uses expected percentile rank.
    Mean average precision at k (MAP@K) might also be useful for evaluation. Do you have any results of this kind?
    
    Thank you. =)
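For reference, a minimal single-user sketch of expected percentile rank, assuming the usual definition: the observation-weighted mean of each item's percentile position in the user's ranked list, where 0 is the top of the list and 1 is the bottom, so lower is better. The `PercentileRankSketch` object, the method name, and the choice to normalize positions by `n - 1` are assumptions for this illustration, not code from the PR.

```scala
object PercentileRankSketch {
  /**
   * predicted: item id -> predicted score for one user, over all candidate items.
   * observed:  item id -> observed (held-out) weight for the same user.
   * Returns the weighted mean percentile rank in [0, 1]; lower is better.
   */
  def expectedPercentileRank(
      predicted: Map[Int, Double],
      observed: Map[Int, Double]): Double = {
    require(predicted.size > 1, "need at least two candidate items to rank")
    // Rank items by descending predicted score; position 0 is the most recommended item.
    val ranked = predicted.toSeq.sortBy { case (_, score) => -score }.map(_._1)
    val percentile = ranked.zipWithIndex.map { case (item, idx) =>
      item -> idx.toDouble / (ranked.size - 1)
    }.toMap
    // Convert to a Seq first so equal (value, weight) pairs are not collapsed by Map semantics.
    val terms = observed.toSeq.collect {
      case (item, weight) if percentile.contains(item) => (weight * percentile(item), weight)
    }
    val totalWeight = terms.map(_._2).sum
    require(totalWeight > 0.0, "observed weights must sum to a positive value")
    terms.map(_._1).sum / totalWeight
  }
}
```

A value near 0.5 is what random ranking would give on average, which is why figures around 9% to 10% indicate a useful model.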



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42511880
  
    Merged. Thanks!



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-42284625
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14713/



[GitHub] spark pull request: SPARK-1668: Add implicit preference as an opti...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-41773594
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14584/

