Posted to reviews@spark.apache.org by MechCoder <gi...@git.apache.org> on 2015/04/30 13:33:28 UTC

[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

GitHub user MechCoder opened a pull request:

    https://github.com/apache/spark/pull/5807

    [SPARK-6257] [PySpark] [MLlib] MLlib API missing items in Recommendation

    Adds
    
    rank, recommendUsers and recommendProducts to MatrixFactorizationModel in PySpark.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MechCoder/spark spark-6257

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5807.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5807
    
----
commit 5d75c31125530530401fd6bc7cfc4f55ad2162dd
Author: MechCoder <ma...@gmail.com>
Date:   2015-04-30T11:17:20Z

    [SPARK-6257] MLlib API missing items in Recommendation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97763896
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97827180
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31418/
    Test PASSed.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97763900
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31415/
    Test FAILed.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97746090
  
      [Test build #31415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31415/consoleFull) for PR 5807 at commit [`5d75c31`](https://github.com/apache/spark/commit/5d75c31125530530401fd6bc7cfc4f55ad2162dd).




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5807#discussion_r29471580
  
    --- Diff: python/pyspark/mllib/recommendation.py ---
    @@ -105,9 +112,15 @@ class MatrixFactorizationModel(JavaModelWrapper, JavaSaveable, JavaLoader):
         ...     pass
         """
         def predict(self, user, product):
    +        """
    +        Predicts rating for a given user and product.
    --- End diff --
    
    `a given user and product pair` or `the given user and product`




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97772696
  
      [Test build #31418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31418/consoleFull) for PR 5807 at commit [`2b1dd89`](https://github.com/apache/spark/commit/2b1dd8982c536e1b6253c272c21edb73ee7cba13).




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97771293
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by MechCoder <gi...@git.apache.org>.
Github user MechCoder commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97745811
  
    @jkbradley
    
    Does it make sense to have things like "setXXX" in Python, given that attributes are always accessible and that, in this case, the recommendation model is trained using methods of the static object ALS?
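
To make the question concrete, here is a minimal pure-Python sketch of the alternative to a getter/setter pair: exposing `rank` as a read-only property. `ToyModel` is a hypothetical stand-in invented for illustration, not the PySpark class, and the assumption that `rank` is fixed once training has finished is mine.

```python
class ToyModel(object):
    """Hypothetical stand-in for a trained model; not the PySpark class."""

    def __init__(self, rank):
        # Assumed to be decided at training time and never changed afterwards.
        self._rank = rank

    @property
    def rank(self):
        # No setter is defined, so assigning to model.rank raises AttributeError.
        return self._rank


model = ToyModel(rank=4)
print(model.rank)
```

Callers read `model.rank` like a plain attribute, so a "setXXX"-style method adds nothing unless the value is meant to be mutable.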




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5807#discussion_r29471589
  
    --- Diff: python/pyspark/mllib/recommendation.py ---
    @@ -115,11 +128,39 @@ def predictAll(self, user_product):
             return self.call("predict", user_product)
     
         def userFeatures(self):
    +        """
    +        Returns a coupled RDD, where the first element is the user and the
    +        second is an array of features corresponding to that user.
    +        """
             return self.call("getUserFeatures").mapValues(lambda v: array.array('d', v))
     
         def productFeatures(self):
    +        """
    +        Returns a coupled RDD, where the first element is the product and the
    --- End diff --
    
    `coupled` -> `pair`
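
For readers unfamiliar with the term, a pair RDD is an RDD of (key, value) tuples. The sketch below mimics the shape that `userFeatures` returns, using a plain list in place of an RDD. The `map_values` helper is a hypothetical stand-in for `mapValues`, and the ids and numbers are made up.

```python
import array


def map_values(pairs, func):
    """Hypothetical stand-in for RDD.mapValues: transform values, keep keys."""
    return [(key, func(value)) for key, value in pairs]


# Made-up (user id, feature list) pairs; a real model would supply these.
raw_features = [(1, [0.5, 1.0]), (2, [1.5, 0.25])]

# Convert each feature list to a typed array of doubles, as in the diff above.
user_features = map_values(raw_features, lambda v: array.array('d', v))
for user, features in user_features:
    print(user, list(features))
```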




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97771445
  
    Merged build started.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97827162
  
      [Test build #31418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31418/consoleFull) for PR 5807 at commit [`2b1dd89`](https://github.com/apache/spark/commit/2b1dd8982c536e1b6253c272c21edb73ee7cba13).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5807#discussion_r29471587
  
    --- Diff: python/pyspark/mllib/recommendation.py ---
    @@ -115,11 +128,39 @@ def predictAll(self, user_product):
             return self.call("predict", user_product)
     
         def userFeatures(self):
    +        """
    +        Returns a coupled RDD, where the first element is the user and the
    --- End diff --
    
    `coupled` -> `pair`




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97745891
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97827179
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by MechCoder <gi...@git.apache.org>.
Github user MechCoder commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97746104
  
    Also, the release deadline means that PRs would still be reviewed, right? It is just that they would enter the 1.5 branch instead of the 1.4 branch?




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97763871
  
      [Test build #31415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31415/consoleFull) for PR 5807 at commit [`5d75c31`](https://github.com/apache/spark/commit/5d75c31125530530401fd6bc7cfc4f55ad2162dd).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5807#discussion_r29471593
  
    --- Diff: python/pyspark/mllib/recommendation.py ---
    @@ -115,11 +128,39 @@ def predictAll(self, user_product):
             return self.call("predict", user_product)
     
         def userFeatures(self):
    +        """
    +        Returns a coupled RDD, where the first element is the user and the
    +        second is an array of features corresponding to that user.
    +        """
             return self.call("getUserFeatures").mapValues(lambda v: array.array('d', v))
     
         def productFeatures(self):
    +        """
    +        Returns a coupled RDD, where the first element is the product and the
    +        second is an array of features corresponding to that product.
    +        """
             return self.call("getProductFeatures").mapValues(lambda v: array.array('d', v))
     
    +    def recommendUsers(self, product, num):
    +        """
    +        Recommends the top "num" number of products for a given user.
    +        This is done by returning a tuple of Rating objects, the second id (product)
    +        being constant and sorted according to the rating.
    --- End diff --
    
    `Recommends the top "num" number of products for a given user and returns a list of Rating objects sorted by the predicted rating in descending order.`
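
The behaviour described in that wording (score, sort descending, keep the top "num") can be sketched in pure Python. This is an illustration under assumptions, not the PySpark implementation: `Rating` is redefined locally as a namedtuple, `recommend_products` and `dot` are hypothetical helpers, scoring by dot product is assumed, and the factor values are made up.

```python
from collections import namedtuple

# Local stand-in for the Rating objects mentioned above (assumed field order).
Rating = namedtuple("Rating", ["user", "product", "rating"])


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend_products(user, user_vec, product_features, num):
    """Hypothetical helper: score every product for one user, keep the top `num`."""
    scored = [Rating(user, product, dot(user_vec, vec))
              for product, vec in product_features.items()]
    # Highest predicted rating first.
    scored.sort(key=lambda r: r.rating, reverse=True)
    return scored[:num]


# Made-up latent factors for illustration only.
product_features = {10: [1.0, 0.0], 20: [0.0, 1.0], 30: [1.0, 1.0]}
top = recommend_products(user=1, user_vec=[2.0, 1.0],
                         product_features=product_features, num=2)
for r in top:
    print(r)
```

The symmetric case fixes the product and scores every user instead.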




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5807#issuecomment-97745911
  
    Merged build started.




[GitHub] spark pull request: [SPARK-6257] [PySpark] [MLlib] MLlib API missi...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5807#discussion_r29471585
  
    --- Diff: python/pyspark/mllib/recommendation.py ---
    @@ -105,9 +112,15 @@ class MatrixFactorizationModel(JavaModelWrapper, JavaSaveable, JavaLoader):
         ...     pass
         """
         def predict(self, user, product):
    +        """
    +        Predicts rating for a given user and product.
    +        """
             return self._java_model.predict(int(user), int(product))
     
         def predictAll(self, user_product):
    +        """
    +        Returns a list of predicted Ratings for a user and many products.
    --- End diff --
    
    `predicted ratings for input user and product pairs`. This works for more than one user.
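
A small pure-Python sketch of the point being made: a batch prediction takes arbitrary (user, product) pairs, so the input can mix several users. It assumes, as is usual for matrix factorization, that a predicted rating is the dot product of the user's and the product's latent-factor vectors. `predict` and `predict_all` here are hypothetical plain functions and the factors are made up.

```python
def predict(user_features, product_features, user, product):
    """Predicted rating as the dot product of the two latent-factor vectors."""
    return sum(u * p for u, p in zip(user_features[user], product_features[product]))


def predict_all(user_features, product_features, user_product_pairs):
    """Score every (user, product) pair; the pairs may mix different users."""
    return [(user, product, predict(user_features, product_features, user, product))
            for user, product in user_product_pairs]


# Made-up rank-2 factors for two users and two products.
user_features = {1: [1.0, 2.0], 2: [0.5, 0.0]}
product_features = {10: [3.0, 1.0], 20: [2.0, 2.0]}

predictions = predict_all(user_features, product_features,
                          [(1, 10), (2, 20), (1, 20)])
for row in predictions:
    print(row)
```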

