Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2014/08/22 06:58:26 UTC

[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/2094

    [SPARK-2871] [PySpark] add `comp` argument for RDD.max() and RDD.min()

    RDD.max(comp=None)
    
            Find the maximum item in this RDD.
    
            @param comp: A function used to compare two elements; the builtin `cmp`
                         will be used by default.
    
            >>> rdd = sc.parallelize([1.0, 5.0, 43.0, 10.0])
            >>> rdd.max()
            43.0
            >>> rdd.max(lambda a, b: cmp(str(a), str(b)))
            5.0
    
    RDD.min(comp=None)
    
            Find the minimum item in this RDD.
    
            @param comp: A function used to compare two elements; the builtin `cmp`
                         will be used by default.
    
            >>> rdd = sc.parallelize([2.0, 5.0, 43.0, 10.0])
            >>> rdd.min()
            2.0
            >>> rdd.min(lambda a, b: cmp(str(a), str(b)))
            10.0
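
    For reference, a minimal local sketch of the `comp` semantics above, with a
    plain Python list standing in for the RDD (Python 2, where the builtin
    `cmp` and `reduce` exist; `select_max` is a hypothetical helper, not part
    of the patch):

        def select_max(items, comp=cmp):
            # Fold pairwise, keeping whichever element the comparator ranks higher.
            return reduce(lambda a, b: a if comp(a, b) >= 0 else b, items)

        data = [1.0, 5.0, 43.0, 10.0]
        print select_max(data)                                    # 43.0
        print select_max(data, lambda a, b: cmp(str(a), str(b)))  # 5.0 (lexically largest)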

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark cmp

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2094.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2094
    
----
commit dd91e08a92ebace863506cdfe52114ffeec894c9
Author: Davies Liu <da...@gmail.com>
Date:   2014-08-22T04:56:27Z

    add `comp` argument for RDD.max() and RDD.min()

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53161691
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19110/consoleFull) for PR 2094 at commit [`2f63512`](https://github.com/apache/spark/commit/2f63512e10a608722c1e8cd9ab5d22124d389a5d).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16631962
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    +                     will be used by default.
    +
    +        >>> rdd = sc.parallelize([1.0, 5.0, 43.0, 10.0])
    +        >>> rdd.max()
             43.0
    +        >>> rdd.max(lambda a, b: cmp(str(a), str(b)))
    +        5.0
             """
    -        return self.reduce(max)
    +        if comp is not None:
    +            func = lambda a, b: a if comp(a, b) >= 0 else b
    +        else:
    +            func = max
     
    -    def min(self):
    +        return self.reduce(func)
    +
    +    def min(self, comp=None):
             """
             Find the minimum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).min()
    -        1.0
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    +                     will be used by default.
    +
    +        >>> rdd = sc.parallelize([2.0, 5.0, 43.0, 10.0])
    +        >>> rdd.min()
    +        2.0
    +        >>> rdd.min(lambda a, b: cmp(str(a), str(b)))
    +        10.0
             """
    -        return self.reduce(min)
    +        if comp is not None:
    --- End diff --
    
    min and comp have different meanings:
    
       >>> min(1, 2)
       1
       >>> cmp(1, 2)
       -1
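
    (A short Python 2 sketch of the distinction: `min` selects an element,
    while `cmp` only reports order, so a comparator has to be wrapped into a
    selector before it can drive reduce(). `selector_from_comp` is a
    hypothetical name for illustration.)

        def selector_from_comp(comp):
            # Turn a three-way comparator into a "pick the smaller" selector.
            return lambda a, b: a if comp(a, b) <= 0 else b

        pick_min = selector_from_comp(cmp)
        print min(1, 2)       # 1  (an element)
        print cmp(1, 2)       # -1 (only the ordering)
        print pick_min(1, 2)  # 1  (comparator wrapped into a selector)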



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53172158
  
    I like this updated approach of using `key` instead of a comparator, since that's a closer match to Python's `min` function.  Can you update the PR's title and description to reflect this?
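
    (For reference, a minimal sketch of the `key`-based approach being
    discussed; it mirrors the builtin `min`/`max` signature rather than
    quoting the merged patch, and `fold_max` is a hypothetical local stand-in
    for RDD.reduce.)

        def fold_max(items, key=None):
            # max(a, b, key=key) picks the element whose key() is largest,
            # so folding it pairwise matches a per-element RDD.reduce.
            if key is None:
                return reduce(max, items)
            return reduce(lambda a, b: max(a, b, key=key), items)

        data = [1.0, 5.0, 43.0, 10.0]
        print fold_max(data)           # 43.0
        print fold_max(data, key=str)  # 5.0 (largest when compared as strings)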



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16628865
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    --- End diff --
    
    I think `cmp` is the function used inside `max` and `min`, so `cmp` is the natural default for `comp`.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `key` argument for ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53174176
  
    I've merged this into master.  Thanks!



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by mattf <gi...@git.apache.org>.
Github user mattf commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53139806
  
    are you planning to add tests for these?



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `key` argument for ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2094



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53169261
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19113/consoleFull) for PR 2094 at commit [`ccbaf25`](https://github.com/apache/spark/commit/ccbaf25ce6d601bcbc7cb6081128c2b4236925ad).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16631909
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    --- End diff --
    
    Yes, using `comp` here is a bit confusing. The builtin `min` uses `key`, which would be better for Python programmers, but it would differ from the Scala API.
    
    cc @mateiz @rxin @JoshRosen 



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by mattf <gi...@git.apache.org>.
Github user mattf commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53151507
  
    Agreed re doctest. I forgot it was in use.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53143222
  
    @mattf thank you for reviewing this. I think the doc tests are enough; they cover the cases with and without `comp`. What kinds of tests should be added?



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by mattf <gi...@git.apache.org>.
Github user mattf commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16630361
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    +                     will be used by default.
    +
    +        >>> rdd = sc.parallelize([1.0, 5.0, 43.0, 10.0])
    +        >>> rdd.max()
             43.0
    +        >>> rdd.max(lambda a, b: cmp(str(a), str(b)))
    +        5.0
             """
    -        return self.reduce(max)
    +        if comp is not None:
    +            func = lambda a, b: a if comp(a, b) >= 0 else b
    +        else:
    +            func = max
     
    -    def min(self):
    +        return self.reduce(func)
    +
    +    def min(self, comp=None):
             """
             Find the minimum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).min()
    -        1.0
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    +                     will be used by default.
    +
    +        >>> rdd = sc.parallelize([2.0, 5.0, 43.0, 10.0])
    +        >>> rdd.min()
    +        2.0
    +        >>> rdd.min(lambda a, b: cmp(str(a), str(b)))
    +        10.0
             """
    -        return self.reduce(min)
    +        if comp is not None:
    --- End diff --
    
    Consider a default of `comp=min` in the arg list and a test for `comp is not min`.
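
    (Spelled out, the suggestion reads roughly as the Python 2 sketch below.
    As noted elsewhere in the thread, `min` is a selector while `comp` is a
    three-way comparator, so the two can't share a slot without wrapping one
    of them; `rdd_style_min` is a hypothetical local stand-in.)

        def rdd_style_min(items, comp=min):
            if comp is not min:
                # comp is a comparator; wrap it into a selector for reduce().
                func = lambda a, b: a if comp(a, b) <= 0 else b
            else:
                func = min
            return reduce(func, items)

        print rdd_style_min([2.0, 5.0, 43.0, 10.0])  # 2.0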



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53167763
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19113/consoleFull) for PR 2094 at commit [`ccbaf25`](https://github.com/apache/spark/commit/ccbaf25ce6d601bcbc7cb6081128c2b4236925ad).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53169370
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19112/consoleFull) for PR 2094 at commit [`ad7e374`](https://github.com/apache/spark/commit/ad7e374bd834d1e789ff95bba09f0c87ba67c4fd).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by mattf <gi...@git.apache.org>.
Github user mattf commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16627922
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    --- End diff --
    
    nit - the builtin 'max'



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by mattf <gi...@git.apache.org>.
Github user mattf commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16627931
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    +                     will be used by default.
    +
    +        >>> rdd = sc.parallelize([1.0, 5.0, 43.0, 10.0])
    +        >>> rdd.max()
             43.0
    +        >>> rdd.max(lambda a, b: cmp(str(a), str(b)))
    +        5.0
             """
    -        return self.reduce(max)
    +        if comp is not None:
    +            func = lambda a, b: a if comp(a, b) >= 0 else b
    +        else:
    +            func = max
     
    -    def min(self):
    +        return self.reduce(func)
    +
    +    def min(self, comp=None):
             """
             Find the minimum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).min()
    -        1.0
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    --- End diff --
    
    nit - the builtin 'min'



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by mattf <gi...@git.apache.org>.
Github user mattf commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16630356
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    --- End diff --
    
    `cmp` may be used in `max`, but for this function the default is on line 829. Either way, a minor nitpick.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53167601
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19112/consoleFull) for PR 2094 at commit [`ad7e374`](https://github.com/apache/spark/commit/ad7e374bd834d1e789ff95bba09f0c87ba67c4fd).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2094#discussion_r16631953
  
    --- Diff: python/pyspark/rdd.py ---
    @@ -810,23 +810,45 @@ def func(iterator):
     
             return self.mapPartitions(func).fold(zeroValue, combOp)
     
    -    def max(self):
    +    def max(self, comp=None):
             """
             Find the maximum item in this RDD.
     
    -        >>> sc.parallelize([1.0, 5.0, 43.0, 10.0]).max()
    +        @param comp: A function used to compare two elements, the builtin `cmp`
    --- End diff --
    
    We already use `key` in Python instead of `Ordering` in Scala, so I have changed it to `key`.
    
    Also, I would like to add `key` to top(); it would be helpful, for example:
    
    rdd.map(lambda x: (x, 1)).reduceByKey(add).top(20, key=itemgetter(1))
    
    We already have `ord` in Scala. Should I add this in this PR?
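
    (For illustration, a local Python 2 sketch of a key-aware top():
    heapq.nlargest already accepts `key`, which would be the natural
    per-partition building block. `top_n` is a hypothetical name, not the
    proposed API.)

        import heapq
        from operator import itemgetter

        def top_n(items, n, key=None):
            # nlargest with a key mirrors top(n, key=...) on one partition.
            return heapq.nlargest(n, items, key=key)

        counts = [("a", 3), ("b", 1), ("c", 7)]
        print top_n(counts, 2, key=itemgetter(1))  # [('c', 7), ('a', 3)]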



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53025855
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19073/consoleFull) for PR 2094 at commit [`dd91e08`](https://github.com/apache/spark/commit/dd91e08a92ebace863506cdfe52114ffeec894c9).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53160372
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19110/consoleFull) for PR 2094 at commit [`2f63512`](https://github.com/apache/spark/commit/2f63512e10a608722c1e8cd9ab5d22124d389a5d).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53172077
  
    Epydoc renders docstrings + `@params` kind of oddly, but I don't think it's a big deal:
    
    ![image](https://cloud.githubusercontent.com/assets/50748/4022261/077448f0-2b26-11e4-863c-8361aa11d8a5.png)
    
    In the long run, we might want to move to Sphinx, since that seems to be what's popular with most major Python projects. 



[GitHub] spark pull request: [SPARK-2871] [PySpark] add `comp` argument for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2094#issuecomment-53023114
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19073/consoleFull) for PR 2094 at commit [`dd91e08`](https://github.com/apache/spark/commit/dd91e08a92ebace863506cdfe52114ffeec894c9).
     * This patch merges cleanly.

