Posted to reviews@spark.apache.org by ashashwat <gi...@git.apache.org> on 2018/02/04 11:33:43 UTC

[GitHub] spark pull request #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviou...

GitHub user ashashwat opened a pull request:

    https://github.com/apache/spark/pull/20503

    [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for Rows.

    ## What changes were proposed in this pull request?
    
    Fix \_\_repr\_\_ behaviour for Rows.
    
    The \_\_repr\_\_ of a Row assumes every field is a string when column names are missing.
    For example:
    ```
    >>> from pyspark.sql.types import Row
    >>> Row ("Alice", "11")
    <Row(Alice, 11)>
    
    >>> Row (name="Alice", age=11)
    Row(age=11, name='Alice')
    
    >>> Row ("Alice", 11)
    <snip stack trace>
    TypeError: sequence item 1: expected string, int found
    ```
    
    This is because the \_\_repr\_\_ of a Row created without column names passes the
    fields directly to `", ".join(...)`, which requires every item to be a string.
    
    ## How was this patch tested?
    
    Tested manually, and a unit test was added in `python/pyspark/sql/tests.py`.
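
The failure described above can be reproduced without Spark. The following is a minimal sketch, assuming a hypothetical `MiniRow` tuple subclass that stands in for the no-column-name branch of the Row \_\_repr\_\_; it is not the PySpark class itself, and the method names `old_repr` and `new_repr` are invented for the comparison.

```python
class MiniRow(tuple):
    """Hypothetical stand-in for a Row created without column names."""

    def __new__(cls, *values):
        return super().__new__(cls, values)

    def old_repr(self):
        # Mirrors the original expression: str.join() rejects non-string items.
        return "<Row(%s)>" % ", ".join(self)

    def new_repr(self):
        # Mirrors the patched expression: format each field before joining.
        return "<Row(%s)>" % ", ".join("%s" % field for field in self)


row = MiniRow("Alice", 11)

try:
    row.old_repr()
except TypeError as exc:
    print("old:", exc)

print("new:", row.new_repr())
```

With all-string fields such as `MiniRow("Alice", "11")` both variants return the same text, which is why the problem only shows up once a non-string field is present.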

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ashashwat/spark SPARK-23299

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20503.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20503
    
----
commit 6604e9fdaa710cd894b4799390144e404667402e
Author: Shashwat Anand <me...@...>
Date:   2018-02-04T10:27:31Z

    Fix __repr__ behaviour for Rows.
    
    Rows __repr__ assumes data is strings when column name is missing.
    
    Examples,
    >>> Row ("Alice", "11")
    <Row(Alice, 11)>
    
    >>> Row (name="Alice", age=11)
    Row(age=11, name='Alice')
    
    >>> Row ("Alice", 11)
    <snip stack trace>
    TypeError: sequence item 1: expected string, int found
    
    This is because Row () when called without column names assumes
    everything is string.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Can one of the admins verify this patch?


---





[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by ashashwat <gi...@git.apache.org>.
Github user ashashwat commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    @HyukjinKwon `return "<Row(%s)>" % ", ".join("%s" % (fields) for fields in self)` takes care of everything.
    ```
    
    >>> Row ("aa", 11)
    <Row(aa, 11)>
    
    >>> Row (u"아", 11)
    <Row(아, 11)>
    
    >>> Row ("아", 11)
    <Row(아, 11)>
    ```


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Jenkins ok to test.
    Gentle ping again to @ashashwat - are you still interested in this PR?


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by ashashwat <gi...@git.apache.org>.
Github user ashashwat commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    @holdenk I am on it.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    I think it still makes sense to produce a repr anyway, because we can currently create the instance successfully. Let me take a closer look within a few days.


---



[GitHub] spark pull request #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviou...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20503#discussion_r225844655
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -234,6 +234,10 @@ def test_empty_row(self):
             row = Row()
             self.assertEqual(len(row), 0)
     
    +    def test_row_without_column_name(self):
    +        row = Row("Alice", 11)
    --- End diff --
    
    Can we add a doctest for this usage (`Row` used as an object, not as a namedtuple-like class), and documentation in `Row` in `types.py`?
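
One way such a doctest could look is sketched below. `format_positional_row` is a hypothetical helper used only so that the example runs without Spark; the real doctest would live in the `Row` docstring in `types.py` and would exercise `Row` directly.

```python
import doctest


def format_positional_row(*values):
    """Format values the way a Row without column names is displayed.

    >>> format_positional_row("Alice", 11)
    '<Row(Alice, 11)>'
    >>> format_positional_row("Alice", "11")
    '<Row(Alice, 11)>'
    """
    return "<Row(%s)>" % ", ".join("%s" % (value,) for value in values)


# Parse the docstring examples and run them explicitly, so the check does not
# depend on how this snippet is loaded.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    format_positional_row.__doc__,
    {"format_positional_row": format_positional_row},
    "format_positional_row",
    None,
    0,
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

In a regular module the same docstring would normally be picked up by `python -m doctest` or `doctest.testmod()` without the explicit parser and runner.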


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by ashashwat <gi...@git.apache.org>.
Github user ashashwat commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    @HyukjinKwon Here is what I tried:
    
    ```
    # Code: return "<Row(%s)>" % ", ".join(fields.encode("utf8") for fields in self)
    >>> Row (u"아", "11")
    <Row(아, 11)>
    # Fails for integer fields.
    
    # Code: return "<Row(%s)>" % ", ".join(str(fields) for fields in self)
    >>> Row (u"아", "11")
    UnicodeEncodeError: 'ascii' codec can't encode character u'\uc544' in position 0: ordinal not in range(128)
    
    # Code: return "<Row(%s)>" % ", ".join(repr(fields) for fields in self)
    >>> Row (u"아", 11)
    <Row(u'\uc544', 11)>
    
    # Code: return "<Row(%s)>" % ", ".join(unicode(fields).encode("utf8") for fields in self)
    >>> Row (u"아", 11)
    <Row(아, 11)>
    ```
    
    repr is definitely a better option than str.  But why not unicode? 
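
The outputs above are from Python 2. As a rough sketch of how the same strategies behave on Python 3, where `str` is already Unicode, the snippet below compares them side by side. The `render` helper is hypothetical and exists only for this comparison.

```python
def render(values, convert):
    """Join already-converted fields in the positional Row layout."""
    return "<Row(%s)>" % ", ".join(convert(value) for value in values)


values = ("아", 11)

with_str = render(values, str)
with_repr = render(values, repr)
# The 1-tuple keeps a tuple-valued field from being read as several arguments.
with_format = render(values, lambda value: "%s" % (value,))

print(with_str)
print(with_repr)
print(with_format)
```

On Python 3 `str` and `%s` give the same text, while `repr` keeps the quotes, so only `repr` lets a reader tell the string `"11"` apart from the integer `11`.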


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    **[Test build #98110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98110/testReport)** for PR 20503 at commit [`890aa65`](https://github.com/apache/spark/commit/890aa6514196b3c672c4581120506632dd49b4a6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Gentle ping


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    BTW, do non-string field names work in this namedtuple way?


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    **[Test build #87143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87143/testReport)** for PR 20503 at commit [`890aa65`](https://github.com/apache/spark/commit/890aa6514196b3c672c4581120506632dd49b4a6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by ashashwat <gi...@git.apache.org>.
Github user ashashwat commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    @HyukjinKwon Do you mean something like `Row (a=1, b=2, c=3)` or `Row (1="Alice", 2=11)`?  The former works fine; the latter fails with `SyntaxError: keyword can't be an expression`.
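
The `SyntaxError` comes from Python's grammar rather than from `Row`: a keyword argument name has to be an identifier. A small sketch, using a hypothetical `make_fields` function in place of `Row`, shows both cases. The exact error message differs between Python versions, so it is printed rather than assumed.

```python
def make_fields(**kwargs):
    """Hypothetical stand-in for a callable that accepts keyword fields."""
    return sorted(kwargs.items())


# Identifier keywords are accepted.
named = make_fields(a=1, b=2, c=3)
print(named)

# A numeric literal cannot be a keyword name. compile() is used so that the
# SyntaxError is raised at run time and can be caught here.
syntax_error_raised = False
try:
    compile('make_fields(1="Alice", 2=11)', "<example>", "eval")
except SyntaxError as exc:
    syntax_error_raised = True
    print("SyntaxError:", exc.msg)
```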


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Jenkins, ok to test


---



[GitHub] spark pull request #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviou...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20503#discussion_r225846669
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -1581,7 +1581,7 @@ def __repr__(self):
                 return "Row(%s)" % ", ".join("%s=%r" % (k, v)
                                              for k, v in zip(self.__fields__, tuple(self)))
             else:
    -            return "<Row(%s)>" % ", ".join(self)
    +            return "<Row(%s)>" % ", ".join("%s" % (fields) for fields in self)
    --- End diff --
    
    nit fields => field


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Looks good otherwise.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    **[Test build #87143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87143/testReport)** for PR 20503 at commit [`890aa65`](https://github.com/apache/spark/commit/890aa6514196b3c672c4581120506632dd49b4a6).


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    **[Test build #91594 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91594/testReport)** for PR 20503 at commit [`890aa65`](https://github.com/apache/spark/commit/890aa6514196b3c672c4581120506632dd49b4a6).


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    I meant things like this:
    
    ```python
    >>> from pyspark.sql import Row
    >>> RowClass = Row(1)
    >>> RowClass("a")
    Row(1='a')
    ```
    
    ```python
    >>> spark.createDataFrame([RowClass("a")])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/.../spark/python/pyspark/sql/session.py", line 686, in createDataFrame
        rdd, schema = self._createFromLocal(map(prepare, data), schema)
      File "/.../spark/python/pyspark/sql/session.py", line 410, in _createFromLocal
        struct = self._inferSchemaFromList(data, names=schema)
      File "/.../spark/python/pyspark/sql/session.py", line 342, in _inferSchemaFromList
        schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
      File "/.../spark/python/pyspark/sql/session.py", line 342, in <genexpr>
        schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
      File "/.../spark/python/pyspark/sql/types.py", line 1099, in _infer_schema
        fields = [StructField(k, _infer_type(v), True) for k, v in items]
      File "/.../spark/python/pyspark/sql/types.py", line 407, in __init__
        assert isinstance(name, basestring), "field name should be string"
    AssertionError: field name should be string
    ```
    
    The reason I initially didn't suggest using `str` is that, IIRC, it breaks on `unicode` in Python 2. For example,
    
    ```
    str(u"아")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\uc544' in position 0: ordinal not in range(128)
    ```
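
For readers on Python 3, where `str(u"아")` simply succeeds, the Python 2 failure can be approximated by making the implicit ASCII encoding explicit. This is only an illustration of the mechanism, on the assumption that the Python 2 default encoding is ASCII, which is its standard setting.

```python
text = "아"

# Python 2's str(unicode_obj) encodes with the default codec (ASCII).
# Spelling that step out in Python 3 raises the same error class.
encode_failed = False
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    encode_failed = True
    print(type(exc).__name__, exc.reason)

# Encoding with UTF-8 instead succeeds.
encoded = text.encode("utf-8")
print(encoded)
```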


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Awesome, thanks. Let me know if I can help :)


---



[GitHub] spark pull request #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviou...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20503#discussion_r225846381
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -234,6 +234,10 @@ def test_empty_row(self):
             row = Row()
             self.assertEqual(len(row), 0)
     
    +    def test_row_without_column_name(self):
    +        row = Row("Alice", 11)
    +        self.assertEqual(row.__repr__(), "<Row(Alice, 11)>")
    --- End diff --
    
    I would test non-ascii compatible characters as well


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    `unicode(fields).encode("utf8")`: in this case, if the input is `str` (bytes), we will first try to decode it with the system default encoding and then encode it as UTF-8. So, for example, I think the `unicode("아")` case won't work.
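
The concern can be illustrated in Python 3 terms, where a Python 2 `str` corresponds to `bytes`. The sketch below assumes an ASCII default encoding, as in a stock Python 2, and spells out the implicit decode that `unicode(...)` would perform on a byte string.

```python
# UTF-8 bytes, standing in for a Python 2 byte string that holds "아".
raw = "아".encode("utf-8")

# unicode(raw) in Python 2 first decodes with the default codec (ASCII),
# which fails on any byte >= 0x80 before .encode("utf8") is ever reached.
decode_failed = False
try:
    raw.decode("ascii").encode("utf-8")
except UnicodeDecodeError as exc:
    decode_failed = True
    print(type(exc).__name__, exc.reason)

# Decoding with the codec the bytes were written in round-trips cleanly.
round_trip_ok = raw.decode("utf-8").encode("utf-8") == raw
print(round_trip_ok)
```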


---



[GitHub] spark pull request #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviou...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20503#discussion_r225846867
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -1581,7 +1581,7 @@ def __repr__(self):
                 return "Row(%s)" % ", ".join("%s=%r" % (k, v)
                                              for k, v in zip(self.__fields__, tuple(self)))
             else:
    -            return "<Row(%s)>" % ", ".join(self)
    +            return "<Row(%s)>" % ", ".join("%s" % (fields) for fields in self)
    --- End diff --
    
    `"%s" % (fields) for fields in self` -> `"%s" % field for field in self`


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Gentle ping again to @ashashwat . Also @HyukjinKwon what are your opinions on the test coverage?


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Checking whether it's `unicode` and converting, etc., might also work.
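
A sketch of what such a type check might look like, written for Python 3, where the byte/text split is `bytes`/`str` rather than `str`/`unicode`. The `to_text` helper and the choice of UTF-8 with `errors="replace"` are assumptions made for illustration and are not part of the patch.

```python
def to_text(field, encoding="utf-8"):
    """Return a text form of a field, decoding byte strings explicitly."""
    if isinstance(field, bytes):
        # Decode bytes ourselves instead of relying on an implicit default codec.
        return field.decode(encoding, errors="replace")
    return "%s" % (field,)


fields = ("아", "아".encode("utf-8"), 11)
rendered = "<Row(%s)>" % ", ".join(to_text(field) for field in fields)
print(rendered)
```

The trade-off is that the encoding becomes a guess: bytes written in another encoding would be rendered with replacement characters rather than raising.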


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    **[Test build #91594 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91594/testReport)** for PR 20503 at commit [`890aa65`](https://github.com/apache/spark/commit/890aa6514196b3c672c4581120506632dd49b4a6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98110/
    Test PASSed.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by ashashwat <gi...@git.apache.org>.
Github user ashashwat commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    @HyukjinKwon Should I add more tests covering Unicode?


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87143/
    Test PASSed.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    **[Test build #98110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98110/testReport)** for PR 20503 at commit [`890aa65`](https://github.com/apache/spark/commit/890aa6514196b3c672c4581120506632dd49b4a6).


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    ping @ashashwat to update


---



[GitHub] spark pull request #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviou...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20503#discussion_r225846584
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -1581,7 +1581,7 @@ def __repr__(self):
                 return "Row(%s)" % ", ".join("%s=%r" % (k, v)
                                              for k, v in zip(self.__fields__, tuple(self)))
             else:
    -            return "<Row(%s)>" % ", ".join(self)
    +            return "<Row(%s)>" % ", ".join("%s" % (fields) for fields in self)
    --- End diff --
    
    nit `"%s" % (fields) for fields in self` -> `"%s" % fields for fields in self`


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    I _think_ this could be good to backport into 2.4 if the current RC fails, provided @ashashwat has the chance to update it and no one sees any issues with including it in a backport to that branch.


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    we still need to fix this, right?



---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    ok to test


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    ok to test


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Sure, let's add a test with a unicode string if there's concern about that, and make sure the existing repr with named fields is covered in the same test case, since I don't see an existing explicit test for that (although it's probably covered implicitly elsewhere).
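
A test case along these lines might be sketched as follows. `positional_repr` and `named_repr` are hypothetical stand-ins so that the example runs without Spark; a real test in `python/pyspark/sql/tests.py` would construct `Row` objects instead. The sorted field order in `named_repr` is an assumption taken from the `Row(age=11, name='Alice')` example in the PR description.

```python
import unittest


def positional_repr(values):
    """Hypothetical stand-in for the patched no-column-name branch."""
    return "<Row(%s)>" % ", ".join("%s" % (value,) for value in values)


def named_repr(fields):
    """Hypothetical stand-in for the named branch, with fields sorted by name."""
    return "Row(%s)" % ", ".join(
        "%s=%r" % (key, value) for key, value in sorted(fields.items())
    )


class RowReprSketchTest(unittest.TestCase):
    def test_positional_ascii(self):
        self.assertEqual(positional_repr(("Alice", 11)), "<Row(Alice, 11)>")

    def test_positional_non_ascii(self):
        self.assertEqual(positional_repr(("아", 11)), "<Row(아, 11)>")

    def test_named_fields(self):
        self.assertEqual(
            named_repr({"name": "Alice", "age": 11}), "Row(age=11, name='Alice')"
        )


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RowReprSketchTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```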


---



[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20503
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91594/
    Test PASSed.


---
