Posted to reviews@spark.apache.org by uncleGen <gi...@git.apache.org> on 2017/03/12 13:00:06 UTC

[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

GitHub user uncleGen opened a pull request:

    https://github.com/apache/spark/pull/17267

    [SPARK-19926][PYSPARK] Make pyspark exception more readable

    ## What changes were proposed in this pull request?
    
    Exceptions in PySpark are a little difficult to read.
    
    Before this PR:
    
    ```
    Traceback (most recent call last):
      File "<stdin>", line 5, in <module>
      File "/root/dev/spark/dist/python/pyspark/sql/streaming.py", line 853, in start
        return self._sq(self._jwrite.start())
      File "/root/dev/spark/dist/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
      File "/root/dev/spark/dist/python/pyspark/sql/utils.py", line 69, in deco
        raise AnalysisException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.AnalysisException: u'Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets without watermark;;\nAggregate [window#17, word#5], [window#17 AS window#11, word#5, count(1) AS count#16L]\n+- Filter ((t#6 >= window#17.start) && (t#6 < window#17.end))\n   +- Expand [ArrayBuffer(named_struct(start, ((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(0 as bigint)) - cast(1 as bigint)) * 30000000) + 0), end, (((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(0 as bigint)) - cast(1 as bigint)) * 30000000) + 0) + 30000000)), word#5, t#6-T30000ms), ArrayBuffer(named_struct(start, ((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(1 as bigint)) - cast(1 as bigint)) * 30000000) + 0), end, (((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(1 as bigint)) - cast(1 as bigint)) * 30000000) + 0) + 30000000)), word#5, t#6-T30000ms)], [window#17, word#5, t#6-T30000ms]\n      +- EventTimeWatermark t#6: timestamp, interval 30 seconds\n         +- Project [cast(word#0 as string) AS word#5, cast(t#1 as timestamp) AS t#6]\n            +- StreamingRelation DataSource(org.apache.spark.sql.SparkSession@c4079ca,csv,List(),Some(StructType(StructField(word,StringType,true), StructField(t,IntegerType,true))),List(),None,Map(sep -> ;, path -> /tmp/data),None), FileSource[/tmp/data], [word#0, t#1]\n'
    ```
    
    After this PR:
    
    ```
    Traceback (most recent call last):
      File "<stdin>", line 5, in <module>
      File "/root/dev/spark/dist/python/pyspark/sql/streaming.py", line 853, in start
        return self._sq(self._jwrite.start())
      File "/root/dev/spark/dist/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
      File "/root/dev/spark/dist/python/pyspark/sql/utils.py", line 69, in deco
        raise AnalysisException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.AnalysisException: Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets without watermark;;
    Aggregate [window#17, word#5], [window#17 AS window#11, word#5, count(1) AS count#16L]
    +- Filter ((t#6 >= window#17.start) && (t#6 < window#17.end))
       +- Expand [ArrayBuffer(named_struct(start, ((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(0 as bigint)) - cast(1 as bigint)) * 30000000) + 0), end, (((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(0 as bigint)) - cast(1 as bigint)) * 30000000) + 0) + 30000000)), word#5, t#6-T30000ms), ArrayBuffer(named_struct(start, ((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(1 as bigint)) - cast(1 as bigint)) * 30000000) + 0), end, (((((CEIL((cast((precisetimestamp(t#6) - 0) as double) / cast(30000000 as double))) + cast(1 as bigint)) - cast(1 as bigint)) * 30000000) + 0) + 30000000)), word#5, t#6-T30000ms)], [window#17, word#5, t#6-T30000ms]
          +- EventTimeWatermark t#6: timestamp, interval 30 seconds
             +- Project [cast(word#0 as string) AS word#5, cast(t#1 as timestamp) AS t#6]
                +- StreamingRelation DataSource(org.apache.spark.sql.SparkSession@5265083b,csv,List(),Some(StructType(StructField(word,StringType,true), StructField(t,IntegerType,true))),List(),None,Map(sep -> ;, path -> /tmp/data),None), FileSource[/tmp/data], [word#0, t#1]
    ```
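The difference between the two tracebacks above comes down to `repr` versus `str` applied to the message. A minimal sketch of that effect, using a shortened stand-in message (an assumption for illustration, not the real query plan):

```python
# Shortened stand-in for an analysis error message with an embedded plan.
desc = "Append output mode not supported;;\nAggregate [window, word]\n+- Filter (t >= start)\n"

as_repr = repr(desc)  # what __str__ returned before the change
as_str = str(desc)    # what __str__ returns after the change

print(as_repr)  # a single line with literal backslash-n sequences
print(as_str)   # the plan laid out over several lines
```

With `repr`, newlines are escaped and the whole plan collapses into one line; with `str`, each plan node lands on its own line.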
    
    ## How was this patch tested?
    
    Jenkins


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/spark SPARK-19926

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17267.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17267
    
----
commit 273c1bc8d719158dd074cb806d5db487b9709edb
Author: uncleGen <hu...@gmail.com>
Date:   2017-03-12T12:57:31Z

    Make pyspark exception more readable

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    @viirya Thanks for your review.
    cc @srowen 




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r123131587
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +28,11 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        desc = self.desc
    +        if isinstance(desc, unicode):
    +            return str(desc.encode('utf-8'))
    --- End diff --
    
    cc @zero323 and @davies too. Would you have some time to take a look at this one? This is a typical, annoying problem between unicode and byte strings. There are many similar PRs (I can identify at least a few trying to handle this problem), and one good example might help resolve the others too.




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105838261
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -16,6 +16,10 @@
     #
     
     import py4j
    +import sys
    +
    +if sys.version > '3':
    --- End diff --
    
    I think it should be `>=`.
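As a side note on the comparison discussed above: comparing `sys.version` as a string is lexicographic, which can misorder multi-digit versions regardless of whether `>` or `>=` is used. A sketch of the tuple-based alternative follows; `is_python3` is a hypothetical helper for illustration and is not part of this PR.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True for Python 3 or later (hypothetical helper, not from the PR)."""
    return tuple(version_info[:2]) >= (3, 0)


# Lexicographic string comparison misorders multi-digit versions.
print("10.0" > "3")         # False, although 10 > 3 numerically
print(is_python3((2, 7)))   # False
print(is_python3((3, 6)))   # True
print(is_python3((10, 0)))  # True
```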




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74403/testReport)** for PR 17267 at commit [`273c1bc`](https://github.com/apache/spark/commit/273c1bc8d719158dd074cb806d5db487b9709edb).




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105664922
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    Yeah, I support this change and tested some more cases with that `encode`.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    LGTM too, but I hope a test can be added if possible.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74503/testReport)** for PR 17267 at commit [`edf9b12`](https://github.com/apache/spark/commit/edf9b12d29a4809beed030e6844b0aa1fede3ef5).




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74403/testReport)** for PR 17267 at commit [`273c1bc`](https://github.com/apache/spark/commit/273c1bc8d719158dd074cb806d5db487b9709edb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74490/
    Test FAILed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Thanks @HyukjinKwon, good catch! I missed that case. Thanks @viirya for your suggestion.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74490/testReport)** for PR 17267 at commit [`6c55e02`](https://github.com/apache/spark/commit/6c55e022660e56feff882c8feaa7710f0b0aee69).




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105653661
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    Hm.. does this work for `unicode` in Python 2.7, for example, `spark.range(1).select("\uc544")`? To my knowledge, converting it to ASCII directly throws an exception.
    
    ```python
    >>> str(u"\uc544")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\uc544' in position 0: ordinal not in range(128)
    >>> repr(u"\uc544")
    "u'\\uc544'"
    ```
    
    Maybe we should check whether this is `unicode` and call `.encode`.
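The Python 2 failure shown above comes from `str()` implicitly encoding the unicode value with the ASCII codec. A rough Python 3 analogue, with the encoding made explicit, sketches why an explicit UTF-8 encode avoids the error:

```python
text = "\uc544"

try:
    text.encode("ascii")  # roughly what Python 2's str(u"...") does implicitly
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

utf8_bytes = text.encode("utf-8")

print(ascii_ok)    # False
print(utf8_bytes)  # b'\xec\x95\x84'
```

Note that the result of `.encode` is a byte string, which is the native `str` type only on Python 2.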




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    IMHO, yes. And @viirya is the original author.




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r121825610
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +28,11 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        desc = self.desc
    +        if isinstance(desc, unicode):
    +            return str(desc.encode('utf-8'))
    --- End diff --
    
    Good catch! I had assumed `str` works the same as in Python 2.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Correct me if I'm wrong, but I got the following message after this patch in Python 3.6:
    
    ```python
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/ueshin/workspace/pyspark/spark/python/pyspark/sql/dataframe.py", line 1049, in select
        jdf = self._jdf.select(self._jcols(*cols))
      File "/Users/ueshin/workspace/pyspark/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
      File "/Users/ueshin/workspace/pyspark/spark/python/pyspark/sql/utils.py", line 77, in deco
        raise AnalysisException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.AnalysisException: b"cannot resolve '`\xec\x95\x84`' given input columns: [id];;\n'Project ['\xec\x95\x84]\n+- Range (0, 1, step=1, splits=Some(8))\n"
    ```
    
    I guess this message is not desirable?





[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    ping @viirya




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    +1 We should add a test for this.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    LGTM cc @davies @holdenk 




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    I'm not reviewing this patch. People who know this area better should merge it.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74490 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74490/testReport)** for PR 17267 at commit [`6c55e02`](https://github.com/apache/spark/commit/6c55e022660e56feff882c8feaa7710f0b0aee69).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74491/
    Test PASSed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    @dataknocker do you want to take over this one? Then we can continue with #18324.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74487/testReport)** for PR 17267 at commit [`5bc1d8e`](https://github.com/apache/spark/commit/5bc1d8e75b3690b911cf88bcf2fba561bc63e354).




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r121815790
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +28,11 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        desc = self.desc
    +        if isinstance(desc, unicode):
    +            return str(desc.encode('utf-8'))
    --- End diff --
    
    @ueshin, you are right and I misread the codes. We need to
    
    - unicode in Python 2 => `u.encode("utf-8")`.
    - others in Python 2 => return `str(s)`.
    - others in Python 3 => return `str(s)`.
    
    The root cause of https://github.com/apache/spark/pull/17267#issuecomment-308231375 appears to be that `encode` on a string in Python 3 (the equivalent of unicode in Python 2) produces 8-bit bytes, `b"..."`. In Python 2, `"..."` and `b"..."` are the same normal string type and the `b` prefix is ignored. The `str` function also behaves differently, as shown below:
    
    Python 2
    
    ```python
    >>> str(b"aa")
    'aa'
    >>> b"aa"
    'aa'
    ```
    
    Python 3
    
    ```python
    >>> str(b"aa")
    "b'aa'"
    >>> "aa"
    'aa'
    ```
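
The three cases listed above could be sketched as follows. This is only a minimal illustration, not the code that was merged: `CapturedExceptionSketch` is a hypothetical stand-in for the exception class in `python/pyspark/sql/utils.py`, and the version check via `sys.version_info` is an assumption about how the Python 2 branch might be guarded.

```python
import sys


class CapturedExceptionSketch(Exception):
    """Hypothetical stand-in for the exception class discussed in the diff."""

    def __init__(self, desc, stackTrace):
        self.desc = desc
        self.stackTrace = stackTrace

    def __str__(self):
        desc = self.desc
        # Python 2 only: __str__ must return a byte str there, so a unicode
        # description is encoded explicitly instead of relying on ASCII.
        # The name `unicode` is never evaluated on Python 3 (short-circuit).
        if sys.version_info[0] < 3 and isinstance(desc, unicode):  # noqa: F821
            return desc.encode("utf-8")
        # All other cases, including every description on Python 3.
        return str(desc)


message = str(CapturedExceptionSketch(u"cannot resolve '`\uc544`'", None))
```

On Python 3 the result is the plain text, with no `b'...'` wrapper, because `encode` is never called there.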





[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105657204
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    We can add a check for Python 2: if it is unicode, just encode it with utf-8.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    +1 for adding a test.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74487/
    Test FAILed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Thanks for working on this. LGTM.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    cc @ueshin 




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    @srowen Could you please take a look and help merge this?




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105659050
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    Ah, thank you for the confirmation. I thought I was mistaken :).




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74487/testReport)** for PR 17267 at commit [`5bc1d8e`](https://github.com/apache/spark/commit/5bc1d8e75b3690b911cf88bcf2fba561bc63e354).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74503 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74503/testReport)** for PR 17267 at commit [`edf9b12`](https://github.com/apache/spark/commit/edf9b12d29a4809beed030e6844b0aa1fede3ef5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74403/
    Test PASSed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Briefly, what's the difference between the two? I don't know enough to evaluate it, though the effect looks positive. Is this the only place this should change?




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105657313
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    @HyukjinKwon Good catch!




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105654542
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    @uncleGen, could you double-check whether I did something wrong?




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Hey @uncleGen, any time to add a test for this?




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74503/
    Test PASSed.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74491/testReport)** for PR 17267 at commit [`7b96e97`](https://github.com/apache/spark/commit/7b96e97b60b67cab49f3108ad84759ccb0f643e0).




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Maybe @viirya can give some suggestions.




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105654236
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    To help, I just tested this change as shown below:
    
    - before
    
    ```python
    >>> try:
    ...     spark.range(1).select("\uc544")
    ... except Exception as e:
    ...     print e
    ...
    
    u"cannot resolve '`\uc544`' given input columns: [id];;\n'Project ['\uc544]\n+- Range (0, 1, step=1, splits=Some(8))\n"
    >>>
    >>> spark.range(1).select("\uc544")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File ".../spark/python/pyspark/sql/dataframe.py", line 992, in select
        jdf = self._jdf.select(self._jcols(*cols))
      File ".../spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
      File ".../spark/python/pyspark/sql/utils.py", line 69, in deco
        raise AnalysisException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.AnalysisException: u"cannot resolve '`\uc544`' given input columns: [id];;\n'Project ['\uc544]\n+- Range (0, 1, step=1, splits=Some(8))\n"
    ```
    
    - after
    
    ```python
    >>> try:
    ...     spark.range(1).select("\uc544")
    ... except Exception as e:
    ...     print e
    ...
    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
      File ".../spark/python/pyspark/sql/utils.py", line 27, in __str__
        return str(self.desc)
    UnicodeEncodeError: 'ascii' codec can't encode character u'\uc544' in position 17: ordinal not in range(128)
    
    >>> spark.range(1).select("\uc544")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File ".../spark/python/pyspark/sql/dataframe.py", line 992, in select
        jdf = self._jdf.select(self._jcols(*cols))
      File ".../spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
      File ".../spark/python/pyspark/sql/utils.py", line 69, in deco
        raise AnalysisException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.AnalysisException
    >>>
    ```
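
The `UnicodeEncodeError` in the "after" run comes from Python 2's `str()` implicitly encoding a unicode object with the default ASCII codec. A rough sketch of that step, written so that it also runs on Python 3 by making the encoding explicit (the helper name `encode_like_py2_str` is hypothetical and the message is shortened):

```python
def encode_like_py2_str(text):
    """Mimic Python 2's str(unicode_obj): encode with the ASCII codec."""
    return text.encode("ascii")


desc = u"cannot resolve '`\uc544`' given input columns: [id]"

try:
    encode_like_py2_str(desc)
    failed = False
except UnicodeEncodeError as e:
    failed = True
    # Index of the first character that ASCII cannot represent.
    position = e.start

# Encoding the same text explicitly with UTF-8 succeeds.
utf8_bytes = desc.encode("utf-8")
```

The failing index matches the `position 17` reported in the traceback above, which is where `\uc544` sits in the message.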




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    **[Test build #74491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74491/testReport)** for PR 17267 at commit [`7b96e97`](https://github.com/apache/spark/commit/7b96e97b60b67cab49f3108ad84759ccb0f643e0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    I'll take a look at reviewing this later this week, @uncleGen. Two minor things we can do in the meantime: one is to make the JIRA description a bit clearer about what the proposed change is; the other is that this change isn't really tested by Jenkins, since there are no tests that look at the formatting of the error strings, so maybe consider adding a test or updating the description on the PR.
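
A test of the kind suggested here might look roughly like the sketch below. It is an assumption about the shape of such a test, not one taken from the PR: `FakeAnalysisException` is a hypothetical stand-in so the example runs without Spark, and a real test would raise the exception through a query instead.

```python
import unittest


class FakeAnalysisException(Exception):
    """Hypothetical stand-in; not the real pyspark class."""

    def __init__(self, desc, stackTrace):
        self.desc = desc
        self.stackTrace = stackTrace

    def __str__(self):
        return str(self.desc)


class ExceptionMessageFormatTest(unittest.TestCase):
    def test_str_is_not_a_repr(self):
        desc = "cannot resolve 'x';;\n'Project ['x]\n"
        text = str(FakeAnalysisException(desc, None))
        # Real newlines survive, so the plan prints across several lines.
        self.assertEqual(text.count("\n"), 2)
        # No repr artifacts: no escaped newline and no u"..." wrapper.
        self.assertNotIn("\\n", text)
        self.assertFalse(text.startswith('u"'))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExceptionMessageFormatTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```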




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17267




[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17267
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105663697
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    Maybe another benefit of this change: before it, the error log in your example looks like:
    
    u"cannot resolve '`\uc544`' given input columns: [id];;\n'Project ['\uc544]
    
    `repr` shows the unicode escape sequence `\uc544`. Even if you encode it, you will see its binary representation. `str` can show the correct "\uc544" if it is encoded with utf-8.
    
    If I tested it correctly.
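
The difference between the two calls can be illustrated with a small sketch. It uses a shortened, ASCII-only message and runs on Python 3, where `ascii()` (a Python 3 builtin) is used here as an approximation of what `repr` shows for a unicode object on Python 2.

```python
desc = "cannot resolve 'x' given input columns: [id];;\n'Project ['x]\n"

as_repr = repr(desc)  # what __str__ returned before the change
as_str = str(desc)    # what __str__ returns after the change

# repr wraps the text in quotes and escapes the newlines, so the whole
# message ends up on a single line.
# str leaves the text untouched, so the plan prints on separate lines.
plan_lines = as_str.splitlines()

# Non-ASCII characters are shown as escape sequences by ascii().
escaped = ascii(u"\uc544")
```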




[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

Posted by uncleGen <gi...@git.apache.org>.
Github user uncleGen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17267#discussion_r105827541
  
    --- Diff: python/pyspark/sql/utils.py ---
    @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace):
             self.stackTrace = stackTrace
     
         def __str__(self):
    -        return repr(self.desc)
    +        return str(self.desc)
    --- End diff --
    
    Based on the latest commit:
    
    ```
    >>> df.select("\uc544")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File ".../spark/python/pyspark/sql/dataframe.py", line 992, in select
        jdf = self._jdf.select(self._jcols(*cols))
      File ".../spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
      File ".../spark/python/pyspark/sql/utils.py", line 75, in deco
        raise AnalysisException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.AnalysisException
    : cannot resolve '`\uc544`' given input columns: [age, name];;
    'Project ['\uc544]
    +- Relation[age#0L,name#1] json
    ```

