Posted to issues@spark.apache.org by "Ruifeng Zheng (Jira)" <ji...@apache.org> on 2023/01/22 09:55:00 UTC
[jira] [Resolved] (SPARK-41772) Enable pyspark.sql.connect.column.Column.withField doctest

     [ https://issues.apache.org/jira/browse/SPARK-41772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruifeng Zheng resolved SPARK-41772.
-----------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 39699
[https://github.com/apache/spark/pull/39699]

> Enable pyspark.sql.connect.column.Column.withField doctest
> ----------------------------------------------------------
>
>                 Key: SPARK-41772
>                 URL: https://issues.apache.org/jira/browse/SPARK-41772
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Hyukjin Kwon
>            Assignee: Ruifeng Zheng
>            Priority: Major
>             Fix For: 3.4.0
>
>
> The doctest currently fails under Spark Connect with the following errors:
> {code}
> File "/.../spark/python/pyspark/sql/connect/column.py", line 391, in pyspark.sql.connect.column.Column.withField
> Failed example:
>     df.withColumn('a', df['a'].withField('b', lit(3))).select('a.b').show()
> Exception raised:
>     Traceback (most recent call last):
>       File "/.../miniconda3/envs/python3.9/lib/python3.9/doctest.py", line 1336, in __run
>         exec(compile(example.source, filename, "single",
>       File "<doctest pyspark.sql.connect.column.Column.withField[3]>", line 1, in <module>
>         df.withColumn('a', df['a'].withField('b', lit(3))).select('a.b').show()
>       File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 538, in show
>         print(self._show_string(n, truncate, vertical))
>       File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 424, in _show_string
>         pdf = DataFrame.withPlan(
>       File "/.../python/pyspark/sql/connect/dataframe.py", line 910, in toPandas
>         return self._session.client.to_pandas(query)
>       File "/.../python/pyspark/sql/connect/client.py", line 413, in to_pandas
>         return self._execute_and_fetch(req)
>       File "/.../spark/python/pyspark/sql/connect/client.py", line 573, in _execute_and_fetch
>         self._handle_error(rpc_error)
>       File "/.../spark/python/pyspark/sql/connect/client.py", line 627, in _handle_error
>         raise SparkConnectException(str(rpc_error)) from None
>     pyspark.sql.connect.client.SparkConnectException: <_MultiThreadedRendezvous of RPC that terminated with:
>     	status = StatusCode.UNKNOWN
>     	details = "Expression with ID: 0 is not supported"
>     	debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:15002 {grpc_message:"Expression with ID: 0 is not supported", grpc_status:2, created_time:"2022-12-29T21:25:46.707558+09:00"}"
>     >
> **********************************************************************
> File "/Users/hyukjin.kwon/workspace/forked/spark/python/pyspark/sql/connect/column.py", line 397, in pyspark.sql.connect.column.Column.withField
> Failed example:
>     df.withColumn('a', df['a'].withField('d', lit(4))).select('a.d').show()
> Exception raised:
>     Traceback (most recent call last):
>       File "/.../miniconda3/envs/python3.9/lib/python3.9/doctest.py", line 1336, in __run
>         exec(compile(example.source, filename, "single",
>       File "<doctest pyspark.sql.connect.column.Column.withField[4]>", line 1, in <module>
>         df.withColumn('a', df['a'].withField('d', lit(4))).select('a.d').show()
>       File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 538, in show
>         print(self._show_string(n, truncate, vertical))
>       File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 424, in _show_string
>         pdf = DataFrame.withPlan(
>       File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 910, in toPandas
>         return self._session.client.to_pandas(query)
>       File "/.../spark/python/pyspark/sql/connect/client.py", line 413, in to_pandas
>         return self._execute_and_fetch(req)
>       File "/.../spark/python/pyspark/sql/connect/client.py", line 573, in _execute_and_fetch
>         self._handle_error(rpc_error)
>       File "/.../spark/python/pyspark/sql/connect/client.py", line 627, in _handle_error
>         raise SparkConnectException(str(rpc_error)) from None
>     pyspark.sql.connect.client.SparkConnectException: <_MultiThreadedRendezvous of RPC that terminated with:
>     	status = StatusCode.UNKNOWN
>     	details = "Expression with ID: 0 is not supported"
>     	debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:15002 {created_time:"2022-12-29T21:25:46.71644+09:00", grpc_status:2, grpc_message:"Expression with ID: 0 is not supported"}"
> {code}
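
For readers unfamiliar with the report format above, here is a minimal sketch of how Python's doctest module produces "Failed example" / "Exception raised" entries when an example raises, and how the same examples pass once the call is supported. It uses only the standard library. The names with_field, unsupported_with_field, supported_with_field and run_examples are hypothetical stand-ins and are not part of PySpark; the error text is copied from the report only to make the output recognisable. This is not the change made in pull request 39699.

```python
import doctest

# Hypothetical doctest text modelled on the two failing examples above.
# `with_field` is a toy stand-in operating on dicts, not the PySpark API.
EXAMPLES = """
>>> with_field({"b": 1, "c": 2}, "b", 3)["b"]
3
>>> with_field({"b": 1, "c": 2}, "d", 4)["d"]
4
"""


def unsupported_with_field(struct, name, value):
    # Mimics a backend that rejects the expression (assumed behaviour).
    raise RuntimeError("Expression with ID: 0 is not supported")


def supported_with_field(struct, name, value):
    # Returns a copy of the struct with the field replaced or added.
    return {**struct, name: value}


def run_examples(impl):
    """Run EXAMPLES against `impl` and return (failed, attempted, report)."""
    report = []
    test = doctest.DocTestParser().get_doctest(
        EXAMPLES, {"with_field": impl}, "with_field", "<sketch>", 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=report.append)
    return results.failed, results.attempted, "".join(report)


before = run_examples(unsupported_with_field)
after = run_examples(supported_with_field)
print(before[2])
print("failed/attempted before:", before[:2], "after:", after[:2])
```

Each example is executed independently, so both examples are reported as failures in the first run, which matches the two separate entries in the report above.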



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
