Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/08/02 01:37:00 UTC

[jira] [Commented] (SPARK-39938) The prefix/suffix parameter validation in add_prefix/add_suffix should follow the pandas behavior

    [ https://issues.apache.org/jira/browse/SPARK-39938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573985#comment-17573985 ] 

Apache Spark commented on SPARK-39938:
--------------------------------------

User 'bzhaoopenstack' has created a pull request for this issue:
https://github.com/apache/spark/pull/37365

> The prefix/suffix parameter validation in add_prefix/add_suffix should follow the pandas behavior
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-39938
>                 URL: https://issues.apache.org/jira/browse/SPARK-39938
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.2.2
>         Environment: Pandas Version: 1.3.X/1.4.X
> PySpark: Master
>            Reporter: bo zhao
>            Priority: Major
>
> add_prefix/add_suffix should validate the prefix/suffix parameter the same way pandas does.
> Currently, pandas-on-Spark asserts that the value is a str. pandas, however, accepts any value that can be formatted as a string (that is, any object implementing __str__), so the two behave differently.
> PySpark:
> {code:python}
> >>> from pyspark import pandas as ps
> >>> df = ps.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]}, columns=['A', 'B'])
> >>> df.add_suffix(666)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/spark/spark/python/pyspark/pandas/frame.py", line 9060, in add_suffix
>     assert isinstance(suffix, str)
> AssertionError
> >>> df.add_suffix(True)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/spark/spark/python/pyspark/pandas/frame.py", line 9060, in add_suffix
>     assert isinstance(suffix, str)
> AssertionError
> {code}
> Pandas:
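> {code:python}
> >>> import pandas as pd
> >>> # pdf is assumed to hold the same data as the pandas-on-Spark frame above
> >>> pdf = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]}, columns=['A', 'B'])
> >>> pd.__version__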
> '1.3.5'
> >>> pdf.add_suffix(0.1)
>    A0.1  B0.1
> 0     1     3
> 1     2     4
> 2     3     5
> 3     4     6 
> >>> pdf.add_suffix(True)
>    ATrue  BTrue
> 0      1      3
> 1      2      4
> 2      3      5
> 3      4      6
>  {code}
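
For illustration only, here is a minimal sketch of the relaxed behavior the issue asks for: format the prefix/suffix into each column label instead of asserting that it is a str. The helper names add_suffix_to_labels and add_prefix_to_labels are hypothetical and operate on a plain list of labels; this is not necessarily how the linked pull request or pandas itself implements it.

```python
from functools import partial


def add_suffix_to_labels(labels, suffix):
    """Hypothetical helper: append `suffix` to every label.

    The suffix is passed through str.format, so any value that can be
    formatted as a string (int, float, bool, ...) is accepted.
    """
    fmt = partial("{}{suffix}".format, suffix=suffix)
    return [fmt(label) for label in labels]


def add_prefix_to_labels(labels, prefix):
    """Hypothetical helper: prepend `prefix` to every label."""
    fmt = partial("{prefix}{}".format, prefix=prefix)
    return [fmt(label) for label in labels]


print(add_suffix_to_labels(["A", "B"], 0.1))   # ['A0.1', 'B0.1']
print(add_suffix_to_labels(["A", "B"], True))  # ['ATrue', 'BTrue']
print(add_prefix_to_labels(["A", "B"], 666))   # ['666A', '666B']
```

With this approach the non-string values from the report (666, True, 0.1) yield labels matching the pandas output quoted in the issue, rather than raising AssertionError.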



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
