Posted to issues@spark.apache.org by "Artem Rybin (JIRA)" <ji...@apache.org> on 2019/03/19 07:54:00 UTC

[jira] [Commented] (SPARK-27052) Using PySpark udf in transform yields NULL values

    [ https://issues.apache.org/jira/browse/SPARK-27052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795797#comment-16795797 ] 

Artem Rybin commented on SPARK-27052:
-------------------------------------

Hi [~hejsgpuom62c]!

I have reproduced this issue and would like to investigate it.

Please assign it to me.

> Using PySpark udf in transform yields NULL values
> -------------------------------------------------
>
>                 Key: SPARK-27052
>                 URL: https://issues.apache.org/jira/browse/SPARK-27052
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.4.0
>            Reporter: hejsgpuom62c
>            Priority: Major
>
> Steps to reproduce
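> {code:python}
> from typing import Optional
> from pyspark.sql import SparkSession
> from pyspark.sql.functions import expr
> # In the pyspark shell `spark` already exists; getOrCreate() reuses it.
> spark = SparkSession.builder.getOrCreate()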
> def f(x: Optional[int]) -> Optional[int]:
>     return x + 1 if x is not None else None
> spark.udf.register('f', f, "integer")
> df = (spark
>     .createDataFrame([(1, [1, 2, 3])], ("id", "xs"))
>     .withColumn("xsinc", expr("transform(xs, x -> f(x))")))
> df.show()
> # Actual output: every element of xsinc is NULL ([2, 3, 4] was expected).
> # +---+---------+-----+
> # | id|       xs|xsinc|
> # +---+---------+-----+
> # |  1|[1, 2, 3]| [,,]|
> # +---+---------+-----+
> {code}
>
> Source: https://stackoverflow.com/a/53762650


