Posted to issues@spark.apache.org by "Artem Rybin (JIRA)" <ji...@apache.org> on 2019/03/19 07:54:00 UTC
[jira] [Commented] (SPARK-27052) Using PySpark udf in transform yields NULL values
[ https://issues.apache.org/jira/browse/SPARK-27052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795797#comment-16795797 ]
Artem Rybin commented on SPARK-27052:
-------------------------------------
Hi [~hejsgpuom62c]!
I reproduced this issue and would like to investigate it.
Please assign it to me.
> Using PySpark udf in transform yields NULL values
> -------------------------------------------------
>
> Key: SPARK-27052
> URL: https://issues.apache.org/jira/browse/SPARK-27052
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 2.4.0
> Reporter: hejsgpuom62c
> Priority: Major
>
> Steps to reproduce
> {code:python}
> from typing import Optional
>
> from pyspark.sql.functions import expr
>
> def f(x: Optional[int]) -> Optional[int]:
>     return x + 1 if x is not None else None
>
> # Assumes `spark` is an active SparkSession (e.g. in the pyspark shell).
> spark.udf.register('f', f, "integer")
>
> df = (spark
>     .createDataFrame([(1, [1, 2, 3])], ("id", "xs"))
>     .withColumn("xsinc", expr("transform(xs, x -> f(x))")))
> df.show()
> # +---+---------+-----+
> # | id|       xs|xsinc|
> # +---+---------+-----+
> # |  1|[1, 2, 3]| [,,]|
> # +---+---------+-----+
> {code}
>
> Source: https://stackoverflow.com/a/53762650