Posted to issues@spark.apache.org by "xge (Jira)" <ji...@apache.org> on 2020/03/10 10:31:00 UTC
[jira] [Created] (SPARK-31108) Parameter cannot be passed to pandas udf of type map_iter
xge created SPARK-31108:
---------------------------
Summary: Parameter cannot be passed to pandas udf of type map_iter
Key: SPARK-31108
URL: https://issues.apache.org/jira/browse/SPARK-31108
Project: Spark
Issue Type: Question
Components: Examples
Affects Versions: 3.0.0
Reporter: xge
A parameter can only be passed as a default argument, in the following way:
********************************************************************
from pyspark.sql.functions import pandas_udf, PandasUDFType

def map_iter_pandas_udf_example(spark):
    strr = "abcd"
    df = spark.createDataFrame([(1, 21), (2, 30)], ("id", "age"))

    @pandas_udf(df.schema, PandasUDFType.MAP_ITER)
    def filter_func(batch_iter, x=strr):
        # x is bound through its default value when the function is defined
        print(x)
        for pdf in batch_iter:
            yield pdf[pdf.id == 1]

    df.mapInPandas(filter_func).show()
*******************************************************************
However, if the code is edited as follows, an error occurs:
*******************************************************************
from pyspark.sql.functions import pandas_udf, PandasUDFType

def map_iter_pandas_udf_example(spark):
    strr = "abcd"
    df = spark.createDataFrame([(1, 21), (2, 30)], ("id", "age"))

    @pandas_udf(df.schema, PandasUDFType.MAP_ITER)
    def filter_func(batch_iter, x=strr):
        print(x)
        for pdf in batch_iter:
            yield pdf[pdf.id == 1]

    data = "dbca"
    # Passing the argument at the call site is what triggers the error
    df.mapInPandas(filter_func(data)).show()
*******************************************************************
ValueError: Invalid udf: the udf argument must be a pandas_udf of type MAP_ITER.
Does anyone know whether a pandas udf of type map_iter can take parameters and, if so, how to write the code? Thanks.
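One possible workaround, offered as a sketch rather than a confirmed answer from the Spark project: bind the parameter with a closure, so that the function handed to mapInPandas still takes only the batch iterator. The second snippet most likely fails because calling the decorated function returns a Column expression instead of the UDF itself. The factory name make_filter_func and its target_id parameter are hypothetical. The sample below exercises the function with plain pandas, and the Spark call is shown only as a comment that assumes the mapInPandas(func, schema) signature of the Spark 3.0 release.

```python
import pandas as pd


def make_filter_func(target_id):
    """Return a batch-iterator function that keeps rows whose id equals target_id."""
    def filter_func(batch_iter):
        # target_id is captured from the enclosing scope (a closure),
        # so the returned function still takes only the batch iterator.
        for pdf in batch_iter:
            yield pdf[pdf.id == target_id]
    return filter_func


# Local check with plain pandas; no Spark session is needed for this part.
batches = iter([pd.DataFrame({"id": [1, 2], "age": [21, 30]})])
result = pd.concat(make_filter_func(1)(batches))

# Assumed usage with a Spark DataFrame `df` (Spark 3.0 release API):
#     df.mapInPandas(make_filter_func(1), df.schema).show()
```

The parameter is fixed when make_filter_func is called, so a different value needs a new call to the factory rather than an extra argument at the mapInPandas call site.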