Posted to issues@spark.apache.org by "xge (Jira)" <ji...@apache.org> on 2020/03/10 10:31:00 UTC

[jira] [Created] (SPARK-31108) Parameter cannot be passed to pandas udf of type map_iter

xge created SPARK-31108:
---------------------------

             Summary: Parameter cannot be passed to pandas udf of type map_iter
                 Key: SPARK-31108
                 URL: https://issues.apache.org/jira/browse/SPARK-31108
             Project: Spark
          Issue Type: Question
          Components: Examples
    Affects Versions: 3.0.0
            Reporter: xge


So far, the only way I have found to pass a parameter is as a default argument value:

********************************************************************

from pyspark.sql.functions import pandas_udf, PandasUDFType

def map_iter_pandas_udf_example(spark):
    strr = "abcd"
    df = spark.createDataFrame([(1, 21),(2,30)],("id", "age")) 

    @pandas_udf(df.schema, PandasUDFType.MAP_ITER)
    def filter_func(batch_iter, x = strr):
        print(x)
        for pdf in batch_iter:
            yield pdf[pdf.id == 1]

    df.mapInPandas(filter_func).show()

*******************************************************************

 

However, if the code is edited as follows, an error occurs:

*******************************************************************

from pyspark.sql.functions import pandas_udf, PandasUDFType

def map_iter_pandas_udf_example(spark):
    strr = "abcd"
    df = spark.createDataFrame([(1, 21),(2,30)],("id", "age")) 

    @pandas_udf(df.schema, PandasUDFType.MAP_ITER)
    def filter_func(batch_iter, x = strr):
        print(x)
        for pdf in batch_iter:
            yield pdf[pdf.id == 1]

    data = "dbca"

    df.mapInPandas(filter_func(data)).show()

*******************************************************************

ValueError: Invalid udf: the udf argument must be a pandas_udf of type MAP_ITER.

Does anyone know whether a pandas udf of type map_iter can take parameters and, if so, how to write the code? Thanks.
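
One possible approach, offered as a sketch rather than a confirmed answer: the second snippet most likely fails because filter_func(data) calls the UDF itself, which yields a column expression instead of a UDF, so mapInPandas rejects it. Binding the parameter in a closure avoids calling the UDF at all. The names make_filter_func, map_iter_example and wanted_id are hypothetical, and the Spark call assumes the mapInPandas(func, schema) signature of the Spark 3.0 release, which may differ from the build used in this report.

```python
def make_filter_func(wanted_id):
    """Return a batch-iterator function with wanted_id bound in a closure.

    make_filter_func and wanted_id are illustrative names, not Spark API.
    """
    def filter_func(batch_iter):
        # batch_iter yields one pandas DataFrame per Arrow batch.
        for pdf in batch_iter:
            # Keep only the rows whose id equals the captured parameter.
            yield pdf[pdf.id == wanted_id]
    return filter_func


def map_iter_example(spark, wanted_id=1):
    """Run the filter on a small DataFrame (needs an active SparkSession)."""
    df = spark.createDataFrame([(1, 21), (2, 30)], ("id", "age"))
    # Assumption: Spark 3.0 release API, where mapInPandas takes a plain
    # Python function plus the output schema. On a build that still expects
    # a pandas_udf of type MAP_ITER, the function returned by
    # make_filter_func would presumably be wrapped with that decorator.
    df.mapInPandas(make_filter_func(wanted_id), df.schema).show()
```

Calling map_iter_example(spark, wanted_id=2) would then filter on id 2 without changing the function body. functools.partial on a two-argument function should work the same way, since both approaches hand Spark a callable that takes only the batch iterator.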

 



