Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2017/07/12 21:32:01 UTC
[jira] [Created] (SPARK-21394) Reviving broken callable objects in UDF in PySpark
Hyukjin Kwon created SPARK-21394:
------------------------------------
Summary: Reviving broken callable objects in UDF in PySpark
Key: SPARK-21394
URL: https://issues.apache.org/jira/browse/SPARK-21394
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 2.2.0, 2.3.0
Reporter: Hyukjin Kwon
SPARK-19161 accidentally broke the use of callable objects as UDFs in PySpark, as shown below:
{code}
>>> from pyspark.sql import functions
>>> class F(object):
...     def __call__(self, x):
...         return x
...
>>> foo = F()
>>> foo(1)
1
>>> udf = functions.udf(foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/sql/functions.py", line 2142, in udf
    return _udf(f=f, returnType=returnType)
  File ".../spark/python/pyspark/sql/functions.py", line 2133, in _udf
    return udf_obj._wrapped()
  File ".../spark/python/pyspark/sql/functions.py", line 2090, in _wrapped
    @functools.wraps(self.func)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 33, in update_wrapper
    setattr(wrapper, attr, getattr(wrapped, attr))
AttributeError: F instance has no attribute '__name__'
{code}
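The traceback points at functools.wraps: on Python 2.7, update_wrapper reads __name__ from the wrapped object without a default, and an instance of a class that only defines __call__ has no such attribute. The following is a minimal, Spark-free sketch of one way the wrapping could be made tolerant of such objects. The helpers callable_name and tolerant_wraps are hypothetical names used only for illustration; they are not part of PySpark and are not necessarily how the issue will be resolved.

```python
import functools


def callable_name(f):
    """Return a display name for any callable.

    Plain functions carry __name__; instances of classes that define
    __call__ do not, so fall back to the class name.
    """
    return getattr(f, "__name__", None) or type(f).__name__


def tolerant_wraps(wrapped):
    """Like functools.wraps, but copy only the attributes `wrapped` has.

    Hypothetical helper for illustration; not a PySpark API.
    """
    assigned = tuple(
        attr for attr in functools.WRAPPER_ASSIGNMENTS if hasattr(wrapped, attr)
    )

    def decorator(wrapper):
        wrapper = functools.wraps(wrapped, assigned=assigned)(wrapper)
        # Always give the wrapper a usable name, even for callable objects.
        wrapper.__name__ = callable_name(wrapped)
        return wrapper

    return decorator


class F(object):
    def __call__(self, x):
        return x


foo = F()


@tolerant_wraps(foo)
def wrapper(*args):
    return foo(*args)


print(wrapper.__name__)
print(wrapper(1))
```

Falling back to the class name mirrors the column name F(id) that Spark 2.1 displays for the same callable object.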
Note that the same code works in Spark 2.1:
{code}
>>> from pyspark.sql import functions
>>> class F(object):
...     def __call__(self, x):
...         return x
...
>>> foo = F()
>>> foo(1)
1
>>> udf = functions.udf(foo)
>>> spark.range(1).select(udf("id")).show()
+-----+
|F(id)|
+-----+
|    0|
+-----+
{code}