Posted to issues@spark.apache.org by "Max Härtwig (Jira)" <ji...@apache.org> on 2020/01/09 16:53:00 UTC

[jira] [Created] (SPARK-30473) PySpark enum subclass crashes when used inside UDF

Max Härtwig created SPARK-30473:
-----------------------------------

             Summary: PySpark enum subclass crashes when used inside UDF
                 Key: SPARK-30473
                 URL: https://issues.apache.org/jira/browse/SPARK-30473
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.4.4
         Environment: Databricks Runtime 6.2 (includes Apache Spark 2.4.4, Scala 2.11)
            Reporter: Max Härtwig


PySpark enum subclass crashes when used inside a UDF.

 

Example:

 
{code:python}
from enum import Enum
class Direction(Enum):
    NORTH = 0
    SOUTH = 1
{code}
 

Working:

 
{code:python}
Direction.NORTH
{code}
 

 

Crashing:

 
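{code:python}
from pyspark.sql.functions import udf

@udf
def fn(a):
    Direction.NORTH  # merely referencing the enum member triggers the crash
    return ""

# df is assumed to be an existing DataFrame with a column "a"
df.withColumn("test", fn("a"))
{code}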
 

 

Stacktrace:

 
{noformat}
SparkException: Job aborted due to stage failure: Task 0 in stage 9.0 failed 4 times, most recent failure: Lost task 0.3 in stage 9.0 (TID 235, 10.139.64.21, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/databricks/spark/python/pyspark/serializers.py", line 182, in _read_with_length
    return self.loads(obj)
  File "/databricks/spark/python/pyspark/serializers.py", line 695, in loads
    return pickle.loads(obj, encoding=encoding)
  File "/databricks/python/lib/python3.7/enum.py", line 152, in __new__
    enum_members = {k: classdict[k] for k in classdict._member_names}
AttributeError: 'dict' object has no attribute '_member_names'
{noformat}
 

 

I suspect the problem is in `python/pyspark/cloudpickle.py`. On line 586, in the function `_save_dynamic_enum`, the attribute `_member_names` is removed from the enum. However, the `Enum` machinery requires this attribute (the traceback above fails in `enum.py` while reading `classdict._member_names`), so Enum subclasses crash when they are unpickled inside the UDF.
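
The failing line from the traceback can be reproduced without Spark. The sketch below is only an illustration and not the cloudpickle code: `build_enum` is a hypothetical helper that rebuilds an enum through its metaclass, once with the namespace returned by the metaclass's `__prepare__` and once with a plain `dict`. It assumes that CPython's `enum` module reads `_member_names` from the class namespace, as Python 3.7 does in the traceback; other versions may differ in detail.

```python
from enum import Enum


def build_enum(name, members, use_prepare):
    """Hypothetical helper: rebuild an Enum subclass through its metaclass."""
    metacls = type(Enum)  # the Enum metaclass (EnumMeta)
    bases = (Enum,)
    if use_prepare:
        # __prepare__ returns the special namespace that tracks member names.
        classdict = metacls.__prepare__(name, bases)
    else:
        # A plain dict has no _member_names attribute.
        classdict = {}
    for member_name, member_value in members.items():
        classdict[member_name] = member_value
    return metacls(name, bases, classdict)


members = {"NORTH": 0, "SOUTH": 1}

# Rebuilding with the prepared namespace yields a working enum.
Direction = build_enum("Direction", members, use_prepare=True)
member_names = [m.name for m in Direction]

# Rebuilding with a plain dict is expected to fail like the traceback above.
try:
    build_enum("Direction", members, use_prepare=False)
    plain_dict_error = None
except AttributeError as exc:
    plain_dict_error = exc
```

If the assumption holds, `plain_dict_error` carries the same kind of `AttributeError` as the stack trace. That would suggest the enum is being reconstructed on the executor from a namespace that lacks the bookkeeping attributes the metaclass expects.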

 


