Posted to issues@spark.apache.org by "rohit agarwal (Jira)" <ji...@apache.org> on 2020/10/20 04:25:00 UTC

[jira] [Created] (SPARK-33188) PipelineModel load resulting in error

rohit agarwal created SPARK-33188:
-------------------------------------

             Summary: PipelineModel load resulting in error
                 Key: SPARK-33188
                 URL: https://issues.apache.org/jira/browse/SPARK-33188
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.0.1
         Environment: Spark 3.0.1, Python 3.6.8, numpy 1.18.5
            Reporter: rohit agarwal


Steps to reproduce:
 # Define the pipeline: Pipeline(stages=[discretizer, one_hot_encoder, cv1, cv2, assembler])
 # Save the PipelineModel: PipelineModel.write().save('/path')
 # Load the PipelineModel: PipelineModel.load('/path')

Loading the model raises the following error:

TypeError: array() argument 1 must be a unicode character, not bytes

The error is raised in the PickleSerializer class in spark-3.0.1-bin-hadoop2.7/python/pyspark/serializers.py.

Changing pickle.loads(obj, encoding=encoding) to pickle.loads(obj) resolves the error.
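
The reported message matches what Python's array.array constructor raises when its type code arrives as bytes instead of str. One plausible mechanism, which this report does not confirm, is that a Python 2 style string in the saved data is unpickled with encoding="bytes". The sketch below uses only the standard library; the hand-written pickle payload and all variable names are illustrative assumptions, not data taken from a saved PipelineModel.

```python
import array
import pickle

# Hand-written protocol 0 pickle holding the 8-bit string 'd', encoded with
# the STRING opcode that Python 2 uses for str objects (assumed payload).
py2_style_pickle = b"S'd'\n."

# With the default encoding ("ASCII"), the value is decoded to a Python 3 str.
as_text = pickle.loads(py2_style_pickle)

# With encoding="bytes", the same value is returned as a bytes object.
as_bytes = pickle.loads(py2_style_pickle, encoding="bytes")

# A str type code is accepted by array.array.
ok_array = array.array(as_text, [1.0])

# A bytes type code is rejected with a TypeError.
error_message = None
try:
    array.array(as_bytes, [1.0])
except TypeError as exc:
    error_message = str(exc)

print(repr(as_text), repr(as_bytes))
print(error_message)
```

If this is indeed the cause, removing the encoding argument lets pickle fall back to its default decoding, which would be consistent with the workaround described above.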

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
