Posted to issues@spark.apache.org by "koba (Jira)" <ji...@apache.org> on 2022/06/21 12:33:00 UTC

[jira] [Created] (SPARK-39544) setPredictionCol for OneVsRest does not persist when saving model to disk

koba created SPARK-39544:
----------------------------

             Summary: setPredictionCol for OneVsRest does not persist when saving model to disk
                 Key: SPARK-39544
                 URL: https://issues.apache.org/jira/browse/SPARK-39544
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 3.3.0, 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.3, 3.0.2, 3.0.1, 3.0.0
         Environment: Python 3.6, Spark 3.2
            Reporter: koba


The name set for `rawPredictionCol` in `OneVsRest` does not persist after saving and loading a trained model: the loaded model reverts to the default `rawPrediction`. This becomes an issue when I try to stack multiple One-vs-Rest models in a pipeline. Code example below.

```
from pyspark.ml.classification import LinearSVC, OneVsRest, OneVsRestModel

data_path = "/sample_multiclass_classification_data.txt"
df = spark.read.format("libsvm").load(data_path)
lr = LinearSVC(regParam=0.01)
# set the name of rawPrediction column
ovr = OneVsRest(classifier=lr, rawPredictionCol='raw_prediction')
print(ovr.getRawPredictionCol())
model = ovr.fit(df)
model_path = 'temp' + "/ovr_model"
model.write().overwrite().save(model_path)
model2 = OneVsRestModel.load(model_path)
model2.getRawPredictionCol()

Output:

raw_prediction
'rawPrediction'
```
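
A possible workaround until this is fixed is to re-apply the column name after loading. The sketch below is not part of Spark: `restore_raw_prediction_col` is a hypothetical helper name, and it assumes the loaded model exposes `getRawPredictionCol()` and `setRawPredictionCol()`, as `OneVsRestModel` does in PySpark 3.x as far as I can tell.

```python
def restore_raw_prediction_col(model, expected_col):
    """Re-apply the raw prediction column name on a loaded model.

    Hypothetical helper: `model` is assumed to provide
    getRawPredictionCol() and setRawPredictionCol().
    """
    # Only touch the model if the saved name was lost on load.
    if model.getRawPredictionCol() != expected_col:
        model.setRawPredictionCol(expected_col)
    return model
```

With the example above, this would be called as `model2 = restore_raw_prediction_col(OneVsRestModel.load(model_path), 'raw_prediction')`. The caller still has to know the intended column name, so this only hides the problem rather than fixing the persistence itself.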


