Posted to issues@spark.apache.org by "John Bauer (Jira)" <ji...@apache.org> on 2019/10/31 20:36:00 UTC

[jira] [Created] (SPARK-29691) Estimator fit method fails to copy params (in PySpark)

John Bauer created SPARK-29691:
----------------------------------

             Summary: Estimator fit method fails to copy params (in PySpark)
                 Key: SPARK-29691
                 URL: https://issues.apache.org/jira/browse/SPARK-29691
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.4.4
            Reporter: John Bauer


The Estimator `fit` method is supposed to copy a dictionary of params (using `copy`, implemented in Params), overwriting the estimator's previous values, before fitting the model.  However, the parameter values are not updated.  This was observed in PySpark, but the bug may lie in the underlying Java objects, since the PySpark code itself appears to be functioning correctly.

For example, the script below prints

{{Before: 0.8
After: 0.8}}

but the After value should be 0.75:

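{{from pyspark.ml.classification import LogisticRegression
from pyspark.sql import SparkSession

# Reuse the active session (e.g. in the pyspark shell) or create one
spark = SparkSession.builder.getOrCreate()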

# Load training data
training = spark \
    .read \
    .format("libsvm") \
    .load("data/mllib/sample_multiclass_classification_data.txt")

lr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8)
print("Before:", lr.getOrDefault("elasticNetParam"))

# Fit the model, but with an updated parameter setting:
lrModel = lr.fit(training, params={"elasticNetParam" : 0.75})

print("After:", lr.getOrDefault("elasticNetParam"))}}
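
As a rough illustration of the copy step described in this report, here is a minimal plain-Python sketch. It assumes that `fit` applies the overrides to a temporary copy of the estimator and fits that copy. `SketchEstimator` and its methods are hypothetical stand-ins, not PySpark classes, and nothing here confirms how PySpark 2.4.4 actually behaves. Under that assumption, an override would show up in the parameters used for fitting, while the original estimator would keep its own value:

```python
import copy


class SketchEstimator:
    """Hypothetical stand-in for an estimator; not a PySpark class."""

    def __init__(self, **params):
        self._param_map = dict(params)

    def get(self, name):
        return self._param_map[name]

    def copy(self, extra=None):
        # Shallow-copy the estimator, then layer the overrides on the copy only.
        that = copy.copy(self)
        that._param_map = dict(self._param_map)
        that._param_map.update(extra or {})
        return that

    def _fit(self, dataset):
        # A real estimator would train here; the sketch only records the params used.
        return {"params_used": dict(self._param_map), "rows": len(dataset)}

    def fit(self, dataset, params=None):
        # Assumed copy-then-fit: overrides go to a temporary copy, which does the fitting.
        if params:
            return self.copy(params)._fit(dataset)
        return self._fit(dataset)


est = SketchEstimator(maxIter=10, regParam=0.3, elasticNetParam=0.8)
model = est.fit([1, 2, 3], params={"elasticNetParam": 0.75})
print("Estimator:", est.get("elasticNetParam"))
print("Model:", model["params_used"]["elasticNetParam"])
```

If PySpark follows this pattern, inspecting the parameters of the fitted model rather than those of `lr` would be one way to tell whether the override was applied during fitting.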
