Posted to issues@spark.apache.org by "Lucas Partridge (Jira)" <ji...@apache.org> on 2023/03/16 13:56:00 UTC

[jira] [Created] (SPARK-42825) setParams() only sets explicitly named params. Is this intentional or a bug?

Lucas Partridge created SPARK-42825:
---------------------------------------

             Summary: setParams() only sets explicitly named params. Is this intentional or a bug?
                 Key: SPARK-42825
                 URL: https://issues.apache.org/jira/browse/SPARK-42825
             Project: Spark
          Issue Type: Question
          Components: ML, PySpark
    Affects Versions: 3.3.2
            Reporter: Lucas Partridge


The Python signature and docstring of the setParams() method on the estimators and transformers under pyspark.ml imply that any named param you don't pass in the call will be reset to its default value.

Example from [https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.clustering.GaussianMixture.html#pyspark.ml.clustering.GaussianMixture.setParams] :

{code:python}
setParams(self, *, featuresCol="features", predictionCol="prediction", k=2, probabilityCol="probability", tol=0.01, maxIter=100, seed=None, aggregationDepth=2, weightCol=None)
{code}
In the extreme this would imply that if you called setParams() with no args then _all_ the params would be reset to their default values.

But what actually happens is that _only_ the params passed in the call get changed; the values of any other params aren't affected. So if you call setParams() with no args then _no_ params get changed!
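
To illustrate the behavior described above without needing a Spark session, here is a minimal pure-Python sketch. It assumes the mechanism is a keyword_only-style decorator that records only the keyword arguments the caller actually passed. ToyEstimator and its _params dict are hypothetical stand-ins, not the real pyspark.ml classes, and the decorator here is a simplified imitation rather than the library's source.

```python
from functools import wraps


def keyword_only(func):
    """Simplified imitation: remember only the kwargs the caller passed."""
    @wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError(f"Method {func.__name__} forces keyword arguments.")
        # Defaults from the signature never show up in kwargs.
        self._input_kwargs = kwargs
        return func(self, **kwargs)
    return wrapper


class ToyEstimator:
    """Hypothetical stand-in for a pyspark.ml estimator (assumption)."""

    def __init__(self):
        self._params = {"k": 2, "tol": 0.01, "maxIter": 100}

    @keyword_only
    def setParams(self, *, k=2, tol=0.01, maxIter=100):
        # Apply only the explicitly named params; ignore signature defaults.
        self._params.update(self._input_kwargs)
        return self


est = ToyEstimator()
est.setParams(k=5, maxIter=10)
est.setParams(tol=0.5)   # k and maxIter keep the values set earlier
after_partial = dict(est._params)

est.setParams()          # no args: nothing is reset
after_empty = dict(est._params)

print(after_partial)
print(after_empty)
```

In this sketch the defaults shown in the signature are effectively documentation only: because the method body reads the recorded kwargs rather than its own parameters, a param that is not named in the call is never touched.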

So is this behavior by design? Judging by the name of the method, I guess it is, but it is counter-intuitive given the signature and docstring. If the behavior is intentional, then perhaps the default docstring should make this explicit by saying something like:

"Sets the named params. The values of other params are not affected."


