Posted to issues@spark.apache.org by "Nils Skotara (JIRA)" <ji...@apache.org> on 2019/07/08 08:45:00 UTC

[jira] [Created] (SPARK-28295) Is there a way of getting feature names from pyspark.ml.regression GeneralizedLinearRegression?

Nils Skotara created SPARK-28295:
------------------------------------

             Summary: Is there a way of getting feature names from pyspark.ml.regression GeneralizedLinearRegression?
                 Key: SPARK-28295
                 URL: https://issues.apache.org/jira/browse/SPARK-28295
             Project: Spark
          Issue Type: Request
          Components: Build
    Affects Versions: 2.3.1
            Reporter: Nils Skotara
             Fix For: 2.3.1


In pyspark.ml.regression, when I fit a GeneralizedLinearRegression like this:

from pyspark.ml.regression import GeneralizedLinearRegression

glr = GeneralizedLinearRegression(family="gaussian", link="identity",
                                  regParam=0.3, maxIter=10)
model = glr.fit(someData)

It seems there is no way to match the features to their coefficients or standard errors. Right now I am using an ugly workaround like this:



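# Read the private coefficientsWithStatistics field of the JVM summary via reflection.
field = model.summary._call_java('getClass').getDeclaredField("coefficientsWithStatistics")
object2 = model._call_java('summary')
field.setAccessible(True)
value = field.get(object2)

# Map each feature name to its coefficient.
coef_value = {}

for i in range(len(value)):
   # Each row prints as a tuple string that starts with "(name,coefficient".
   row = value[i].toString()
   values = row.split(',')
   coef_value[values[0].replace('(', '').replace(')', '')] = float(values[1])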
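
The string-parsing step of the workaround above can be tried without a Spark session. The following is only a sketch: the helper name parse_coefficient_rows and the sample rows are hypothetical, and it assumes each row prints as "(name,number,number,...)" with the coefficient as the first number, as the workaround itself assumes.

```python
def parse_coefficient_rows(rows):
    """Map each feature name to the list of numbers that follow it in the row."""
    parsed = {}
    for row in rows:
        # Drop the surrounding parentheses, then split into name and numbers.
        # Assumption: the feature name itself contains no comma.
        name, *stats = row.strip()[1:-1].split(",")
        parsed[name] = [float(s) for s in stats]
    return parsed


# Hypothetical rows, shaped like the tuple strings read in the workaround.
sample_rows = [
    "(Intercept,1.5,0.25,6.0,0.001)",
    "(age,0.5,0.125,4.0,0.01)",
]
coef_table = parse_coefficient_rows(sample_rows)
coef_only = {name: stats[0] for name, stats in coef_table.items()}
```

Keeping every number in the row, rather than only the first, leaves room for the other statistics if the row carries them.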


Am I missing something?
If not, I'd like to request a method analogous to model.coefficients that returns the feature names in the matching order, for example model.features.
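
One possible way to recover the order without reflection, sketched here under assumptions: when the features column was built by a transformer such as VectorAssembler, its schema metadata may carry an "ml_attr" entry listing each attribute's index and name (in PySpark it would be read from someData.schema["features"].metadata). The helper feature_names_from_metadata and the example metadata are hypothetical, and the sketch assumes every attribute has a name.

```python
def feature_names_from_metadata(metadata):
    """Return feature names ordered by vector index from ml_attr metadata."""
    attrs = metadata.get("ml_attr", {}).get("attrs", {})
    indexed = [
        (attr["idx"], attr["name"])
        for group in attrs.values()  # groups such as "numeric" or "binary"
        for attr in group
    ]
    return [name for _, name in sorted(indexed)]


# Hypothetical metadata in the shape described above.
example_metadata = {
    "ml_attr": {
        "attrs": {
            "numeric": [{"idx": 0, "name": "age"}, {"idx": 2, "name": "income"}],
            "binary": [{"idx": 1, "name": "smoker"}],
        },
        "num_attrs": 3,
    }
}
names = feature_names_from_metadata(example_metadata)
coefficients = [0.5, -1.25, 0.003]  # stand-in for list(model.coefficients)
named_coefficients = dict(zip(names, coefficients))
```

Sorting by idx matters because the attributes are grouped by type, not stored in vector order. If the metadata is absent, the function returns an empty list and the names have to come from somewhere else.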


