Posted to user@spark.apache.org by pseudo oduesp <ps...@gmail.com> on 2016/07/12 17:30:34 UTC

Feature importance IN random forest

Hi,
I use PySpark 1.5.0.
Could you tell me how to get feature importances for a random forest
algorithm in PySpark? Please give me an example.
Thanks in advance.

Re: Feature importance IN random forest

Posted by Yanbo Liang <yb...@gmail.com>.
Spark 1.5 only supports getting feature importances for
RandomForestClassificationModel and RandomForestRegressionModel in Scala.
This feature is not supported in PySpark until 2.0.0.

It's straightforward, just a few lines of code:

from pyspark.ml.classification import RandomForestClassifier

rf = RandomForestClassifier(numTrees=3, maxDepth=2, labelCol="indexed", seed=42)
model = rf.fit(td)
model.featureImportances

Then featureImportances gives you the per-feature importances as a Vector.

Thanks
Yanbo

2016-07-12 10:30 GMT-07:00 pseudo oduesp <ps...@gmail.com>:

> Hi,
> I use PySpark 1.5.0.
> Could you tell me how to get feature importances for a random forest
> algorithm in PySpark? Please give me an example.
> Thanks in advance.
>