Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/03/23 05:09:00 UTC

[jira] [Resolved] (SPARK-31169) Random Forest in SparkML 2.3.3 vs 2.4.x

     [ https://issues.apache.org/jira/browse/SPARK-31169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-31169.
----------------------------------
    Resolution: Invalid

Please ask questions on the mailing list before filing them as issues.

> Random Forest in SparkML 2.3.3 vs 2.4.x
> ---------------------------------------
>
>                 Key: SPARK-31169
>                 URL: https://issues.apache.org/jira/browse/SPARK-31169
>             Project: Spark
>          Issue Type: Question
>          Components: ML
>    Affects Versions: 2.3.3, 2.4.0, 2.4.3
>            Reporter: Nguyen Nhanduc
>            Priority: Major
>              Labels: MLLib, RandomForest, SparkML
>         Attachments: spark233.jpg, spark240.jpg, spark243.jpg
>
>
> Hi all,
> When I trained a model with the Random Forest algorithm, I got different results on different versions of Spark, even though the input, label ratio, and hyperparameters were the same for every training run. Detailed training results are in the attached files. The results with Spark 2.3.3 are much better, so I would like to ask whether there have been any changes to Random Forest (or other algorithms) in MLlib.
> Many thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
