Posted to issues@spark.apache.org by "Kai Sasaki (JIRA)" <ji...@apache.org> on 2014/12/07 03:34:14 UTC

[jira] [Commented] (SPARK-4607) Add random seed to GradientBoostedTrees

    [ https://issues.apache.org/jira/browse/SPARK-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237029#comment-14237029 ] 

Kai Sasaki commented on SPARK-4607:
-----------------------------------

[~josephkb] I think each tree in the iterations of GradientBoostedTrees is always trained on all of the training data. Is there any case in which we have to subsample when building the RandomForest? The current GradientBoostedTrees code uses a RandomForest without subsampling.

> Add random seed to GradientBoostedTrees
> ---------------------------------------
>
>                 Key: SPARK-4607
>                 URL: https://issues.apache.org/jira/browse/SPARK-4607
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> Gradient Boosted Trees does not take a random seed, but it uses randomness if the subsampling rate is < 1.  It should take a random seed parameter.
> This update will also help to make unit tests more stable by allowing determinism (using a small set of fixed random seeds).
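
As a rough illustration of the point in the description above (randomness only enters when the subsampling rate is < 1, and a fixed seed makes the result repeatable), here is a minimal sketch in plain Python. It is not the MLlib implementation. The function names, the Bernoulli sampling scheme, and the per-iteration `seed + i` derivation are all assumptions made for this example.

```python
import random


def subsample(rows, rate, seed):
    """Keep each row independently with probability `rate` (hypothetical helper)."""
    if rate >= 1.0:
        # No subsampling: every row is used, so no randomness is involved.
        return list(rows)
    rng = random.Random(seed)  # local generator, so the global state is untouched
    return [row for row in rows if rng.random() < rate]


def boosting_subsamples(rows, rate, num_iterations, seed):
    """Return one subsample per boosting iteration (hypothetical helper).

    Assumption for this sketch: each iteration derives its own seed as
    `seed + i`, so iterations can draw different rows while the whole run
    stays reproducible for a given `seed`.
    """
    return [subsample(rows, rate, seed + i) for i in range(num_iterations)]
```

With a fixed seed, two runs draw identical samples, which is the determinism a unit test would rely on. With a rate of 1.0 the seed has no effect.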



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org