Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/04/02 19:40:53 UTC

[jira] [Updated] (SPARK-5972) Cache residuals for GradientBoostedTrees during training

     [ https://issues.apache.org/jira/browse/SPARK-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-5972:
-------------------------------------
    Assignee: Manoj Kumar

> Cache residuals for GradientBoostedTrees during training
> --------------------------------------------------------
>
>                 Key: SPARK-5972
>                 URL: https://issues.apache.org/jira/browse/SPARK-5972
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Assignee: Manoj Kumar
>            Priority: Minor
>
> In gradient boosting, the current model's prediction is recomputed from scratch for every training instance on each iteration. The running residual (i.e., the cumulative prediction of the previously trained trees in the ensemble) should be cached instead. Caching could reduce both computation (only the most recently trained tree's predictions need to be computed and added) and communication (only the most recently trained tree needs to be shipped to the workers).
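>
> A minimal sketch of the idea, assuming squared-error loss and a hypothetical {{trainTree}} helper for fitting a tree to the current pseudo-residuals (an illustration only, not the actual MLlib implementation):
> {code}
> import org.apache.spark.mllib.regression.LabeledPoint
> import org.apache.spark.mllib.tree.model.DecisionTreeModel
> import org.apache.spark.rdd.RDD
>
> // Sketch only: `trainTree` stands in for whatever fits a regression tree
> // to the current pseudo-residuals (hypothetical, not an MLlib API).
> def boostWithCaching(
>     data: RDD[LabeledPoint],
>     numIterations: Int,
>     learningRate: Double,
>     trainTree: RDD[LabeledPoint] => DecisionTreeModel): Array[DecisionTreeModel] = {
>
>   val trees = new Array[DecisionTreeModel](numIterations)
>
>   // Cumulative prediction of all trees trained so far, cached per instance,
>   // so earlier trees never have to be re-run over the data.
>   var cumPred: RDD[Double] = data.map(_ => 0.0).cache()
>
>   for (m <- 0 until numIterations) {
>     // Pseudo-residuals from the cached predictions (squared-error loss).
>     val residualData = data.zip(cumPred).map { case (lp, pred) =>
>       LabeledPoint(lp.label - pred, lp.features)
>     }
>     val tree = trainTree(residualData)
>     trees(m) = tree
>
>     // Update the cache by adding only the newest tree's scaled predictions.
>     val updated = data.zip(cumPred).map { case (lp, pred) =>
>       pred + learningRate * tree.predict(lp.features)
>     }.cache()
>     updated.count() // materialize before dropping the old cached copy
>     cumPred.unpersist()
>     cumPred = updated
>   }
>   trees
> }
> {code}
> Note that each map closure captures only the newest tree, so each iteration ships one tree to the workers rather than the whole ensemble.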



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org