Posted to commits@spark.apache.org by sr...@apache.org on 2017/09/21 19:05:47 UTC
spark git commit: [SPARK-22075][ML] GBTs unpersist datasets cached by Checkpointer
Repository: spark
Updated Branches:
refs/heads/master 9cac249fd -> b21b806ec
[SPARK-22075][ML] GBTs unpersist datasets cached by Checkpointer
## What changes were proposed in this pull request?
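`PeriodicRDDCheckpointer` automatically keeps the last 3 datasets passed to `PeriodicRDDCheckpointer.update()` persisted.
In GBTs, the last 3 intermediate RDDs therefore remain cached after `fit()` returns. This patch calls `unpersistDataSet()` on both checkpointers before their checkpoints are deleted.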
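The leak can be illustrated with a small, Spark-free sketch. It is a toy model of the persist queue described above, not Spark's actual implementation: `FakeDataset`, `ToyCheckpointer`, `ToyCheckpointerDemo`, and the `maxPersisted` parameter are hypothetical names introduced only for illustration, and the only names borrowed from the real code are `update()` and `unpersistDataSet()`.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an RDD: it only tracks whether it is cached.
final class FakeDataset(val id: Int) {
  var cached: Boolean = false
}

// Toy model of the persist queue (assumption: at most `maxPersisted`
// datasets stay cached, and older ones are unpersisted on update()).
final class ToyCheckpointer(maxPersisted: Int = 3) {
  private val persistedQueue = mutable.Queue.empty[FakeDataset]

  // Persist the new dataset and evict the oldest ones beyond the limit.
  def update(ds: FakeDataset): Unit = {
    ds.cached = true
    persistedQueue.enqueue(ds)
    while (persistedQueue.size > maxPersisted) {
      persistedQueue.dequeue().cached = false
    }
  }

  // Unpersist everything still held in the queue.
  def unpersistDataSet(): Unit = {
    while (persistedQueue.nonEmpty) {
      persistedQueue.dequeue().cached = false
    }
  }
}

object ToyCheckpointerDemo {
  // Simulates 10 boosting iterations and counts datasets left cached.
  def stillCached(unpersistAtEnd: Boolean): Int = {
    val checkpointer = new ToyCheckpointer()
    val datasets = (1 to 10).map(new FakeDataset(_))
    datasets.foreach(checkpointer.update)
    if (unpersistAtEnd) checkpointer.unpersistDataSet()
    datasets.count(_.cached)
  }

  def main(args: Array[String]): Unit = {
    println(s"without final unpersist: ${stillCached(unpersistAtEnd = false)} cached")
    println(s"with final unpersist: ${stillCached(unpersistAtEnd = true)} cached")
  }
}
```

In this model, skipping the final `unpersistDataSet()` call leaves 3 datasets cached after the loop, while calling it leaves none, which mirrors the behaviour this patch addresses.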
## How was this patch tested?
Existing tests and a local test in spark-shell.
Author: Zheng RuiFeng <ru...@foxmail.com>
Closes #19288 from zhengruifeng/gbt_unpersist.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b21b806e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b21b806e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b21b806e
Branch: refs/heads/master
Commit: b21b806ecc55f15575833c1e859c35ae391ff369
Parents: 9cac249
Author: Zheng RuiFeng <ru...@foxmail.com>
Authored: Thu Sep 21 20:05:44 2017 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Thu Sep 21 20:05:44 2017 +0100
----------------------------------------------------------------------
.../scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala | 2 ++
1 file changed, 2 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/b21b806e/mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala b/mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala
index ce2bd7b..e32447a 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala
@@ -360,7 +360,9 @@ private[spark] object GradientBoostedTrees extends Logging {
logInfo("Internal timing for DecisionTree:")
logInfo(s"$timer")
+ predErrorCheckpointer.unpersistDataSet()
predErrorCheckpointer.deleteAllCheckpoints()
+ validatePredErrorCheckpointer.unpersistDataSet()
validatePredErrorCheckpointer.deleteAllCheckpoints()
if (persistedInput) input.unpersist()