Posted to commits@spark.apache.org by me...@apache.org on 2015/02/16 05:51:34 UTC

spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

Repository: spark
Updated Branches:
  refs/heads/master acf2558dc -> c78a12c4c


[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

If a stage is the last estimator in the Pipeline, there is no need to transform the data after fitting it, since no later stage would consume the result.

Author: Peter Rudenko <pe...@gmail.com>

Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:

d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c78a12c4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c78a12c4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c78a12c4

Branch: refs/heads/master
Commit: c78a12c4cc4d4312c4ee1069d3b218882d32d678
Parents: acf2558
Author: Peter Rudenko <pe...@gmail.com>
Authored: Sun Feb 15 20:51:32 2015 -0800
Committer: Xiangrui Meng <me...@databricks.com>
Committed: Sun Feb 15 20:51:32 2015 -0800

----------------------------------------------------------------------
 mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/c78a12c4/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index bb291e6..5607ed2 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -114,7 +114,9 @@ class Pipeline extends Estimator[PipelineModel] {
             throw new IllegalArgumentException(
               s"Do not support stage $stage of type ${stage.getClass}")
         }
-        curDataset = transformer.transform(curDataset, paramMap)
+        if (index < indexOfLastEstimator) {
+          curDataset = transformer.transform(curDataset, paramMap)
+        }
         transformers += transformer
       } else {
         transformers += stage.asInstanceOf[Transformer]
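
The effect of the new guard can be illustrated outside Spark with a minimal sketch. Only the `index < indexOfLastEstimator` check is taken from the diff above; the `Stage`, `Transformer`, and `Estimator` traits, the `Scale` and `MaxScaler` stages, the `transformCalls` counter, and the shape of the surrounding loop are simplified, hypothetical stand-ins rather than the real `org.apache.spark.ml` API.

```scala
// Simplified stand-ins for the Spark ML stage types (assumptions, not the real API).
trait Stage
trait Transformer extends Stage {
  def transform(data: Seq[Double]): Seq[Double]
}
trait Estimator extends Stage {
  def fit(data: Seq[Double]): Transformer
}

object PipelineSketch {
  // Counts transform calls so the effect of the guard is observable.
  var transformCalls = 0

  // Hypothetical transformer: multiplies every value by a fixed factor.
  final case class Scale(factor: Double) extends Transformer {
    def transform(data: Seq[Double]): Seq[Double] = {
      transformCalls += 1
      data.map(_ * factor)
    }
  }

  // Hypothetical estimator: learns the factor that rescales the maximum to 1.0.
  object MaxScaler extends Estimator {
    def fit(data: Seq[Double]): Transformer = Scale(1.0 / data.max)
  }

  def fit(stages: Seq[Stage], data: Seq[Double]): Seq[Transformer] = {
    // -1 when the pipeline contains no estimator at all.
    val indexOfLastEstimator = stages.lastIndexWhere(_.isInstanceOf[Estimator])
    var curDataset = data
    val transformers = Seq.newBuilder[Transformer]
    stages.zipWithIndex.foreach { case (stage, index) =>
      if (index <= indexOfLastEstimator) {
        val transformer = stage match {
          case e: Estimator   => e.fit(curDataset)
          case t: Transformer => t
          case _ =>
            throw new IllegalArgumentException(
              s"Do not support stage $stage of type ${stage.getClass}")
        }
        // The guard from the diff: skip the transform once the last
        // estimator has been fitted, because nothing consumes its output.
        if (index < indexOfLastEstimator) {
          curDataset = transformer.transform(curDataset)
        }
        transformers += transformer
      } else {
        // Stages after the last estimator are plain transformers and are
        // collected without touching the data.
        transformers += stage.asInstanceOf[Transformer]
      }
    }
    transformers.result()
  }
}
```

In this sketch, fitting the stages `Seq(Scale(2.0), MaxScaler, Scale(3.0))` calls `transform` only for the first stage; without the guard the model fitted by `MaxScaler` would also transform the data, and that result would be discarded.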

