Posted to commits@spark.apache.org by sr...@apache.org on 2015/05/05 17:47:38 UTC

spark git commit: [MLLIB] [TREE] Verify size of input rdd > 0 when building meta data

Repository: spark
Updated Branches:
  refs/heads/master 9d250e64d -> d4cb38aeb


[MLLIB] [TREE] Verify size of input rdd > 0 when building meta data

Require a non-empty input RDD so that the first LabeledPoint can be taken to determine the feature size.

Author: Alain <ai...@usc.edu>
Author: aihe@usc.edu <ai...@usc.edu>

Closes #5810 from AiHe/decisiontree-issue and squashes the following commits:

3b1d08a [aihe@usc.edu] [MLLIB][tree] merge the assertion into the evaluation of numFeatures
cf2e567 [Alain] [MLLIB][tree] Use a rdd api to verify size of input rdd > 0 when building meta data
b448f47 [Alain] [MLLIB][tree] Verify size of input rdd > 0 when building meta data


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d4cb38ae
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d4cb38ae
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d4cb38ae

Branch: refs/heads/master
Commit: d4cb38aeb7412a353c6cbca2a9b8f9729afbaba7
Parents: 9d250e6
Author: Alain <ai...@usc.edu>
Authored: Tue May 5 16:47:34 2015 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue May 5 16:47:34 2015 +0100

----------------------------------------------------------------------
 .../org/apache/spark/mllib/tree/impl/DecisionTreeMetadata.scala | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d4cb38ae/mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DecisionTreeMetadata.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DecisionTreeMetadata.scala b/mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DecisionTreeMetadata.scala
index f1a6ed2..f73896e 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DecisionTreeMetadata.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DecisionTreeMetadata.scala
@@ -107,7 +107,10 @@ private[tree] object DecisionTreeMetadata extends Logging {
       numTrees: Int,
       featureSubsetStrategy: String): DecisionTreeMetadata = {
 
-    val numFeatures = input.take(1)(0).features.size
+    val numFeatures = input.map(_.features.size).take(1).headOption.getOrElse {
+      throw new IllegalArgumentException(s"DecisionTree requires size of input RDD > 0, " +
+        s"but was given by empty one.")
+    }
     val numExamples = input.count()
     val numClasses = strategy.algo match {
       case Classification => strategy.numClasses
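
The change above can be illustrated without a Spark cluster. The following is a minimal sketch, not the MLlib implementation: `Point` and `NumFeaturesSketch` are hypothetical names, a plain Scala `Seq` stands in for the input RDD, and features are simplified to `Array[Double]`. It contrasts the old indexing approach, which fails with an index error on empty input, with the `headOption.getOrElse` form that raises a descriptive `IllegalArgumentException`.

```scala
// Hypothetical stand-in for MLlib's LabeledPoint (assumption: features
// are simplified to a plain Array[Double] for this sketch).
final case class Point(label: Double, features: Array[Double])

object NumFeaturesSketch {

  // Mirrors the removed line: indexing element 0 of take(1) throws an
  // IndexOutOfBoundsException when the input is empty.
  def numFeaturesUnsafe(input: Seq[Point]): Int =
    input.take(1)(0).features.length

  // Mirrors the added lines: take at most one feature count and fail with
  // a descriptive message if there is none. Note that map on a Seq is
  // eager, whereas on an RDD it is lazy; the Seq is only a stand-in here.
  def numFeaturesChecked(input: Seq[Point]): Int =
    input.map(_.features.length).take(1).headOption.getOrElse {
      throw new IllegalArgumentException(
        "A non-empty input is required, but the input was empty.")
    }

  def main(args: Array[String]): Unit = {
    val points = Seq(Point(1.0, Array(0.1, 0.2, 0.3)))
    println(numFeaturesChecked(points))

    // Empty input now fails with a clear message instead of an index error.
    try numFeaturesChecked(Seq.empty)
    catch {
      case e: IllegalArgumentException => println(e.getMessage)
    }
  }
}
```

Folding the emptiness check into the evaluation of `numFeatures`, as the final squashed commit describes, means the caller gets the diagnostic from the same expression that reads the first element, so no separate check has to run before it.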

