Posted to reviews@spark.apache.org by smurching <gi...@git.apache.org> on 2017/10/25 21:49:17 UTC

[GitHub] spark pull request #19381: [SPARK-10884][ML] Support prediction on single in...

Github user smurching commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19381#discussion_r146986798
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala ---
    @@ -267,6 +268,24 @@ class DecisionTreeClassifierSuite
           Vector, DecisionTreeClassificationModel](newTree, newData)
       }
     
    +  test("prediction on single instance") {
    +    val rdd = continuousDataPointsForMulticlassRDD
    +    val dt = new DecisionTreeClassifier()
    +      .setImpurity("Gini")
    +      .setMaxDepth(4)
    +      .setMaxBins(100)
    +    val categoricalFeatures = Map(0 -> 3)
    +    val numClasses = 3
    +
    +    val newData: DataFrame = TreeTests.setMetadata(rdd, categoricalFeatures, numClasses)
    +    val newTree = dt.fit(newData)
    +
    +    newTree.transform(newData).select(dt.getFeaturesCol, dt.getPredictionCol).collect().foreach {
    +      case Row(features: Vector, prediction: Double) =>
    +        assert(prediction ~== newTree.predict(features) relTol 1E-5)
    --- End diff --
    
    Can we test exact equality (e.g. `prediction === newTree.predict(features)`) here and in other unit tests?
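
To illustrate the distinction behind this suggestion, here is a minimal standalone sketch (no Spark dependency). It assumes, as is the case for classifiers, that predictions are class labels encoded as Doubles, so the batch path and the single-instance path should agree exactly. The names `predict` and `approxEqual` are hypothetical stand-ins, not the Spark APIs: `predict` plays the role of `newTree.predict`, and `approxEqual` is only similar in spirit to the `~== ... relTol` helper used in the diff.

```scala
object ExactVsTolerance {
  // Hypothetical stand-in for a tree model's single-instance predict:
  // returns a class label encoded as a Double.
  def predict(features: Array[Double]): Double =
    if (features(0) < 0.5) 0.0
    else if (features(0) < 1.5) 1.0
    else 2.0

  // Hypothetical relative-tolerance check (an assumption for illustration,
  // not Spark's implementation of `~==`).
  def approxEqual(a: Double, b: Double, relTol: Double): Boolean =
    a == b || math.abs(a - b) <= relTol * math.max(math.abs(a), math.abs(b))

  def main(args: Array[String]): Unit = {
    val data = Seq(Array(0.0), Array(1.0), Array(2.0))

    // Stand-in for the batch path (transform): map predict over all rows.
    val batch = data.map(predict)

    // Both paths run the same logic, so exact equality holds.
    data.zip(batch).foreach { case (features, prediction) =>
      assert(prediction == predict(features))
    }

    // A tolerance check also accepts a slightly perturbed value,
    // which could hide a real divergence between the two paths.
    val perturbed = 2.0 + 1e-7
    assert(approxEqual(perturbed, 2.0, 1e-5))
    assert(perturbed != 2.0)
  }
}
```

The tolerance check passes for the perturbed value while the exact comparison does not, which is why an exact assertion is the stricter test when both code paths are expected to produce identical output.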

