
Posted to reviews@spark.apache.org by WeichenXu123 <gi...@git.apache.org> on 2018/05/04 09:56:02 UTC

[GitHub] spark pull request #21097: [SPARK-14682][ML] Provide evaluateEachIteration m...

Github user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21097#discussion_r186037589
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala ---
    @@ -365,6 +365,20 @@ class GBTClassifierSuite extends MLTest with DefaultReadWriteTest {
         assert(mostImportantFeature !== mostIF)
       }
     
    +  test("model evaluateEachIteration") {
    +    for (lossType <- Seq("logistic")) {
    +      val gbt = new GBTClassifier()
    +        .setMaxDepth(2)
    +        .setMaxIter(2)
    +        .setLossType(lossType)
    +      val model = gbt.fit(trainData.toDF)
    +      val eval1 = model.evaluateEachIteration(validationData.toDF)
    +      val eval2 = GradientBoostedTrees.evaluateEachIteration(validationData,
    --- End diff --
    
    I searched the scikit-learn docs and found no method similar to `evaluateEachIteration`; the closest option is `staged_predict` in `sklearn.ensemble.GradientBoostingRegressor`, which would require reimplementing almost all of the logic. I did not find such a method in the R package either.
    I have updated the unit test to compare against hardcoded results.
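
For reference, here is a minimal sketch of what reimplementing the per-iteration evaluation on top of staged predictions would involve. It is not the Spark implementation: the helper name `evaluate_each_iteration`, the squared-error loss, and the sample numbers are all assumptions chosen for illustration, and the function accepts any iterable of per-stage predictions rather than calling scikit-learn directly.

```python
from typing import Callable, Iterable, List, Sequence


def squared_error(label: float, prediction: float) -> float:
    """Per-row squared error (an assumed loss, chosen for simplicity)."""
    return (label - prediction) ** 2


def evaluate_each_iteration(
    staged_predictions: Iterable[Sequence[float]],
    labels: Sequence[float],
    loss: Callable[[float, float], float] = squared_error,
) -> List[float]:
    """Return the mean loss over all rows after each boosting stage.

    `staged_predictions` yields one sequence of predictions per stage,
    which is the shape of output that `staged_predict` produces.
    """
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    errors = []
    for predictions in staged_predictions:
        if len(predictions) != len(labels):
            raise ValueError("each stage must predict once per label")
        total = sum(loss(y, p) for y, p in zip(labels, predictions))
        errors.append(total / len(labels))
    return errors


# Hypothetical validation labels and predictions after stages 1 and 2.
labels = [1.0, 0.0, 2.0]
stages = [[0.5, 0.5, 1.0], [1.0, 0.0, 1.5]]
errors = evaluate_each_iteration(stages, labels)
```

With a fitted `GradientBoostingRegressor`, the same helper could presumably be called as `evaluate_each_iteration(model.staged_predict(X_val), y_val)`, where `model`, `X_val`, and `y_val` are placeholder names for the fitted estimator and the validation set.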



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org