Posted to reviews@spark.apache.org by jkbradley <gi...@git.apache.org> on 2018/04/03 18:00:50 UTC

[GitHub] spark pull request #20837: [SPARK-23686][ML][WIP] Better instrumentation

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20837#discussion_r178905201
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala ---
    @@ -517,6 +517,9 @@ class LogisticRegression @Since("1.2.0") (
             (new MultivariateOnlineSummarizer, new MultiClassSummarizer)
           )(seqOp, combOp, $(aggregationDepth))
         }
    +    instr.logNamedValue(Instrumentation.loggerTags.numExamples, summarizer.count)
    +    instr.logNamedValue("lowestLabelWeight", labelSummarizer.histogram.min.toString)
    +    instr.logNamedValue("highestLabelWeight", labelSummarizer.histogram.min.toString)
    --- End diff --
    
    I'm OK with not logging the full histogram here. There's a typo, though: "highestLabelWeight" is actually logging the min, not the max.
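
    A minimal sketch of the intended pairing, for illustration only. It assumes the histogram is a plain `Array[Double]`; `labelWeightEntries` is a hypothetical stand-in that returns the name/value pairs rather than calling `instr.logNamedValue`, so it can run without Spark:

    ```scala
    object LabelWeightLogging {
      // Hypothetical helper (not part of Spark): builds the name/value pairs
      // that would be passed to instr.logNamedValue.
      // Assumes `histogram` is non-empty; min/max throw on an empty array.
      def labelWeightEntries(histogram: Array[Double]): Seq[(String, String)] = Seq(
        "lowestLabelWeight" -> histogram.min.toString,
        // The fix under discussion: use max here, not min.
        "highestLabelWeight" -> histogram.max.toString
      )

      def main(args: Array[String]): Unit = {
        labelWeightEntries(Array(3.0, 10.0, 7.0)).foreach {
          case (name, value) => println(s"$name: $value")
        }
      }
    }
    ```

    In the PR itself, the change would presumably amount to swapping `.min` for `.max` on the "highestLabelWeight" line.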

