Posted to reviews@spark.apache.org by jkbradley <gi...@git.apache.org> on 2018/04/03 18:00:50 UTC
[GitHub] spark pull request #20837: [SPARK-23686][ML][WIP] Better instrumentation
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/20837#discussion_r178905201
--- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala ---
@@ -517,6 +517,9 @@ class LogisticRegression @Since("1.2.0") (
(new MultivariateOnlineSummarizer, new MultiClassSummarizer)
)(seqOp, combOp, $(aggregationDepth))
}
+ instr.logNamedValue(Instrumentation.loggerTags.numExamples, summarizer.count)
+ instr.logNamedValue("lowestLabelWeight", labelSummarizer.histogram.min.toString)
+ instr.logNamedValue("highestLabelWeight", labelSummarizer.histogram.min.toString)
--- End diff --
I'm OK with not logging the full histogram here. There's a typo, though: "highestLabelWeight" is actually logging the min, not the max. It should use labelSummarizer.histogram.max.
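For illustration, here is a minimal standalone sketch of what the corrected pair of values might look like. It assumes the histogram is an Array[Double]. The object LabelWeightLogging and the helper labelWeightEntries are hypothetical names that stand in for the instr.logNamedValue calls, so the snippet runs without Spark; it is not the code from the pull request.

```scala
// Hypothetical stand-in for the two instr.logNamedValue calls in the diff.
// It returns the (name, value) pairs instead of writing them to a log.
object LabelWeightLogging {

  // Assumes a non-empty histogram; min and max throw on an empty array.
  def labelWeightEntries(histogram: Array[Double]): Seq[(String, String)] = Seq(
    "lowestLabelWeight" -> histogram.min.toString,
    // The fix: use max here, where the diff under review used min.
    "highestLabelWeight" -> histogram.max.toString
  )

  def main(args: Array[String]): Unit = {
    // Made-up label weights, for demonstration only.
    val histogram = Array(3.0, 1.0, 7.5)
    labelWeightEntries(histogram).foreach { case (name, value) =>
      println(s"$name: $value")
    }
  }
}
```

With the made-up weights above, the two entries differ (1.0 and 7.5), whereas the version in the diff would report the same value for both names.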