Posted to reviews@spark.apache.org by WeichenXu123 <gi...@git.apache.org> on 2018/05/03 10:33:22 UTC

[GitHub] spark pull request #21218: [SPARK-24155][ML] Instrumentation improvements fo...

Github user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21218#discussion_r185756193
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
    @@ -423,6 +423,8 @@ class GaussianMixture @Since("2.0.0") (
         val summary = new GaussianMixtureSummary(model.transform(dataset),
           $(predictionCol), $(probabilityCol), $(featuresCol), $(k), logLikelihood)
         model.setSummary(Some(summary))
    +    instr.logNamedValue("logLikelihood", logLikelihood)
    +    instr.logNamedValue("clusterSizes", summary.clusterSizes.toString)
    --- End diff --
    
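    `summary.clusterSizes.toString` will not print the array's contents: calling
    `toString` on an array returns the JVM default representation (a type tag plus
    a hash code), which is unreadable in the logs.
    We need to log the elements of the array instead. I suggest adding a method like:
    `def logNamedArray[T](array: Array[T]): Unit`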
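
To illustrate the point, here is a minimal sketch of the difference between `toString` and an element-wise rendering. `ArrayLogSketch` and `formatArray` are hypothetical names used only for illustration, not part of Spark's instrumentation API, and the cluster sizes are made-up values.

```scala
// Hypothetical sketch: formatArray is an illustrative helper, not a Spark API.
object ArrayLogSketch {
  // Render the elements explicitly instead of relying on Array.toString.
  def formatArray[T](array: Array[T]): String =
    array.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    val clusterSizes = Array(120L, 45L, 35L) // made-up values

    // JVM default form for a long array: "[J@" followed by a hash code.
    println(clusterSizes.toString)

    // Element-wise rendering: [120, 45, 35]
    println(formatArray(clusterSizes))
  }
}
```

A method such as the suggested `logNamedArray` could presumably build its message along these lines before passing it to the logger, though the exact signature and format are up to the PR.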

