Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2017/10/03 07:19:55 UTC
[GitHub] spark pull request #19372: [SPARK-22156][MLLIB] Fix update equation of learn...
GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19372#discussion_r142328125
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -368,11 +371,12 @@ class Word2Vec extends Serializable with Logging {
var wc = wordCount
if (wordCount - lastWordCount > 10000) {
lwc = wordCount
- // TODO: discount by iteration?
- alpha =
- learningRate * (1 - numPartitions * wordCount.toDouble / (trainWordsCount + 1))
+ alpha = learningRate *
+ (1 - (numPartitions * wordCount.toDouble + numWordsProcessedInPreviousIterations) /
+ totalWordsCounts)
if (alpha < learningRate * 0.0001) alpha = learningRate * 0.0001
- logInfo("wordCount = " + wordCount + ", alpha = " + alpha)
+ logInfo("wordCount = " + (wordCount + numWordsProcessedInPreviousIterations) +
--- End diff --
If you update this again, you can use string interpolation: `logInfo(s"wordCount = ${wordCount + ...}, alpha = $alpha")`
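
For illustration only, here is a minimal standalone sketch of the interpolated form next to the updated alpha expression from the diff. It assumes the elided operand in the snippet above is `numWordsProcessedInPreviousIterations`, as on the added `logInfo` line; the object name, the helper names `decayedAlpha` and `logMessage`, and the numeric inputs are hypothetical and not part of `Word2Vec.scala`.

```scala
// Sketch under stated assumptions: AlphaLogSketch, decayedAlpha and logMessage
// are hypothetical names, and the numbers in main are made-up placeholders.
object AlphaLogSketch {

  // Mirrors the updated expression in the diff, including the lower bound
  // of learningRate * 0.0001 applied on the following line of the diff.
  def decayedAlpha(
      learningRate: Double,
      numPartitions: Int,
      wordCount: Long,
      numWordsProcessedInPreviousIterations: Long,
      totalWordsCounts: Long): Double = {
    val alpha = learningRate *
      (1 - (numPartitions * wordCount.toDouble + numWordsProcessedInPreviousIterations) /
        totalWordsCounts)
    math.max(alpha, learningRate * 0.0001)
  }

  // Builds the log text with the s-interpolator instead of string concatenation.
  def logMessage(
      wordCount: Long,
      numWordsProcessedInPreviousIterations: Long,
      alpha: Double): String =
    s"wordCount = ${wordCount + numWordsProcessedInPreviousIterations}, alpha = $alpha"

  def main(args: Array[String]): Unit = {
    val alpha = decayedAlpha(0.025, 4, 20000L, 100000L, 1000000L)
    println(logMessage(20000L, 100000L, alpha))
  }
}
```

In the actual method the resulting string would be passed to `logInfo(...)`; the sketch returns it so it can be run and checked without Spark's `Logging` trait.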
---