Posted to issues@spark.apache.org by "ZhongYu (JIRA)" <ji...@apache.org> on 2018/06/27 07:21:00 UTC
[jira] [Created] (SPARK-24666) Word2Vec generate infinity vectors when numIterations are large
ZhongYu created SPARK-24666:
-------------------------------
Summary: Word2Vec generate infinity vectors when numIterations are large
Key: SPARK-24666
URL: https://issues.apache.org/jira/browse/SPARK-24666
Project: Spark
Issue Type: Bug
Components: ML, MLlib
Affects Versions: 2.3.1
Environment: 2.0.X, 2.1.X, 2.2.X, 2.3.X
Reporter: ZhongYu
We found that Word2Vec generates vectors with large absolute values when numIterations is large, and if numIterations is large enough (>20), the vector values may be *Infinity* (or *-Infinity*), resulting in useless vectors.
In normal situations, with numIterations = 1, vector values mostly fall between -1.0 and 1.0.
The bug appears in Spark 2.0.X, 2.1.X, 2.2.X, and 2.3.X.
An earlier issue already reports this bug: https://issues.apache.org/jira/browse/SPARK-5261 , but the fix seems to be missing.
Other people's reports:
[https://stackoverflow.com/questions/49741956/infinity-vectors-in-spark-mllib-word2vec]
[http://apache-spark-user-list.1001560.n3.nabble.com/word2vec-outputs-Infinity-Infinity-vectors-with-increasing-iterations-td29020.html]
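
To check a trained model for the symptom described above, the word vectors can be scanned for non-finite components. The following is a minimal sketch in plain Python, not Spark code: the function name find_non_finite_vectors and the sample data are hypothetical, and it assumes the vectors have already been collected into a dict mapping each word to a list of floats (for example, from the model's getVectors output, converted as needed).

```python
import math


def find_non_finite_vectors(vectors):
    """Scan word vectors for Infinity, -Infinity, or NaN components.

    `vectors` is assumed to be a dict of word -> list of floats.
    Returns (sorted words with a non-finite component,
             largest absolute value among the fully finite vectors).
    """
    non_finite_words = []
    max_abs = 0.0
    for word, vec in vectors.items():
        if any(not math.isfinite(x) for x in vec):
            non_finite_words.append(word)
        else:
            # Track magnitude only for vectors that are entirely finite.
            max_abs = max(max_abs, max((abs(x) for x in vec), default=0.0))
    return sorted(non_finite_words), max_abs


# Hypothetical sample data, for illustration only.
sample = {
    "spark": [0.12, -0.48, 0.91],
    "vector": [float("inf"), 0.3, -0.2],
    "word": [0.05, float("-inf"), 0.4],
}

bad, largest = find_non_finite_vectors(sample)
print("words with non-finite vectors:", bad)
print("largest finite absolute value:", largest)
```

A largest finite absolute value far above 1.0 after increasing numIterations would match the growth reported here, even before any component reaches Infinity.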