Posted to user@spark.apache.org by Arthur Chan <ar...@gmail.com> on 2015/10/15 18:57:53 UTC
word2vec cosineSimilarity
Hi,
I am trying the sample word2vec code from
http://spark.apache.org/docs/latest/mllib-feature-extraction.html#example
These are my test results:
scala> for((synonym, cosineSimilarity) <- synonyms) {
| println(s"$synonym $cosineSimilarity")
| }
taiwan 2.0518918365726297
japan 1.8960962308732054
korea 1.8789320149319788
thailand 1.7549218525671182
mongolia 1.7375501108635814
The cosineSimilarity values I get are all greater than 1. Shouldn't
cosineSimilarity be between 0 and 1?
How can I get similarity values in the range 0 to 1?
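For reference, here is a minimal sketch of cosine similarity as it is usually defined: the dot product divided by the product of both vector norms. This is plain Scala and not Spark's implementation; the function name and the toy vectors are hypothetical. It assumes the raw word vectors are available as Array[Float] (for example from the model's getVectors map, if the Spark version in use exposes it). By this definition the result lies in [-1, 1], so scores above 1 would suggest that at least one of the norms is not being divided out.

```scala
// Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
// The result lies in [-1, 1]; it is not restricted to [0, 1].
def cosineSimilarity(a: Array[Float], b: Array[Float]): Double = {
  require(a.length == b.length, "vectors must have the same length")
  val dot = a.zip(b).map { case (x, y) => x.toDouble * y.toDouble }.sum
  val normA = math.sqrt(a.map(x => x.toDouble * x.toDouble).sum)
  val normB = math.sqrt(b.map(x => x.toDouble * x.toDouble).sum)
  // Treat a zero vector as having no similarity to anything.
  if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

// Hypothetical toy vectors, not taken from any trained model.
val parallel   = (Array(1.0f, 2.0f, 3.0f), Array(2.0f, 4.0f, 6.0f))
val orthogonal = (Array(1.0f, 0.0f), Array(0.0f, 1.0f))
val opposite   = (Array(1.0f, 2.0f), Array(-1.0f, -2.0f))

println(cosineSimilarity(parallel._1, parallel._2))
println(cosineSimilarity(orthogonal._1, orthogonal._2))
println(cosineSimilarity(opposite._1, opposite._2))
```

If a score strictly in 0 to 1 is wanted, one common convention (an assumption here, not something the Spark docs prescribe) is to rescale with (cos + 1) / 2.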
Regards