Posted to user@spark.apache.org by kaching <wa...@o2.pl> on 2016/10/07 08:05:02 UTC
MLlib: word2vec - words vectors into feature vector
Hi. How exactly does the MLlib implementation of word2vec convert word
vectors into one feature vector per row?
TEXT
[Hi, I, heard, ab..]
[I, wish, Java, c..]
[Logistic, regres.]
| word2vec
V
WORD VECTOR
heard [0.14950960874557...|
are [-0.1639076173305...|
neat [0.13949351012706...|
classes [0.03703496977686...|
I [-0.0189154129475...|
regression [0.15298652648925...|
Logistic [-0.1270201653242...|
Spark [-0.0535793155431...|
could [0.12216471135616...|
use [0.08246973901987...|
Hi [0.16548289358615...|
models [-0.0568316541612...|
case [0.11626788973808...|
about [-0.1500445008277...|
Java [-0.0407485179603...|
wish [0.11882393807172...|
| HOW?
V
TEXT RESULT
[Hi, I, heard, ab... ] [0.01849065460264...|
[I, wish, Java, c... ] [0.05958533100783...|
[Logistic, regres...] [-0.0110558800399...|
Is there a way to change this default method?
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
Re: MLlib: word2vec - words vectors into feature vector
Posted by Sean Owen <so...@cloudera.com>.
It's just the average of the word vectors, for all words in the text.
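To make the averaging concrete, here is a minimal sketch in plain Python (not Spark code). The function name, the toy two-dimensional vectors, and the choice to skip words that have no vector and divide by the number of words that do are all assumptions made for illustration; MLlib may treat unknown words differently. If a different aggregation is wanted, one option would be to take the per-word vectors and combine them yourself along the same lines.

```python
from typing import Dict, List, Sequence


def average_word_vectors(
    words: Sequence[str],
    word_vectors: Dict[str, List[float]],
    size: int,
) -> List[float]:
    """Average the vectors of the words in one row of text.

    Assumption of this sketch: words without a vector are skipped and the
    sum is divided by the number of words that were found.
    """
    total = [0.0] * size
    known = 0
    for word in words:
        vec = word_vectors.get(word)
        if vec is None:
            continue  # no vector for this word, so it is skipped
        for i, x in enumerate(vec):
            total[i] += x
        known += 1
    if known == 0:
        return total  # all zeros when no word has a vector
    return [x / known for x in total]


# Hypothetical toy vectors, not the values from the thread.
toy_vectors = {
    "Hi": [1.0, 0.0],
    "I": [0.0, 1.0],
    "heard": [2.0, 2.0],
}

doc_vector = average_word_vectors(["Hi", "I", "heard"], toy_vectors, 2)
print(doc_vector)
```

Swapping the final division for another reduction (for example a weighted sum) would give a different per-row feature vector from the same word vectors.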