
Posted to issues@spark.apache.org by "Maximilian Michels (JIRA)" <ji...@apache.org> on 2015/09/17 17:23:06 UTC

[jira] [Commented] (SPARK-2613) CLONE - word2vec: Distributed Representation of Words

    [ https://issues.apache.org/jira/browse/SPARK-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803072#comment-14803072 ] 

Maximilian Michels commented on SPARK-2613:
-------------------------------------------

User 'nikste' has created a pull request for this issue:
https://github.com/apache/flink/pull/1106

> CLONE - word2vec: Distributed Representation of Words
> -----------------------------------------------------
>
>                 Key: SPARK-2613
>                 URL: https://issues.apache.org/jira/browse/SPARK-2613
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Yifan Yang
>            Assignee: Xiangrui Meng
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We would like to add a parallel implementation of word2vec to MLlib. word2vec finds distributed representations of words by training on large data sets. The Spark programming model fits word2vec well because its training algorithm is embarrassingly parallel. Our initial implementation will focus on the skip-gram model and negative sampling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
