Posted to issues@spark.apache.org by "Asher Krim (JIRA)" <ji...@apache.org> on 2017/01/16 15:57:26 UTC

[jira] [Created] (SPARK-19247) improve ml word2vec save/load

Asher Krim created SPARK-19247:
----------------------------------

             Summary: improve ml word2vec save/load
                 Key: SPARK-19247
                 URL: https://issues.apache.org/jira/browse/SPARK-19247
             Project: Spark
          Issue Type: Bug
            Reporter: Asher Krim


ml word2vec models can be large (~4 GB is not uncommon). The current save implementation writes the model as a single large datum, which can cause RPC issues and make the save fail.

On the loading side, reading this single large datum back causes issues as well. This was already solved for mllib word2vec in https://issues.apache.org/jira/browse/SPARK-11994, but the change was never ported to the ml word2vec implementation.
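
To make the idea concrete, here is a minimal sketch of splitting a word -> vector table into several smaller chunks instead of one large datum. It is plain Python, not Spark code, and it is not the implementation from SPARK-11994. The function names, the 15-byte average word size, and the per-chunk byte limit are all assumptions chosen for illustration.

```python
import math

FLOAT_BYTES = 4       # word2vec vectors are assumed to hold 32-bit floats
AVG_WORD_BYTES = 15   # assumed average size of one vocabulary word


def approx_model_bytes(num_words, vector_size):
    """Rough size of the word -> vector table, ignoring object overhead."""
    return (FLOAT_BYTES * vector_size + AVG_WORD_BYTES) * num_words


def num_partitions(num_words, vector_size, max_chunk_bytes):
    """Chunk count that keeps the average chunk below max_chunk_bytes."""
    return approx_model_bytes(num_words, vector_size) // max_chunk_bytes + 1


def split_model(word_vectors, max_chunk_bytes):
    """Split a {word: vector} dict into lists of (word, vector) rows."""
    rows = list(word_vectors.items())
    if not rows:
        return []
    vector_size = len(rows[0][1])
    n = num_partitions(len(rows), vector_size, max_chunk_bytes)
    # Spread the rows evenly so no chunk is much larger than the others.
    chunk_len = math.ceil(len(rows) / n)
    return [rows[i:i + chunk_len] for i in range(0, len(rows), chunk_len)]
```

Under these assumptions, a model with 10 million words and 100-dimensional vectors is about 4.15 GB, and num_partitions(10_000_000, 100, 64 * 1024 * 1024) asks for 62 chunks of roughly 64 MB or less each, rather than one 4 GB datum.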



