Posted to dev@spark.apache.org by Maandy <dy...@gmail.com> on 2015/09/09 11:01:12 UTC

[MLlib] Extensibility of MLlib classes (Word2VecModel etc.)

Hey,

I'm trying to implement doc2vec
(http://cs.stanford.edu/~quocle/paragraph_vector.pdf), mainly for
sport/research purposes given all its limitations, so I would probably not
even try to PR it into MLlib itself. To implement it, it would be highly
useful to have access to MLlib's Word2VecModel class, which is mostly
private. Is there any reason (e.g. some Spark/MLlib guidelines) for that, or
would it be OK to refactor the code and make a PR? I've found a similar JIRA
issue that was posted almost a year ago, but for some reason it was closed:
https://issues.apache.org/jira/browse/SPARK-4101.

Mateusz



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Extensibility-of-MLlib-classes-Word2VecModel-etc-tp14011.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [MLlib] Extensibility of MLlib classes (Word2VecModel etc.)

Posted by Joseph Bradley <jo...@databricks.com>.
We tend to resist opening up APIs unless there's a strong reason to and we
feel reasonably confident that the API will remain stable. Keeping them
private lets us make fixes if we realize there are issues with those APIs.
But if you have an important use case, I'd recommend opening a JIRA to
discuss it.
Joseph
