
Posted to issues@spark.apache.org by "yuhao yang (JIRA)" <ji...@apache.org> on 2017/03/24 19:06:41 UTC

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

    [ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940960#comment-15940960 ] 

yuhao yang commented on SPARK-20082:
------------------------------------

Yes, that's one of the things that we should improve for LDA.
If you're interested in working on the issue, could you please first share a rough design, given the complexity of both the EM and Online optimizers and their models?

> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
>                 Key: SPARK-20082
>                 URL: https://issues.apache.org/jira/browse/SPARK-20082
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Mathieu D
>
> Some mllib models support an initialModel to start from and update it incrementally with new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally update an existing model with batches of new documents.
> I suggest adding an initialModel as a starting point for LDA.
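
For readers less familiar with the online update mentioned in the description, here is a minimal sketch of the general online variational Bayes step, not of Spark's implementation. The topic-word matrix lambda is blended with a per-batch estimate using a decaying step size rho_t = (tau0 + t)^(-kappa). The function names, the `warm_start` helper, and the idea of carrying over the iteration count are assumptions made for illustration; the per-batch estimate is passed in rather than computed, and the default values are only placeholders.

```python
def learning_rate(t, tau0=1024.0, kappa=0.51):
    """Step size rho_t = (tau0 + t) ** -kappa; shrinks as more batches are seen."""
    return (tau0 + t) ** (-kappa)


def blend_topics(lam, lam_hat, t, tau0=1024.0, kappa=0.51):
    """Return (1 - rho_t) * lam + rho_t * lam_hat, element by element.

    lam     : current topic-word weights (list of rows, one row per topic)
    lam_hat : estimate derived from the current mini-batch (same shape)
    t       : number of mini-batches already processed
    """
    rho = learning_rate(t, tau0, kappa)
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row_old, row_new)]
        for row_old, row_new in zip(lam, lam_hat)
    ]


def warm_start(initial_lam, batch_estimates, iterations_done=0,
               tau0=1024.0, kappa=0.51):
    """Hypothetical 'initialModel' behaviour: continue from existing weights.

    Instead of a random lambda and t = 0, start from initial_lam and from the
    number of batches the earlier run already consumed.
    """
    lam = [row[:] for row in initial_lam]  # copy so the input is not mutated
    t = iterations_done
    for lam_hat in batch_estimates:
        lam = blend_topics(lam, lam_hat, t, tau0, kappa)
        t += 1
    return lam, t


# Usage: two topics over a three-word vocabulary, one new mini-batch.
existing = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
new_batch = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
updated, seen = warm_start(existing, [new_batch], iterations_done=10)
```

Carrying over the iteration count keeps the step size small, so a few new batches nudge the existing topics rather than overwrite them. Whether that is the right choice, and how it would apply to the EM optimizer, is the kind of question a design proposal would need to answer.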


