Posted to issues@spark.apache.org by "Cezary Dendek (JIRA)" <ji...@apache.org> on 2017/05/16 11:52:04 UTC

[jira] [Updated] (SPARK-20767) The training continuation for saved LDA model

     [ https://issues.apache.org/jira/browse/SPARK-20767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cezary Dendek updated SPARK-20767:
----------------------------------
    Summary: The training continuation for saved LDA model  (was: The LDA model update)

> The training continuation for saved LDA model
> ---------------------------------------------
>
>                 Key: SPARK-20767
>                 URL: https://issues.apache.org/jira/browse/SPARK-20767
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.1.1
>            Reporter: Cezary Dendek
>            Priority: Minor
>
> The current online implementation of LDA model fitting (OnlineLDAOptimizer) supports neither updating an existing model (e.g., to account for population/covariate drift) nor continuing the fit when the number of iterations turned out to be insufficient.
> Technical aspects:
> 1. The LDA fitting implementation does not currently allow the coefficients to be preset (the setter is private), as noted by a comment in the source code of OnlineLDAOptimizer.setLambda: "This is only used for testing now. In the future, it can help support training stop/resume".
> 2. The optimizer always initializes the lambda matrix randomly; this would have to change so that a preset lambda matrix is used when one is supplied.
> Users cannot adapt the classes themselves, because the setters are not public and the classes are sealed or final.
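
The two technical points above can be illustrated with a small sketch. It is written in plain Python, independent of Spark, and is not Spark's implementation: the class OnlineLdaSketch and its methods set_lambda and initialize are hypothetical names, and the gamma shape of 100 used for the random start is an assumption made for illustration. It shows only the requested behavior: use a preset lambda matrix when one was supplied, and fall back to random initialization otherwise.

```python
import random


class OnlineLdaSketch:
    """Toy stand-in for an online LDA optimizer (hypothetical, not Spark code)."""

    def __init__(self, k, vocab_size, seed=0, gamma_shape=100.0):
        self.k = k                      # number of topics
        self.vocab_size = vocab_size    # number of terms
        self.seed = seed
        self.gamma_shape = gamma_shape  # assumed shape for the random start
        self._preset_lambda = None      # set only when resuming training
        self.lam = None                 # k x vocab_size topic-term matrix

    def set_lambda(self, lam):
        """Publicly preset lambda, e.g. from a previously saved model."""
        if len(lam) != self.k or any(len(row) != self.vocab_size for row in lam):
            raise ValueError("lambda must have shape k x vocab_size")
        # Copy so later updates do not mutate the caller's matrix.
        self._preset_lambda = [list(row) for row in lam]
        return self

    def initialize(self):
        """Use the preset lambda if there is one, else draw a random start."""
        if self._preset_lambda is not None:
            self.lam = [list(row) for row in self._preset_lambda]
        else:
            rng = random.Random(self.seed)
            scale = 1.0 / self.gamma_shape
            self.lam = [
                [rng.gammavariate(self.gamma_shape, scale)
                 for _ in range(self.vocab_size)]
                for _ in range(self.k)
            ]
        return self


# Resuming: lambda comes from a saved model instead of a random draw.
saved = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
resumed = OnlineLdaSketch(k=2, vocab_size=3).set_lambda(saved).initialize()

# Fresh fit: no preset, so lambda is randomly initialized.
fresh = OnlineLdaSketch(k=2, vocab_size=3, seed=1).initialize()
```

In this sketch the only change needed for stop/resume is that initialization checks for a preset matrix before drawing a random one; the subsequent update steps would be the same in both cases.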



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
