Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2019/05/23 19:19:00 UTC
[jira] [Created] (MADLIB-1352) Add warm start to LDA
Frank McQuillan created MADLIB-1352:
---------------------------------------
Summary: Add warm start to LDA
Key: MADLIB-1352
URL: https://issues.apache.org/jira/browse/MADLIB-1352
Project: Apache MADlib
Issue Type: New Feature
Components: Module: Parallel Latent Dirichlet Allocation
Reporter: Frank McQuillan
Fix For: v2.0
In LDA
http://madlib.apache.org/docs/latest/group__grp__lda.html
add a stopping criterion based on perplexity, rather than only on the number of iterations.
The suggested approach is to follow what scikit-learn does:
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html
evaluate_every : int, optional (default=0)
How often to evaluate perplexity. Only used in fit method. Set it to 0 or a negative number to not evaluate perplexity in training at all. Evaluating perplexity can help you check convergence in the training process, but it will also increase total training time. Evaluating perplexity in every iteration might increase training time up to two-fold.
perp_tol : float, optional (default=1e-1)
Perplexity tolerance in batch learning. Only used when evaluate_every is greater than 0.
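The stopping rule these two parameters describe could be sketched roughly as follows. This is an illustration only, not MADlib code and not a copy of the scikit-learn implementation: the function name train_with_perplexity_stop and the callbacks run_iteration and compute_perplexity are hypothetical placeholders for whatever the LDA training loop would provide.

```python
from typing import Callable, Optional


def train_with_perplexity_stop(
    run_iteration: Callable[[], None],        # hypothetical: performs one training iteration
    compute_perplexity: Callable[[], float],  # hypothetical: perplexity of the current model
    max_iter: int,
    evaluate_every: int = 0,
    perp_tol: float = 1e-1,
) -> int:
    """Run up to max_iter iterations and return how many were actually run."""
    last_perplexity: Optional[float] = None
    for i in range(max_iter):
        run_iteration()
        # Perplexity is only evaluated when evaluate_every is positive,
        # and then only on every evaluate_every-th iteration.
        if evaluate_every > 0 and (i + 1) % evaluate_every == 0:
            perplexity = compute_perplexity()
            # Stop once two consecutive evaluations differ by less than perp_tol.
            if (last_perplexity is not None
                    and abs(last_perplexity - perplexity) < perp_tol):
                return i + 1
            last_perplexity = perplexity
    # Fall back to the iteration limit, which is the existing behavior.
    return max_iter
```

With evaluate_every set to 0 the loop always runs for max_iter iterations, so the current behavior would remain the default.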
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)