Posted to user@spark.apache.org by heszak <hz...@collabware.com> on 2015/03/18 22:37:28 UTC

Does the newly released LDA (Latent Dirichlet Allocation) algorithm support n-grams?

I would like to know whether the newly released LDA (Latent Dirichlet Allocation)
algorithm supports only unigrams, or whether it also supports bigrams and trigrams.
If it does, can someone explain how I can use them?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Does-newly-released-LDA-Latent-Dirichlet-Allocation-algorithm-supports-ngrams-tp22131.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Does the newly released LDA (Latent Dirichlet Allocation) algorithm support n-grams?

Posted by Charles Earl <ch...@gmail.com>.
Heszak,
I have only glanced at it, but you should be able to incorporate tokens
approximating n-grams yourself, for example by using the Lucene
ShingleAnalyzerWrapper API:
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapper.html
You might also take a look at http://www.mimno.org/articles/phrases/
C
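
To illustrate the idea of adding n-gram tokens yourself, here is a minimal
sketch in plain Python. It is not the Lucene API and not part of Spark; the
function names ngram_tokens and term_counts, the underscore separator, and the
whitespace tokenizer are all assumptions made for illustration. Each n-gram is
joined into a single token, so a bag-of-words model would treat it as one more
vocabulary term.

```python
from collections import Counter


def ngram_tokens(tokens, max_n=2, sep="_"):
    """Return unigrams plus all contiguous n-grams up to max_n.

    Each n-gram is joined with `sep` into a single token, so downstream
    code sees it as an ordinary vocabulary term. (Hypothetical helper.)
    """
    out = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            out.append(sep.join(tokens[i:i + n]))
    return out


def term_counts(docs, max_n=2):
    """Build a shared vocabulary and one sparse {term_index: count} row per document.

    Tokenization here is a naive lowercase whitespace split (an assumption);
    a real pipeline would likely use a proper analyzer and stop-word removal.
    """
    vocab = {}
    rows = []
    for doc in docs:
        counts = Counter(ngram_tokens(doc.lower().split(), max_n))
        row = {}
        for term, count in counts.items():
            # Assign the next free index the first time a term is seen.
            index = vocab.setdefault(term, len(vocab))
            row[index] = count
        rows.append(row)
    return vocab, rows


if __name__ == "__main__":
    vocab, rows = term_counts(["Latent Dirichlet Allocation", "topic model topic model"])
    print(vocab)
    print(rows)
```

Each sparse row could then be converted into the term-count vector that an LDA
implementation expects, with the vocabulary size as the vector length. Note
that adding bigrams and trigrams grows the vocabulary quickly, so pruning rare
n-grams before training is probably worthwhile.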

On Wed, Mar 18, 2015 at 5:37 PM, heszak <hz...@collabware.com> wrote:

> I wonder to know whether the newly-released LDA (Latent Dirichlet
> Allocation)
> algorithm only supports uni-gram or it can also supports bi/tri-grams too?
> If it can, can someone help me how I can use them?


-- 
- Charles