Posted to dev@lucene.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/08/01 04:12:27 UTC

Lucene / Solr and Topic Modeling?

Hey Folks,

Does anyone know of a good ALv2-compatible approach to combining
Lucene and topic modeling? I’d rather not do it post-facto,
e.g. with a separate library, but actually perform topic modeling
such as LDA (or something else) while building the index.

The topic modeling needs to be scalable and dynamic, e.g., if I
change the year range in a query, the topics should update accordingly.
Is this possible with Lucene?

I’ve found this:

https://github.com/stepthom/lucene-lda


But it seems to stop short of actually calling a topic-modeling
library, e.g., MALLET.

Thanks for any help here.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Lucene / Solr and Topic Modeling?

Posted by Koji Sekiguchi <ko...@rondhuit.com>.
Hi Chris,

Just an FYI.

NLP4L has a function that extracts document vectors (in LIBSVM format) from a Lucene index.
Spark MLlib can then be used to run LDA on them.

We have a short tutorial about it. See the "Clustering" section
in the "Working with Spark" chapter.

http://nlp4l.github.io/tutorial.html#useWithSpark
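
For readers unfamiliar with the LIBSVM format mentioned above, here is a
minimal Python sketch of reading such document vectors. It assumes the
usual layout of one document per line, "<label> <index>:<value> ...".
The sample data, the function names, and the reading of each value as a
term weight are illustrative assumptions, not NLP4L's actual output.
Spark MLlib ships its own loader for this format
(MLUtils.loadLibSVMFile), so this only shows what the file contains.

```python
from typing import Dict, List, Tuple

# One parsed document: (label, sparse vector as {feature index: value}).
Doc = Tuple[float, Dict[int, float]]


def parse_libsvm_line(line: str) -> Doc:
    """Parse one line of the form '<label> <index>:<value> ...'."""
    parts = line.split()
    if not parts:
        raise ValueError("empty line")
    label = float(parts[0])
    features: Dict[int, float] = {}
    for token in parts[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


def parse_libsvm(text: str) -> List[Doc]:
    """Parse every non-blank line of a LIBSVM-format string."""
    return [parse_libsvm_line(l) for l in text.splitlines() if l.strip()]


# Hypothetical sample: two documents over a five-term vocabulary.
sample = """\
0 1:2 3:1
1 2:4 3:1 5:3
"""

docs = parse_libsvm(sample)
for label, vector in docs:
    print(label, sorted(vector.items()))
```

Each dictionary is a sparse term vector for one document, which is the
kind of input an LDA implementation consumes.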

Koji


