Posted to user@spark.apache.org by Paolo Platter <pa...@agilelab.it> on 2014/09/10 16:36:44 UTC

Spark & NLP

Hi all,

What is your preferred Scala NLP library, and why?
Are there any items on Spark’s roadmap for integrating NLP features?

I basically need to perform NER line by line, so I don’t need deep integration with the distributed engine.
I only want simple dependencies and the ability to build a dictionary for the Italian language.

Any suggestions?

Thanks

Paolo Platter


Re: Spark & NLP

Posted by Oleksandr Olgashko <al...@gmail.com>.
Factorie (https://github.com/factorie/factorie) might be what you need (I'd
suggest a conditional random field or maximum entropy Markov model for NER).
You can also use Java libraries (for example, I'm using OpenNLP's
implementation of MEMM for a close-to-NER problem in Scala).
Chalk (https://github.com/scalanlp/chalk) is an OpenNLP-based NLP library; it
may also be a good fit.
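
To make the "NER line by line with a custom dictionary" requirement from the original question concrete, here is a minimal sketch of a plain gazetteer lookup in Scala. It does not use Factorie, OpenNLP, or Chalk; the object name DictionaryNer, the Entity class, and the toy Italian entries are all hypothetical and only serve the illustration. A statistical model such as a CRF or MEMM would replace the lookup step in practice.

```scala
// Hypothetical gazetteer-based tagger; every name here is illustrative only.
object DictionaryNer {
  // A recognized entity: matched surface text, its label, and its token offset.
  final case class Entity(text: String, label: String, start: Int)

  // Toy Italian gazetteer; keys are lower-cased tokens joined by a single space.
  val gazetteer: Map[String, String] = Map(
    "roma" -> "LOC",
    "milano" -> "LOC",
    "reggio emilia" -> "LOC",
    "leonardo da vinci" -> "PER"
  )

  // Length in tokens of the longest gazetteer entry.
  private val maxLen: Int = gazetteer.keys.map(_.split(" ").length).max

  // Split on anything that is not a Unicode letter or digit.
  def tokenize(line: String): Vector[String] =
    line.split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty).toVector

  // Greedy longest-match lookup, scanning the line left to right.
  def tag(line: String): List[Entity] = {
    val tokens = tokenize(line)
    val found = List.newBuilder[Entity]
    var i = 0
    while (i < tokens.length) {
      val longest = math.min(maxLen, tokens.length - i)
      // Try the longest candidate span first, then progressively shorter ones.
      val hit = (longest to 1 by -1).iterator
        .map(n => tokens.slice(i, i + n))
        .find(span => gazetteer.contains(span.mkString(" ").toLowerCase))
      hit match {
        case Some(span) =>
          val surface = span.mkString(" ")
          found += Entity(surface, gazetteer(surface.toLowerCase), i)
          i += span.length // skip past the matched tokens
        case None =>
          i += 1
      }
    }
    found.result()
  }
}
```

Because tag is a pure function from one line to a list of entities, it could be applied per record with something like lines.map(DictionaryNer.tag) on an RDD[String], with no deeper integration with the engine.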

Re: Spark & NLP

Posted by andy petrella <an...@gmail.com>.
I've never tried it, but it might fit your needs: http://www.scalanlp.org/
It's the parent project of both Breeze (already part of Spark) and Epic.

However, you'll have to train a model for Italian (it's not on the list of supported languages).

(Actually, I've never used it because for my very small needs I generally just
run a small naive Bayes trial, which more or less does the job ^^)

aℕdy ℙetrella
about.me/noootsab
