Posted to users@opennlp.apache.org by "Jim - FooBar();" <ji...@gmail.com> on 2013/01/11 20:13:42 UTC
[ANN] - annotator-clj gets built-in support for openNLP-compatible
annotations (NameFinder)
Hi all,
I finally got the chance to clean up and essentially revisit some bits
of code that have helped me a lot over the past year. I put it all
together in a project and open-sourced it in case anyone else finds it
useful. The project is a high-performance, dictionary-based annotator
which can be tuned for openNLP, stanfordNLP or some custom NER
engine. Features include:
* openNLP or stanfordNLP or custom NER component compatibility
* fully parallel annotations of separate documents (optional)
* flexible API that can deal with multiple dictionaries per document
(merges them into a set)
* custom tags are supported and can be provided directly on the
command-line
* basic normalisation is applied to the dictionary entries
(they are un-capitalised, unless they are written entirely in capitals)
* options to merge all the annotations together in a single file or
write them separately in a dedicated directory
* fully functional command-line interface
* fully usable from any JVM-based language
* non-reflective source code
* data-centric & immutable API
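To give a rough idea of what a few of these features amount to, here is a minimal sketch in Python. It is not the project's code or API (annotator-clj is written in Clojure); every name in it (normalise, merge_dictionaries, annotate, annotate_all) is hypothetical, and the normalisation rule is only one reading of the bullet above. The only thing taken as given is the openNLP NameFinder markup, which wraps each entity as <START:tag> ... <END>.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def normalise(entry):
    """Un-capitalise an entry unless it is written entirely in capitals.

    Assumption: 'un-capitalisation' means lower-casing the first letter.
    """
    entry = entry.strip()
    if entry.isupper():  # e.g. acronyms such as "NSAID" are left alone
        return entry
    return entry[:1].lower() + entry[1:]


def merge_dictionaries(*dictionaries):
    """Merge any number of dictionaries (iterables of strings) into one set."""
    merged = set()
    for dictionary in dictionaries:
        merged.update(normalise(e) for e in dictionary if e.strip())
    return merged


def annotate(text, entries, tag="person"):
    """Wrap every dictionary hit in openNLP NameFinder style markup."""
    if not entries:
        return text
    # Try longer entries first so multi-word entries beat their prefixes.
    alternatives = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(e) for e in alternatives) + r")\b"
    )
    return pattern.sub(lambda m: f"<START:{tag}> {m.group(1)} <END>", text)


def annotate_all(documents, entries, tag="person", parallel=False):
    """Annotate documents independently, optionally through a thread pool."""
    if not parallel:
        return [annotate(doc, entries, tag) for doc in documents]
    with ThreadPoolExecutor() as pool:
        # map() returns results in the same order as the input documents.
        return list(pool.map(lambda doc: annotate(doc, entries, tag), documents))


if __name__ == "__main__":
    drugs = merge_dictionaries(["Aspirin", "NSAID"], ["Ibuprofen", "aspirin"])
    docs = ["The patient took aspirin, an NSAID.", "No ibuprofen was given."]
    for line in annotate_all(docs, drugs, tag="drug", parallel=True):
        print(line)
```

Each document is annotated on its own and the merged dictionary is never modified after it is built, which is what makes the parallel option safe to switch on. In Python the thread pool only illustrates the shape of the idea; on the JVM the same structure can actually use several cores.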
The project lives here: https://github.com/jimpil/annotator-clj
Feel free to try it out...:)
Jim