Posted to users@opennlp.apache.org by "Jim - FooBar();" <ji...@gmail.com> on 2013/01/11 20:13:42 UTC

[ANN] - annotator-clj gets built-in support for openNLP compatible annotations (NameFinder)

Hi all,

I finally got the chance to clean up and revisit some bits of code that 
have helped me a lot over the past year. I put it all together in a 
project and open-sourced it, just in case anyone else finds it useful. 
The project is a high-performance, dictionary-based annotator which can 
be tuned for openNLP, stanfordNLP, or some custom NER engine. Features 
include:

  * openNLP or stanfordNLP or custom NER component compatibility
  * fully parallel annotations of separate documents (optional)
  * flexible API that can deal with multiple dictionaries per document
    (merges them into a set)
  * custom tags are supported and can be provided directly on the
    command-line
  * basic normalisation is applied to the dictionary entries
    (un-capitalisation - unless they are all capital)
  * options to merge all the annotations together in a single file or
    write them separately to a dedicated directory
  * fully functional command-line interface
  * fully usable from any JVM-based language
  * non-reflective source code
  * data-centric & immutable API
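
To make the dictionary handling above a bit more concrete, here is a 
rough sketch in Python (not the project's own Clojure code). The function 
names are made up for illustration, and it assumes that "un-capitalisation" 
means lower-casing only the first letter of an entry; the actual 
implementation may well differ.

```python
def normalise_entry(entry: str) -> str:
    """Lower-case the first letter of an entry, unless the entry is all capitals.

    Assumption: "un-capitalisation" touches only the first character.
    All-capital entries (e.g. acronyms) are returned unchanged.
    """
    if entry.isupper():
        return entry
    return entry[:1].lower() + entry[1:]


def merge_dictionaries(*dictionaries):
    """Merge any number of dictionaries (iterables of strings) into one set.

    Entries are stripped of surrounding whitespace and normalised first,
    so duplicates that differ only in capitalisation collapse into one.
    Blank entries are skipped.
    """
    return {
        normalise_entry(entry.strip())
        for dictionary in dictionaries
        for entry in dictionary
        if entry.strip()
    }


if __name__ == "__main__":
    drugs = ["Aspirin", "DNA"]
    more_drugs = ["aspirin", "Ibuprofen", "  "]
    print(sorted(merge_dictionaries(drugs, more_drugs)))
```

Merging into a set means an entry that appears in several dictionaries is 
only looked up once per document.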

The project lives here: https://github.com/jimpil/annotator-clj

Feel free to try it out...:)

Jim