Posted to dev@opennlp.apache.org by Svetoslav Marinov <sv...@findwise.com> on 2012/01/12 15:42:29 UTC

POSTagger Perceptron API

Hi all,

There is a Perceptron model for the Swedish POS tagger. How does one call it with the API? I checked the API pages as well as the documentation, but there is only a reference to the MaxEnt model:

POSTaggerME tagger  = new POSTaggerME(model);

So what is the method for using the Perceptron model?

I am also curious about the performance of the trained models. Is there any reference to precision/recall? Can one get in touch with the people who trained the available models?

If one creates a new model (say for sentence detection, or for POS tagging with a different set of POS tags), can one upload it?

Best,
Svetoslav


Re: POSTagger Perceptron API

Posted by Jörn Kottmann <ko...@gmail.com>.
On 1/12/12 3:42 PM, Svetoslav Marinov wrote:
> Hi all,
>
> There is a Perceptron model for the Swedish POS tagger. How does one call it with the API? I checked the API pages as well as the documentation, but there is only a reference to the MaxEnt model:
>
> POSTaggerME tagger  = new POSTaggerME(model);
>
> So what is the method for using the Perceptron model?

The decision is made at training time: depending on the settings, either
maxent or perceptron is used to train a model. The resulting model can
be loaded with the code above, regardless of which algorithm produced it,
and OpenNLP takes care of setting everything up correctly behind the scenes.

We distribute a perceptron model for English.

For information about how to set the training algorithm please consult
our documentation:
http://incubator.apache.org/opennlp/documentation/1.5.2-incubating/manual/opennlp.html#tools.postagger.training
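
As a rough sketch of what such a setting can look like: a training
parameters file that selects the perceptron trainer. The key names
(Algorithm, Iterations, Cutoff) are the ones OpenNLP reads from a
parameters file; the values below are only illustrative and are not the
settings used for any distributed model. The file name and the exact
command line option for passing it to the POS tagger trainer (-params in
the versions I know of) are assumptions here, so please check them
against the manual for your release.

```
# perceptron-params.txt (hypothetical file name)
# Selects the perceptron trainer instead of the default maxent one.
Algorithm=PERCEPTRON
# Number of training iterations (illustrative value).
Iterations=100
# Minimum number of times a feature must occur to be kept (illustrative value).
Cutoff=5
```

A model trained this way is still loaded through POSTaggerME exactly as
in the snippet quoted above.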


> I am also curious about the performance of the trained models. Is there any reference to precision/recall? Can one get in touch with the people who trained the available models?
>
> If one creates a new model (say for sentence detection, or for POS tagging with a different set of POS tags), can one upload it?
>

We currently don't have a way to share models or to take care of their
distribution, mostly because of copyright/legal issues.
We think the right fix is to share open source training data instead.

Anyway, our documentation has some instructions on how to train the
POS tagger on various public corpora.
I suggest that you take a look there:
http://incubator.apache.org/opennlp/documentation/1.5.2-incubating/manual/opennlp.html#tools.corpora

Hope that helps,
Jörn