Posted to users@opennlp.apache.org by Claudia Bobach <cl...@ontochem.com> on 2011/07/06 09:10:06 UTC

OpenNLP Parser - POS tagger - tagdict

Hi,
I am using the OpenNLP Parser and I would like to optimize performance
by training the parser.
I have read about the training and would like to know whether the POS tagger
that is part of the deep parser contains and uses a tag dictionary, as
the separate OpenNLP POS tagger does. I could not find anything like a
tag dictionary in the parser. Is it possible to add a tag dictionary and a
training model to the POS tagger built into the deep parser?

Thank you very much for your help,
Claudia


Re: OpenNLP Parser - POS tagger - tagdict

Posted by Jason Baldridge <ja...@gmail.com>.
No, I mean letting the parser tag words itself. The per-tag accuracy might
be lower, but parsers do better using their predicted tags. Mike Collins did
some experiments with this back in the late 1990s, and I think Dan Bikel
did too in his reimplementation of Collins' parser. But, come to think of
it, this is not true for Ratnaparkhi's parser (which is what the OpenNLP
parser is based on) since it is discriminative, not generative. Anyway, the
point is that this isn't always an obvious thing.

On Wed, Jul 6, 2011 at 8:07 AM, Jörn Kottmann <ko...@gmail.com> wrote:

> On 7/6/11 2:57 PM, Jason Baldridge wrote:
>
>> Regardless of more data, it actually is typically better to let a parser
>> tag
>> words by itself rather than to use a separate tagger.
>>
>
> So by "by itself" you mean the POS tagger trained on the parser training data?
>
> Jörn
>



-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge

Re: OpenNLP Parser - POS tagger - tagdict

Posted by Jörn Kottmann <ko...@gmail.com>.
On 7/6/11 2:57 PM, Jason Baldridge wrote:
> Regardless of more data, it actually is typically better to let a parser tag
> words by itself rather than to use a separate tagger.

So by "by itself" you mean the POS tagger trained on the parser training data?

Jörn

Re: OpenNLP Parser - POS tagger - tagdict

Posted by Jason Baldridge <ja...@gmail.com>.
Regardless of more data, it actually is typically better to let a parser tag
words by itself rather than to use a separate tagger.

On Wed, Jul 6, 2011 at 2:49 AM, Jörn Kottmann <ko...@gmail.com> wrote:

> On 7/6/11 9:10 AM, Claudia Bobach wrote:
>
>> I have read about the training and would like to know whether the POS tagger
>> that is part of the deep parser contains and uses a tag dictionary, as the
>> separate OpenNLP POS tagger does.
>>
>
> The parser model just includes a standard POS tagger model,
> and that model can contain a tag dictionary. Sadly, you cannot
> provide one while training the parser.
>
> +1 to fixing that. Do you want to provide a patch?
>
> We usually replace the POS model inside the parser model on the website,
> because we have more training data for the POS tagger than for the parser,
> and this way it is possible to create a better POS model with a tag dictionary.
>
> Jörn
>



Re: OpenNLP Parser - POS tagger - tagdict

Posted by Jörn Kottmann <ko...@gmail.com>.
On 7/6/11 9:10 AM, Claudia Bobach wrote:
> I have read about the training and would like to know whether the POS
> tagger that is part of the deep parser contains and uses a
> tag dictionary, as the separate OpenNLP POS tagger does.

The parser model just includes a standard POS tagger model,
and that model can contain a tag dictionary. Sadly, you cannot
provide one while training the parser.
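
For readers unfamiliar with the term: a tag dictionary lists, per word, the
tags that word may take, and for a known word the tagger only chooses among
those tags. Below is a minimal sketch of that idea in plain Java. It does not
use the OpenNLP API; the class name, the dictionary entries, and the scores
are all made up for illustration.

```java
import java.util.List;
import java.util.Map;

public class TagDictionarySketch {

    // Hypothetical tag dictionary: word -> tags that word is allowed to take.
    static final Map<String, List<String>> TAG_DICT = Map.of(
        "the", List.of("DT"),
        "run", List.of("NN", "VB"));

    /**
     * Returns the highest-scoring tag for a word. If the word is in the
     * dictionary, only its listed tags are considered; otherwise every
     * scored tag competes. Returns null if no candidate tag has a score.
     */
    public static String bestTag(String word, Map<String, Double> scores) {
        List<String> allowed = TAG_DICT.get(word);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            // Skip tags the dictionary does not permit for this word.
            if (allowed != null && !allowed.contains(e.getKey())) {
                continue;
            }
            if (e.getValue() > bestScore) {
                best = e.getKey();
                bestScore = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up model scores for a single token.
        Map<String, Double> scores = Map.of("NN", 0.3, "VB", 0.2, "JJ", 0.5);
        System.out.println(bestTag("run", scores));  // NN: JJ is excluded by the dictionary
        System.out.println(bestTag("blue", scores)); // JJ: unknown word, all tags compete
    }
}
```

The point of the restriction is that a high-scoring but impossible tag (JJ for
"run" in this toy data) can never be chosen for a word the dictionary knows.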

+1 to fixing that. Do you want to provide a patch?

We usually replace the POS model inside the parser model on the website,
because we have more training data for the POS tagger than for the parser,
and this way it is possible to create a better POS model with a tag dictionary.

Jörn