Posted to users@opennlp.apache.org by Rodrigo Agerri <ro...@ehu.es> on 2014/03/07 07:59:44 UTC

language specific headrules when training parser

Hi, 

I understand that in the chunking.Parser class the headRules are
cast specifically to the English lang.en.HeadRules:

When I train a parser from the CLI for another language and supply my own set
of head rules, will it still load the lang.en.HeadRules class, with that class
reading the -headRules file I pass as a CLI parameter?

Is this correct? 

Thanks, 

Rodrigo


Re: language specific headrules when training parser

Posted by Jörn Kottmann <ko...@gmail.com>.
On 03/10/2014 10:59 AM, Rodrigo Agerri wrote:
> I have created the issue
>
> https://issues.apache.org/jira/browse/OPENNLP-665
>
> I understand that instead of passing the lang.en.HeadRules directly in
> the chunking.Parser.train method we could pass an interface and link
> the HeadRules class to load according to the language parameter?
>
> If you give me some pointers as to what you think should be done
> here, I would be happy to lend a hand.

Thanks for creating the issue. Exactly, one way to solve it would be to
define an interface (there is one already, which we could perhaps extend)
and pass a HeadRules object to the train method.
Internally, the class name of the object could be stored and the object
re-created during model loading.
For that we probably need to work a bit on the serialization, but maybe
it is already in a good state.
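
A minimal sketch of that idea, not the actual OpenNLP code: the
HeadRules interface, both rule classes and SketchParserModel below are
hypothetical stand-ins invented for illustration. It only shows storing
the class name at training time and re-creating the object by reflection
at load time; real head rules also carry rule data read from a file,
which would need to be serialized as well.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a language-independent head rules interface.
interface HeadRules {
    // Returns the index of the head child among the given child labels.
    int headIndex(String parentLabel, String[] childLabels);
}

// Placeholder rules: always pick the first child.
class LeftmostHeadRules implements HeadRules {
    public int headIndex(String parentLabel, String[] childLabels) {
        return 0;
    }
}

// Placeholder rules: always pick the last child.
class RightmostHeadRules implements HeadRules {
    public int headIndex(String parentLabel, String[] childLabels) {
        return childLabels.length - 1;
    }
}

// Hypothetical model that records only the class name of the head rules.
class SketchParserModel {
    static final String HEAD_RULES_KEY = "headRulesClass";

    private final Map<String, String> manifest = new HashMap<>();

    // "Training" here just stores the language and the head rules class name.
    static SketchParserModel train(String languageCode, HeadRules rules) {
        SketchParserModel model = new SketchParserModel();
        model.manifest.put("language", languageCode);
        model.manifest.put(HEAD_RULES_KEY, rules.getClass().getName());
        return model;
    }

    // Re-creates the head rules from the stored class name.
    // Assumes the class has an accessible no-argument constructor.
    HeadRules loadHeadRules() {
        String className = manifest.get(HEAD_RULES_KEY);
        try {
            return Class.forName(className)
                    .asSubclass(HeadRules.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Cannot re-create head rules class " + className, e);
        }
    }
}

class HeadRulesDemo {
    public static void main(String[] args) {
        SketchParserModel model =
                SketchParserModel.train("es", new RightmostHeadRules());
        HeadRules rules = model.loadHeadRules();
        System.out.println(rules.getClass().getName());
    }
}
```

With this shape the model never refers to a concrete rules class, so a
cast to one fixed implementation is no longer needed when loading.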

With that mechanism in place we could define defaults which are 
sensitive to the language within
the train method.
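
As a rough illustration of such language-sensitive defaults (again with
hypothetical names, nothing here is existing OpenNLP API): the train
method could call a small resolver that prefers an explicitly passed
HeadRules object and otherwise falls back to a per-language default,
failing clearly when no default is registered for the language.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a language-independent head rules interface.
interface HeadRules {
    int headIndex(String parentLabel, String[] childLabels);
}

// Placeholder for the English default; the logic is illustrative only.
class EnglishHeadRules implements HeadRules {
    public int headIndex(String parentLabel, String[] childLabels) {
        return childLabels.length - 1;
    }
}

// Hypothetical registry of default head rules, keyed by language code.
class DefaultHeadRules {
    private static final Map<String, Supplier<HeadRules>> DEFAULTS =
            Map.of("en", EnglishHeadRules::new);

    // Returns the default rules for a language, or fails if none exist.
    static HeadRules forLanguage(String languageCode) {
        Supplier<HeadRules> supplier = DEFAULTS.get(languageCode);
        if (supplier == null) {
            throw new IllegalArgumentException(
                    "No default head rules for language '" + languageCode
                    + "'; pass a HeadRules object explicitly");
        }
        return supplier.get();
    }

    // Explicitly passed rules win; otherwise use the language default.
    static HeadRules resolve(String languageCode, HeadRules explicitRules) {
        return explicitRules != null ? explicitRules : forLanguage(languageCode);
    }
}
```

Only English is registered in the sketch, matching the current situation
where English is the only language with a default implementation.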

I will have a look at the code and then comment on the issue about the 
details.

Jörn

Re: language specific headrules when training parser

Posted by Rodrigo Agerri <ag...@gmail.com>.
Hi,

I have created the issue

https://issues.apache.org/jira/browse/OPENNLP-665

I understand that instead of passing the lang.en.HeadRules directly in
the chunking.Parser.train method we could pass an interface and link
the HeadRules class to load according to the language parameter?

If you give me some pointers as to what you think should be done
here, I would be happy to lend a hand.

Thanks,

Rodrigo

On Fri, Mar 7, 2014 at 11:05 AM, Jörn Kottmann <ko...@gmail.com> wrote:
> On 03/07/2014 07:59 AM, Rodrigo Agerri wrote:
>>
>> Hi,
>>
>> I understand that in the chunking.Parser class the headRules are
>> cast specifically to the English lang.en.HeadRules:
>>
>> When I train a parser from the CLI for another language and supply my own
>> set of head rules, will it still load the lang.en.HeadRules class, with
>> that class reading the -headRules file I pass as a CLI parameter?
>>
>> Is this correct?
>
>
> Yes, as far as I know that is correct.
>
> We currently only have a default implementation for English. Anyway, I
> believe we should make it extensible so that people have a chance to use
> a different implementation.
>
> Any help with this would be very welcome.
>
> It would be nice if you could open a Jira issue, and we will fix it for 1.6.0.
>
> Thanks,
> Jörn

Re: language specific headrules when training parser

Posted by Jörn Kottmann <ko...@gmail.com>.
On 03/07/2014 07:59 AM, Rodrigo Agerri wrote:
> Hi,
>
> I understand that in the chunking.Parser class the headRules are
> cast specifically to the English lang.en.HeadRules:
>
> When I train a parser from the CLI for another language and supply my own
> set of head rules, will it still load the lang.en.HeadRules class, with
> that class reading the -headRules file I pass as a CLI parameter?
>
> Is this correct?

Yes, as far as I know that is correct.

We currently only have a default implementation for English. Anyway, I
believe we should make it extensible so that people have a chance to use
a different implementation.

Any help with this would be very welcome.

It would be nice if you could open a Jira issue, and we will fix it for 1.6.0.

Thanks,
Jörn