Posted to users@opennlp.apache.org by Rodrigo Agerri <ro...@ehu.es> on 2014/03/07 07:59:44 UTC
language specific headrules when training parser
Hi,
As I understand it, the chunking.Parser class casts the headRules
specifically to lang.en.HeadRules.
So when I train a parser from the CLI for another language and supply my own
set of head rules, it will still load the lang.en.HeadRules class, although
that class will read the file I pass with the -headRules parameter.
Is this correct?
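To make the question concrete, here is a minimal sketch of the pattern I mean.
All names in it (HeadRulesCastSketch, EnglishHeadRules, SpanishHeadRules,
trainWithCast) are hypothetical stand-ins and not the actual OpenNLP source;
the only thing assumed is a hard cast to one concrete class:

```java
import java.util.Arrays;
import java.util.List;

class HeadRulesCastSketch {

    // Hypothetical stand-in for a head rules interface.
    public interface HeadRules {
        String headOf(List<String> children);
    }

    // Stand-in for the English implementation: picks the rightmost child.
    public static class EnglishHeadRules implements HeadRules {
        public String headOf(List<String> children) {
            return children.get(children.size() - 1);
        }
    }

    // Stand-in for a custom implementation for another language: picks the leftmost child.
    public static class SpanishHeadRules implements HeadRules {
        public String headOf(List<String> children) {
            return children.get(0);
        }
    }

    // The method accepts the interface, but casts to the English class internally,
    // so any other implementation fails with a ClassCastException.
    public static String trainWithCast(HeadRules rules, List<String> children) {
        EnglishHeadRules english = (EnglishHeadRules) rules;
        return english.headOf(children);
    }

    // Reports whether a given implementation survives the cast.
    public static boolean acceptsWithCast(HeadRules rules) {
        try {
            trainWithCast(rules, Arrays.asList("det", "noun"));
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsWithCast(new EnglishHeadRules())); // prints true
        System.out.println(acceptsWithCast(new SpanishHeadRules())); // prints false
    }
}
```

In this sketch the custom rules only work if they are expressed through the
English class, which is the situation I am asking about.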
Thanks,
Rodrigo
Re: language specific headrules when training parser
Posted by Jörn Kottmann <ko...@gmail.com>.
On 03/10/2014 10:59 AM, Rodrigo Agerri wrote:
> I have created the issue
>
> https://issues.apache.org/jira/browse/OPENNLP-665
>
> I understand that instead of passing the lang.en.HeadRules directly in
> the chunking.Parser.train method we could pass an interface and link
> the HeadRules class to load according to the language parameter?
>
> If you give some pointers as to what do you think it should be done
> here I would not mind to give a hand.
Thanks for creating the issue. Exactly, one way to solve it would be to
define an interface (there is one already, maybe extend it) and pass a
HeadRules object to the train method.
Internally, the class name of the object could be stored and the object
re-created during model loading.
For that we probably need to work a bit on the serialization, but maybe
it is already in a good state.
With that mechanism in place we could define language-sensitive defaults
within the train method.
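A rough sketch of that idea follows. Every name in it (HeadRulesDesignSketch,
Model, train, the two rule classes, the DEFAULTS map) is hypothetical and only
illustrates the mechanism; it is not existing OpenNLP code, and real
serialization would have to store more than the class name:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class HeadRulesDesignSketch {

    // Hypothetical head rules interface; language() exists only for the demo.
    public interface HeadRules {
        String language();
    }

    public static class EnglishHeadRules implements HeadRules {
        public String language() { return "en"; }
    }

    public static class SpanishHeadRules implements HeadRules {
        public String language() { return "es"; }
    }

    // Hypothetical model that stores only the class name of the head rules.
    public static class Model {
        public final String headRulesClassName;

        public Model(String headRulesClassName) {
            this.headRulesClassName = headRulesClassName;
        }

        // Re-creates the head rules object from the stored class name at load time.
        public HeadRules loadHeadRules() {
            try {
                return (HeadRules) Class.forName(headRulesClassName)
                        .getDeclaredConstructor()
                        .newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot re-create " + headRulesClassName, e);
            }
        }
    }

    // Language-sensitive defaults; only English has one in this sketch.
    private static final Map<String, Supplier<HeadRules>> DEFAULTS = new HashMap<>();
    static {
        DEFAULTS.put("en", EnglishHeadRules::new);
    }

    // Uses the supplied rules, or falls back to the default for the language.
    public static Model train(String lang, HeadRules rules) {
        HeadRules effective = rules;
        if (effective == null) {
            Supplier<HeadRules> fallback = DEFAULTS.get(lang);
            if (fallback == null) {
                throw new IllegalArgumentException("No default head rules for language: " + lang);
            }
            effective = fallback.get();
        }
        return new Model(effective.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(train("en", null).loadHeadRules().language());                   // prints en
        System.out.println(train("es", new SpanishHeadRules()).loadHeadRules().language()); // prints es
    }
}
```

Storing the class name keeps the model independent of any one language, at the
cost of requiring a no-argument constructor on every implementation.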
I will have a look at the code and then comment on the issue about the
details.
Jörn
Re: language specific headrules when training parser
Posted by Rodrigo Agerri <ag...@gmail.com>.
Hi,
I have created the issue
https://issues.apache.org/jira/browse/OPENNLP-665
As I understand it, instead of passing lang.en.HeadRules directly to
the chunking.Parser.train method, we could pass an interface and choose
which HeadRules class to load according to the language parameter?
If you can give me some pointers on what you think should be done
here, I would be happy to lend a hand.
Thanks,
Rodrigo
On Fri, Mar 7, 2014 at 11:05 AM, Jörn Kottmann <ko...@gmail.com> wrote:
> On 03/07/2014 07:59 AM, Rodrigo Agerri wrote:
>>
>> Hi,
>>
>> I understand that, as in the chunking.Parser class the headRules are
>> language-specifically casted to be the lang.en.HeadRules:
>>
>> When I train a parser in the CLI for another language adding my own set of
>> head rules, it will still load the lang.en.HeadRules class, although that
>> class will read the -headRules file I pass in the CLI as parameter?
>>
>> Is this correct?
>
>
> Yes, as far as I know that is correct.
>
> We only have a default implementation for English currently. Anyway I
> believe we should make it extensible so that people have a chance to use a
> different implementation.
>
> Any help with this would be very welcome.
>
> Would be nice if you could open a jira issue, and we will fix it for 1.6.0.
>
> Thanks,
> Jörn
Re: language specific headrules when training parser
Posted by Jörn Kottmann <ko...@gmail.com>.
On 03/07/2014 07:59 AM, Rodrigo Agerri wrote:
> Hi,
>
> I understand that, as in the chunking.Parser class the headRules are
> language-specifically casted to be the lang.en.HeadRules:
>
> When I train a parser in the CLI for another language adding my own set of
> head rules, it will still load the lang.en.HeadRules class, although that class
> will read the -headRules file I pass in the CLI as parameter?
>
> Is this correct?
Yes, as far as I know that is correct.
We only have a default implementation for English currently. Anyway, I
believe we should make it extensible so that people have a chance to use a
different implementation.
Any help with this would be very welcome.
It would be nice if you could open a Jira issue, and we will fix it for 1.6.0.
Thanks,
Jörn