Posted to dev@opennlp.apache.org by Jeffrey Zemerick <jz...@apache.org> on 2017/02/09 21:55:52 UTC

Hardcoded length in prefix and suffix feature generators

Hi,

I noticed that the length is hardcoded to 4 in the PrefixFeatureGenerator
and the SuffixFeatureGenerator. I made this value configurable in the XML
for each feature generator. I also added a length check to keep duplicate
prefixes or suffixes from being returned. (Previously, for the token "yes"
with a length of 4, two "yes" features would be returned.) If a value is
not provided in the XML, the default value of 4 is used.
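
To illustrate the length check, here is a minimal sketch of the idea. It is
not the code from the branch linked below: the class name, method names, and
constant are hypothetical, and it only assumes that the number of generated
prefixes or suffixes should be capped at the token length.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from OpenNLP.
public class PrefixSuffixSketch {

  // Fallback used when no length is configured (4, as described above).
  static final int DEFAULT_MAX_LENGTH = 4;

  // Returns prefixes of length 1..maxLength, capped at the token length,
  // so a short token never yields the same prefix twice.
  static List<String> prefixes(String token, int maxLength) {
    List<String> result = new ArrayList<>();
    int limit = Math.min(maxLength, token.length());
    for (int i = 1; i <= limit; i++) {
      result.add(token.substring(0, i));
    }
    return result;
  }

  // Same idea for suffixes, taken from the end of the token.
  static List<String> suffixes(String token, int maxLength) {
    List<String> result = new ArrayList<>();
    int limit = Math.min(maxLength, token.length());
    for (int i = 1; i <= limit; i++) {
      result.add(token.substring(token.length() - i));
    }
    return result;
  }
}
```

With a length of 4, prefixes("yes", 4) in this sketch gives "y", "ye", "yes",
each exactly once, and a caller with no configured value would pass
DEFAULT_MAX_LENGTH.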

You can preview the changes here:
https://github.com/apache/opennlp/compare/master...jzonthemtn:prefixsuffix?expand=1

If this is a change that's desired by the group I can make a JIRA and a
pull request.

Thanks,
Jeff

Re: Hardcoded length in prefix and suffix feature generators

Posted by William Colen <co...@apache.org>.
Looks good! Thanks for the unit tests.
Please open a Jira, squash your commits and open the PR.
