Posted to users@opennlp.apache.org by Lee Hinman <ma...@gmail.com> on 2011/01/11 19:12:36 UTC

Question about detokenizer models

Opennlp fellows,

I can't seem to find where the dictionary files for
detokenization live. Can someone direct me to them? Are they just like
the regular model files?

- Lee Hinman

Re: Question about detokenizer models

Posted by Jörn Kottmann <ko...@gmail.com>.
Right now we only have a rule-based detokenizer, which needs
an XML dictionary file. Only a sample file exists
right now, which can be found here:
/opennlp-tools/src/test/resources/opennlp/tools/tokenize/latin-detokenizer.xml
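
To illustrate what "rule-based" means here, the following is a minimal, self-contained Java sketch of a dictionary-driven detokenizer. It does not use the OpenNLP API and does not read the XML file above: the class name, the operation names (MOVE_LEFT, MOVE_RIGHT, RIGHT_LEFT_MATCHING) and the sample rules are all assumptions made for illustration. The idea is that a dictionary maps certain tokens to an attachment rule, and every token without a rule is simply separated by spaces.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; NOT the OpenNLP detokenizer implementation.
class RuleBasedDetokenizerSketch {

    // Hypothetical operations, named for illustration.
    enum Operation {
        MOVE_LEFT,           // attach the token to the previous token, e.g. "." or ","
        MOVE_RIGHT,          // attach the token to the next token, e.g. "("
        RIGHT_LEFT_MATCHING  // alternate: opening occurrence attaches right, closing attaches left
    }

    private final Map<String, Operation> rules;

    RuleBasedDetokenizerSketch(Map<String, Operation> rules) {
        this.rules = rules;
    }

    // A small hand-written rule set standing in for a dictionary file (assumed content).
    static Map<String, Operation> sampleRules() {
        Map<String, Operation> rules = new HashMap<>();
        rules.put(".", Operation.MOVE_LEFT);
        rules.put(",", Operation.MOVE_LEFT);
        rules.put(")", Operation.MOVE_LEFT);
        rules.put("(", Operation.MOVE_RIGHT);
        rules.put("\"", Operation.RIGHT_LEFT_MATCHING);
        return rules;
    }

    String detokenize(String[] tokens) {
        StringBuilder out = new StringBuilder();
        // Remembers, per matching token, whether an opening occurrence is still unclosed.
        Map<String, Boolean> open = new HashMap<>();
        // True when the previous token must be glued to the current one.
        boolean glueNext = false;

        for (String token : tokens) {
            Operation op = rules.get(token);
            boolean attachLeft = false;
            boolean attachRight = false;

            if (op == Operation.MOVE_LEFT) {
                attachLeft = true;
            } else if (op == Operation.MOVE_RIGHT) {
                attachRight = true;
            } else if (op == Operation.RIGHT_LEFT_MATCHING) {
                boolean isOpen = open.getOrDefault(token, false);
                if (isOpen) {
                    attachLeft = true;   // closing occurrence
                } else {
                    attachRight = true;  // opening occurrence
                }
                open.put(token, !isOpen);
            }

            // Insert a space unless one side asks to be glued.
            if (out.length() > 0 && !glueNext && !attachLeft) {
                out.append(' ');
            }
            out.append(token);
            glueNext = attachRight;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        RuleBasedDetokenizerSketch d = new RuleBasedDetokenizerSketch(sampleRules());
        String[] tokens = {"He", "said", ",", "\"", "hello", "\"", "."};
        System.out.println(d.detokenize(tokens)); // prints: He said, "hello".
    }
}
```

A language-specific dictionary would then amount to a different rule table, which is why a separate file per language makes sense.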

If you create a file for a specific language, please
contribute it back.

Jörn

On 1/12/11 12:58 AM, James Kosin wrote:
> Lee,
>
> The format has changed and the dictionaries should be contained as part
> of the model itself now.
>
> James
>
> On 1/11/2011 1:12 PM, Lee Hinman wrote:
>> Opennlp fellows,
>>
>> I can't seem to find where the dictionary files for
>> detokenization live. Can someone direct me to them? Are they just like
>> the regular model files?
>>
>> - Lee Hinman


Re: Question about detokenizer models

Posted by James Kosin <ja...@gmail.com>.
Lee,

The format has changed and the dictionaries should be contained as part
of the model itself now.

James

On 1/11/2011 1:12 PM, Lee Hinman wrote:
> Opennlp fellows,
>
> I can't seem to find where the dictionary files for
> detokenization live. Can someone direct me to them? Are they just like
> the regular model files?
>
> - Lee Hinman