Posted to users@opennlp.apache.org by "Fotiadis, Konstantinos" <ko...@lmco.com> on 2012/10/09 17:08:43 UTC

RE: EXTERNAL: Re: OpenNLP Maxent Data Format

unsubscribe

Konstantinos Fotiadis
Software Engineer Senior
Innovation Technology Group
Lockheed Martin IS&GS
O: 610.354.7759 | M: 610.331.0013
-----Original Message-----
From: Jörn Kottmann [mailto:kottmann@gmail.com] 
Sent: Thursday, September 06, 2012 4:19 AM
To: users@opennlp.apache.org
Subject: EXTERNAL: Re: OpenNLP Maxent Data Format

Hello,

SharpNLP is a C# clone of OpenNLP. I have never worked with it and don't know much about it, sorry.

If you need to work with .NET and want to use OpenNLP, you can try this:
https://cwiki.apache.org/OPENNLP/a-quick-guide-to-using-opennlp-from-net.html

It should not be a problem to pass a non-existent feature to a model. In OpenNLP this is done all the time, e.g. when there is a word which was not seen in the training data before.
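
As a rough illustration of why an unseen feature need not be a problem, here is a minimal sketch of a maxent-style evaluation that simply skips predicates it has no parameters for. This is not OpenNLP's or SharpEntropy's actual code: the function name `evaluate`, the weights, and the outcome labels "A" and "B" are all made up for the example, and only the feature string "oWord=someNewWord" comes from this thread.

```python
import math

def evaluate(weights, outcomes, context):
    """Return outcome probabilities for a context, ignoring unseen predicates.

    weights  -- hypothetical trained parameters: {predicate: {outcome: weight}}
    outcomes -- list of all outcome labels
    context  -- list of predicate strings, e.g. "pos=NN"
    """
    scores = {outcome: 0.0 for outcome in outcomes}
    for predicate in context:
        params = weights.get(predicate)
        if params is None:
            # Predicate was never seen in training: it contributes nothing,
            # so the prediction rests on the remaining context.
            continue
        for outcome, weight in params.items():
            scores[outcome] += weight

    # Normalise the summed weights into probabilities (softmax).
    top = max(scores.values())
    exps = {outcome: math.exp(score - top) for outcome, score in scores.items()}
    total = sum(exps.values())
    return {outcome: value / total for outcome, value in exps.items()}

# Made-up parameters, for illustration only.
weights = {
    "pos=NN": {"A": 1.0, "B": -1.0},
    "oWord=house": {"A": 0.5},
}
outcomes = ["A", "B"]

known = evaluate(weights, outcomes, ["pos=NN"])
with_unseen = evaluate(weights, outcomes, ["pos=NN", "oWord=someNewWord"])
```

In this sketch `with_unseen` equals `known`, because the unknown word adds no evidence either way. If a library throws on an unknown key instead, filtering the context against the set of predicates known to the model before evaluating would give the same effect.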

HTH,
Jörn


On 09/04/2012 06:19 PM, David Young wrote:
> Hi, thanks for the reply. I am not very familiar with Java, so I thought
> I'd produce a model first with SharpEntropy.
> I have not really modified the simple example, so it is only several
> lines of basic code.
>
> This is how it works:
> http://pastebin.com/LK9tNsrj
>
> The training data is as follows:
> http://pastebin.com/3icni8Jc
>
> This example works fine but the problem is when I try to use any words 
> that are not in the training data.
> For example
>              context.Add("oWord=someNewWord")...
>
> This gives an unknown key error because it is not recognised. But I
> want to make predictions using what is known: the surrounding context.
>
> Since this is a maximum entropy model, I have lots of words in the
> training data that should be taken into account when available, in
> addition to each word's POS. But in the real data I want to evaluate, I
> have the POS for each word and some words that are in the training
> data, while the context may also contain words that are not in the
> training data. How do I still get a prediction in this case using the
> rest of the context?
>
> Thanks for your time.
>
> On Tue, Sep 4, 2012 at 10:50 AM, Jörn Kottmann <ko...@gmail.com> wrote:
>
>> On 09/03/2012 01:45 AM, David Young wrote:
>>
>>> But my question is: what happens when I want to use something like
>>> "next=WordNotInModel", a word that does not exist in the training
>>> data, and still want to get a prediction using the rest of the
>>> surrounding context?
>>> Even if I use "next=Unknown" or "next=null" or Null, I get an error
>>> "predicateLabel KeyNotFoundException was unhandled", because
>>> "next=WordNotInModel" is not a known key.
>>>
>> Usually maxent is used as an API. Can you post some code here so we
>> can see what you are doing? Or do you use one of the command line
>> utils?
>>
>> Thanks,
>> Jörn
>>