Posted to users@opennlp.apache.org by Michael Schmitz <sc...@cs.washington.edu> on 2012/05/03 00:09:55 UTC

OpenNLP Maxent with float values

Hi, I'm using OpenNLP maxent to train a model.  Features are either
boolean or float.  As I understand it, OpenNLP maxent wants to have
a context in the form:

feature1=true feature2=3.5 feature3=false

For example, this JIRA issue shows a context string that spells out
the float values alongside a float array that carries the same
values.

https://issues.apache.org/jira/browse/OPENNLP-170

Constructing such a string is slow.  I especially want to avoid
constructing such a string when I am evaluating an instance against
the model.  Is there any way I can just use an array of float values
for the evaluation?  This is easy to create, and fast.

1.0, 3.5, 0.0

Peace.  Michael
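
A minimal sketch of the float-array encoding asked about above, assuming
booleans map to 1.0 / 0.0 and the feature order is fixed. The class and
method names are hypothetical and are not part of OpenNLP.

```java
import java.util.Arrays;

final class FeatureEncoding {
    // Hypothetical helper: a boolean feature becomes 1.0f or 0.0f so that
    // every feature fits in one float array.
    static float encode(boolean b) {
        return b ? 1.0f : 0.0f;
    }

    // Encodes feature1=true feature2=3.5 feature3=false style input
    // without building any strings.
    static float[] encode(boolean feature1, float feature2, boolean feature3) {
        return new float[] { encode(feature1), feature2, encode(feature3) };
    }

    public static void main(String[] args) {
        // prints [1.0, 3.5, 0.0]
        System.out.println(Arrays.toString(encode(true, 3.5f, false)));
    }
}
```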

Re: OpenNLP Maxent with float values

Posted by Jason Baldridge <ja...@gmail.com>.
Funny you ask: my semester just ended, and I'm planning to get back to it
quite soon! (After quite a long hiatus...)

On Mon, May 7, 2012 at 4:25 PM, Michael Schmitz <sc...@cs.washington.edu> wrote:

> Yeah, I was caching my strings when I had Boolean feature sets.  This
> is infeasible with floats though, and it's a bit of a performance
> problem.  I ended up writing my own logistic regression execution
> code.
>
> The current API isn't going anywhere?
>
> Peace.  Michael
>
>
> On Mon, May 7, 2012 at 2:16 PM, Jörn Kottmann <ko...@gmail.com> wrote:
> > On 05/07/2012 11:12 PM, Michael Schmitz wrote:
> >>
> >> I thought the API used strings...
> >>
> >>
> >> http://opennlp.apache.org/documentation/1.5.2-incubating/apidocs/opennlp-maxent/opennlp/model/Event.html
> >>
> >> Specifically, I'm referring to String[] context which seems required
> >> when I looked at the underlying source (although values[] is also
> >> used).
> >>
> >
> > You need to pass in both; there is no way to get around that with
> > our current API. Anyway, if you know your features in advance you can
> > cache the string objects and only pass in references instead of
> > constructing new string objects on every call.
> >
> > Jörn
>



-- 
Jason Baldridge
Associate Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge

Re: OpenNLP Maxent with float values

Posted by Michael Schmitz <sc...@cs.washington.edu>.
Yeah, I was caching my strings when I had Boolean feature sets.  This
is infeasible with floats though, and it's a bit of a performance
problem.  I ended up writing my own logistic regression execution
code.

The current API isn't going anywhere?

Peace.  Michael
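
The message above does not include the custom code, so the following is
only an illustrative sketch of what evaluating a two-class logistic
regression model directly on a float array can look like. The class name,
the weights, and the bias are hypothetical, and this is not the code
actually written for this project.

```java
final class LogisticScorer {
    private final float[] weights; // one weight per feature, in a fixed order
    private final float bias;

    LogisticScorer(float[] weights, float bias) {
        this.weights = weights.clone();
        this.bias = bias;
    }

    // Returns sigmoid(bias + w . x), the probability of the positive class.
    double probability(float[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException("expected " + weights.length + " features");
        }
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += (double) weights[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static void main(String[] args) {
        // Hypothetical weights, for illustration only.
        LogisticScorer scorer = new LogisticScorer(new float[] { 1.0f, 2.0f, -1.0f }, 0.0f);
        System.out.println(scorer.probability(new float[] { 1.0f, 3.5f, 0.0f }));
    }
}
```

No strings are built per instance here: the only per-call work is one
pass over the float array.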


On Mon, May 7, 2012 at 2:16 PM, Jörn Kottmann <ko...@gmail.com> wrote:
> On 05/07/2012 11:12 PM, Michael Schmitz wrote:
>>
>> I thought the API used strings...
>>
>>
>> http://opennlp.apache.org/documentation/1.5.2-incubating/apidocs/opennlp-maxent/opennlp/model/Event.html
>>
>> Specifically, I'm referring to String[] context which seems required
>> when I looked at the underlying source (although values[] is also
>> used).
>>
>
> You need to pass in both; there is no way to get around that with
> our current API. Anyway, if you know your features in advance you can
> cache the string objects and only pass in references instead of
> constructing new string objects on every call.
>
> Jörn

Re: OpenNLP Maxent with float values

Posted by Jörn Kottmann <ko...@gmail.com>.
On 05/07/2012 11:12 PM, Michael Schmitz wrote:
> I thought the API used strings...
>
> http://opennlp.apache.org/documentation/1.5.2-incubating/apidocs/opennlp-maxent/opennlp/model/Event.html
>
> Specifically, I'm referring to String[] context which seems required
> when I looked at the underlying source (although values[] is also
> used).
>

You need to pass in both; there is no way to get around that with
our current API. Anyway, if you know your features in advance you can
cache the string objects and only pass in references instead of
constructing new string objects on every call.

Jörn
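
A rough sketch of the caching suggested above, assuming the strings
carry only feature names and the numbers travel in a separate values
array. The Scorer interface is a hypothetical stand-in for a model that
takes both arrays; it is not OpenNLP's actual API, and the feature names
are made up.

```java
final class CachedContext {
    // Hypothetical stand-in for a model that needs names and values.
    interface Scorer {
        double[] eval(String[] context, float[] values);
    }

    // Built once; every call passes a reference to these same String objects.
    private static final String[] FEATURE_NAMES = { "feature1", "feature2", "feature3" };

    // Per call, only the float array is new; no strings are constructed.
    static double[] score(Scorer model, float[] values) {
        if (values.length != FEATURE_NAMES.length) {
            throw new IllegalArgumentException("expected " + FEATURE_NAMES.length + " values");
        }
        return model.eval(FEATURE_NAMES, values);
    }
}
```

This only helps when the feature set is known in advance. If the value
were baked into each string (for example "feature2=3.5"), every distinct
float would need its own string and the cache would not pay off.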

Re: OpenNLP Maxent with float values

Posted by Michael Schmitz <sc...@cs.washington.edu>.
I thought the API used strings...

http://opennlp.apache.org/documentation/1.5.2-incubating/apidocs/opennlp-maxent/opennlp/model/Event.html

Specifically, I'm referring to String[] context which seems required
when I looked at the underlying source (although values[] is also
used).

Peace.  Michael


On Mon, May 7, 2012 at 1:44 PM, Jason Baldridge
<ja...@gmail.com> wrote:
> You could use the API directly instead of writing to strings.
>
> On Wed, May 2, 2012 at 5:09 PM, Michael Schmitz <sc...@cs.washington.edu> wrote:
>
>> Hi, I'm using OpenNLP maxent to train a model.  Features are either
>> boolean or float.  As I understand it, OpenNLP maxent wants to have
>> a context in the form:
>>
>> feature1=true feature2=3.5 feature3=false
>>
>> For example, this JIRA has an example of the context being a string
>> that specifies the float values, and the float array also specifying
>> the float values.
>>
>> https://issues.apache.org/jira/browse/OPENNLP-170
>>
>> Constructing such a string is slow.  I especially want to avoid
>> constructing such a string when I am evaluating an instance against
>> the model.  Is there any way I can just use an array of float values
>> for the evaluation?  This is easy to create, and fast.
>>
>> 1.0, 3.5, 0.0
>>
>> Peace.  Michael
>>

Re: OpenNLP Maxent with float values

Posted by Jason Baldridge <ja...@gmail.com>.
You could use the API directly instead of writing to strings.

On Wed, May 2, 2012 at 5:09 PM, Michael Schmitz <sc...@cs.washington.edu> wrote:

> Hi, I'm using OpenNLP maxent to train a model.  Features are either
> boolean or float.  As I understand it, OpenNLP maxent wants to have
> a context in the form:
>
> feature1=true feature2=3.5 feature3=false
>
> For example, this JIRA has an example of the context being a string
> that specifies the float values, and the float array also specifying
> the float values.
>
> https://issues.apache.org/jira/browse/OPENNLP-170
>
> Constructing such a string is slow.  I especially want to avoid
> constructing such a string when I am evaluating an instance against
> the model.  Is there any way I can just use an array of float values
> for the evaluation?  This is easy to create, and fast.
>
> 1.0, 3.5, 0.0
>
> Peace.  Michael
>



-- 
Jason Baldridge
Associate Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge