Posted to issues@opennlp.apache.org by "Jörn Kottmann (JIRA)" <ji...@apache.org> on 2011/07/06 11:11:17 UTC
[jira] [Commented] (OPENNLP-199) Refactor the PerceptronTrainer class to address a couple of problems
[ https://issues.apache.org/jira/browse/OPENNLP-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060419#comment-13060419 ]
Jörn Kottmann commented on OPENNLP-199:
---------------------------------------
Jason, I would like to get this issue closed soon.
I suggest the following changes:
- Make the step size decrement configurable and disable it by default; a user who wants this feature can enable it and provide a step size decrement.
- Make the special averaging configurable and also disabled by default.
It looks to me like these settings should be fine-tuned per data set rather than hard-coded. When fine-tuning, it is always good to start with the simplest configuration and then test changes against it.
Please let me know what you think.
> Refactor the PerceptronTrainer class to address a couple of problems
> --------------------------------------------------------------------
>
> Key: OPENNLP-199
> URL: https://issues.apache.org/jira/browse/OPENNLP-199
> Project: OpenNLP
> Issue Type: Improvement
> Components: Maxent
> Affects Versions: maxent-3.0.1-incubating
> Reporter: Jörn Kottmann
> Assignee: Jason Baldridge
> Fix For: tools-1.5.2-incubating, maxent-3.0.2-incubating
>
>
> - Changed the update to be the actual perceptron update: when a label
> that is not the gold label is chosen for an event, the parameters
> associated with that label are decremented, and the parameters
> associated with the gold label are incremented. I checked this
> empirically on several datasets, and it works better than the
> previous update (and it involves fewer updates).
> - stepsize is decreased by stepsize/1.05 on every iteration, ensuring
> better stability toward the end of training. This was actually the
> main reason that the training set accuracy obtained during parameter
> updates differed from the accuracy computed when parameters
> are not updated. Now the parameters don't jump as much in later
> iterations, so things settle down and the two accuracies converge
> if enough iterations are allowed.
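A small sketch of that decay schedule, assuming "decreased by stepsize/1.05" means the step size is divided by 1.05 each iteration (a reading of the description above; check PerceptronTrainer for the exact formula):

```java
// Sketch of a geometric step-size decay: each iteration the step size
// shrinks, so late-training updates are small and parameters stop jumping.
public class StepsizeDecaySketch {
    public static void main(String[] args) {
        double stepsize = 1.0;
        int iterations = 100;
        for (int i = 0; i < iterations; i++) {
            // ... one pass of perceptron updates using 'stepsize' ...
            stepsize /= 1.05; // smaller updates as training proceeds
        }
        // After 100 iterations the step size is 1/1.05^100, roughly 0.0076.
        System.out.println(stepsize);
    }
}
```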
> - Training set accuracy is computed once per iteration.
> - Training stops if the current training set accuracy changes less
> than a given tolerance from the accuracies obtained in each of the
> previous three iterations.
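The stopping rule could look like the following sketch. The method and variable names are illustrative, not the actual trainer's API; it only shows the "within tolerance of each of the last three accuracies" check.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the convergence check: stop when the current training-set
// accuracy differs by less than 'tolerance' from each of the previous
// three iterations' accuracies.
public class StoppingSketch {

    static boolean converged(Deque<Double> lastThree, double current, double tolerance) {
        if (lastThree.size() < 3) {
            return false; // not enough history yet
        }
        for (double prev : lastThree) {
            if (Math.abs(current - prev) >= tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Deque<Double> history = new ArrayDeque<>();
        double[] accuracies = {0.80, 0.9490, 0.9492, 0.9493, 0.9494};
        double tolerance = 0.001;
        for (double acc : accuracies) {
            if (converged(history, acc, tolerance)) {
                System.out.println("stop at accuracy " + acc);
                break;
            }
            history.addLast(acc);
            if (history.size() > 3) {
                history.removeFirst(); // keep only the last three accuracies
            }
        }
    }
}
```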
> - Averaging is done differently than before. Rather than doing an
> immediate update, parameters are simply accumulated after iterations
> (this makes the code much easier to understand/maintain). Also, not
> every iteration is used, as this tends to give too much weight to the
> final iterations, which don't actually differ that much from one
> another. I tried a few things and found a simple method that works
> well: sum the parameters from the first 20 iterations and then sum
> parameters from any further iterations that are perfect squares (25,
> 36, 49, etc). This gets a good (diverse) sample of parameters for
> averaging since the distance between subsequent parameter sets gets
> larger as the number of iterations gets bigger.
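The sampling schedule described above (first 20 iterations, then perfect squares only) can be sketched like this; the method name is illustrative:

```java
// Sketch of the averaging schedule: accumulate parameters from the first
// 20 iterations, then only from iterations whose number is a perfect
// square (25, 36, 49, ...), so the gap between sampled parameter sets
// grows as training proceeds.
public class AveragingScheduleSketch {

    static boolean useForAveraging(int iteration) {
        if (iteration <= 20) {
            return true;
        }
        int root = (int) Math.round(Math.sqrt(iteration));
        return root * root == iteration; // perfect square?
    }

    public static void main(String[] args) {
        StringBuilder sampled = new StringBuilder();
        for (int i = 21; i <= 50; i++) {
            if (useForAveraging(i)) {
                sampled.append(i).append(' ');
            }
        }
        System.out.println(sampled.toString().trim()); // prints "25 36 49"
    }
}
```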
> - Added prepositional phrase attachment dataset to
> src/test/resources/data/ppa. This is done with permission from
> Adwait Ratnaparkhi -- see the README for details.
> - Created unit test to check perceptron training consistency, using
> the prepositional phrase attachment data. It would be good to do the
> same for maxent.
> - Added ListEventStream to make a stream out of List<Event>
> - Added some helper methods, e.g. maxIndex, to simplify the code in
> the main algorithm.
> - Training stats are no longer shown for every iteration; now only
> the first 10 iterations and every 10th iteration after that are shown.
> - modelDistribution, params, evalParams and others are no longer class
> variables; they have been pushed into the findParameters
> method. Other variables could/should be made non-global too, but
> they are left as is for now.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira