Posted to issues@opennlp.apache.org by "Jason Baldridge (JIRA)" <ji...@apache.org> on 2011/06/08 06:43:58 UTC
[jira] [Resolved] (OPENNLP-155) unreliable training set accuracy in perceptron

     [ https://issues.apache.org/jira/browse/OPENNLP-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Baldridge resolved OPENNLP-155.
-------------------------------------

    Resolution: Fixed

Oops, should have used this to post my notes. Anyway, issue is resolved. Here are the notes again. -Jason

- Changed the update to be the actual perceptron update: when a label
  that is not the gold label is chosen for an event, the parameters
  associated with that label are decremented, and the parameters
  associated with the gold label are incremented. I checked this
  empirically on several datasets, and it works better than the
  previous update (and it involves fewer updates).

- stepsize is decreased to stepsize/1.05 on every iteration, ensuring
  better stability toward the end of training. Without this decrease,
  the training set accuracy obtained during parameter update continued
  to be different from that computed when parameters aren't updated;
  this was actually the main cause of the discrepancy. Now, the
  parameters don't jump as much in later iterations, so things settle
  down and those two accuracies converge if enough iterations are
  allowed.

- Training set accuracy is computed once per iteration.

- Training stops if the current training set accuracy changes less
  than a given tolerance from the accuracies obtained in each of the
  previous three iterations.

- Averaging is done differently than before. Rather than doing an
  immediate update, parameters are simply accumulated after iterations
  (this makes the code much easier to understand/maintain). Also, not
  every iteration is used, as this tends to give too much weight to the
  final iterations, which don't actually differ that much from one
  another. I tried a few things and found a simple method that works
  well: sum the parameters from the first 20 iterations and then sum
  parameters from any further iterations that are perfect squares (25,
  36, 49, etc.). This gets a good (diverse) sample of parameters for
  averaging since the distance between subsequent parameter sets gets
  larger as the number of iterations gets bigger.

- Added prepositional phrase attachment dataset to
  src/test/resources/data/ppa. This is done with permission from
  Adwait Ratnaparkhi -- see the README for details.

- Created unit test to check perceptron training consistency, using
  the prepositional phrase attachment data. It would be good to do the
  same for maxent.

- Added ListEventStream to make a stream out of List<Event>.

- Added some helper methods, e.g. maxIndex, to simplify the code in
  the main algorithm.

- The training stats are no longer shown for every iteration. Now it
  is just the first 10 and then every 10th iteration after that.

- modelDistribution, params, evalParams and others are no longer class
  variables. They have been pushed into the findParameters
  method. Other variables could/should be made non-global too, but
  leaving as is for now.
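
The update, stepsize decrease, stopping rule and averaging schedule
described in the notes above can be sketched roughly as follows. This
is a minimal illustration, not the OpenNLP code: the class
PerceptronSketch, its method signatures, the representation of an
event as an array of active binary feature ids, the starting stepsize
of 1.0, and the reading of the stepsize note as a division by 1.05 per
iteration are all assumptions made for the example. Only the name
maxIndex comes from the notes, and its body here is a guess.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical names, not OpenNLP's actual code. */
final class PerceptronSketch {

  /** Index of the largest value; the first index wins on ties. */
  static int maxIndex(double[] values) {
    int best = 0;
    for (int i = 1; i < values.length; i++) {
      if (values[i] > values[best]) best = i;
    }
    return best;
  }

  /** Predicts a label by summing each label's weights over the active (binary) features. */
  static int predict(double[][] weights, int[] features) {
    double[] scores = new double[weights.length];
    for (int label = 0; label < weights.length; label++) {
      for (int f : features) scores[label] += weights[label][f];
    }
    return maxIndex(scores);
  }

  /** Fraction of events labeled correctly, computed without touching the weights. */
  static double accuracy(double[][] weights, int[][] events, int[] gold) {
    int correct = 0;
    for (int e = 0; e < events.length; e++) {
      if (predict(weights, events[e]) == gold[e]) correct++;
    }
    return events.length == 0 ? 0.0 : (double) correct / events.length;
  }

  /** Averaging schedule from the notes: iterations 1..20, then perfect squares only. */
  static boolean isAveragingIteration(int iteration) {
    if (iteration <= 20) return true;
    int root = (int) Math.round(Math.sqrt(iteration));
    return root * root == iteration;
  }

  /** True if the accuracy is within tolerance of each of the previous three accuracies. */
  static boolean hasConverged(double current, double[] previousThree, double tolerance) {
    for (double previous : previousThree) {
      if (Math.abs(current - previous) >= tolerance) return false;
    }
    return true;
  }

  /** Trains on events (arrays of active feature ids) and returns averaged weights. */
  static double[][] train(int[][] events, int[] gold, int numLabels, int numFeatures,
                          int maxIterations, double tolerance) {
    double[][] weights = new double[numLabels][numFeatures];
    double[][] summed = new double[numLabels][numFeatures];
    int numSummed = 0;
    double stepsize = 1.0; // assumed starting value
    double[] previousThree = new double[3];
    Arrays.fill(previousThree, Double.NEGATIVE_INFINITY); // never "close" to a real accuracy

    for (int iteration = 1; iteration <= maxIterations; iteration++) {
      for (int e = 0; e < events.length; e++) {
        int predicted = predict(weights, events[e]);
        if (predicted != gold[e]) {
          // Perceptron update: reward the gold label, penalize the wrongly chosen one.
          for (int f : events[e]) {
            weights[gold[e]][f] += stepsize;
            weights[predicted][f] -= stepsize;
          }
        }
      }
      stepsize /= 1.05; // assumed reading of the stepsize note

      if (isAveragingIteration(iteration)) {
        for (int label = 0; label < numLabels; label++) {
          for (int f = 0; f < numFeatures; f++) summed[label][f] += weights[label][f];
        }
        numSummed++;
      }

      double current = accuracy(weights, events, gold); // once per iteration
      if (hasConverged(current, previousThree, tolerance)) break;
      previousThree[(iteration - 1) % 3] = current; // ring buffer of the last three
    }

    for (double[] row : summed) {
      for (int f = 0; f < row.length; f++) row[f] /= Math.max(1, numSummed);
    }
    return summed;
  }

  public static void main(String[] args) {
    int[][] events = {{0}, {1}, {0, 2}, {1, 2}}; // toy data, made up for the example
    int[] gold = {0, 1, 0, 1};
    double[][] averaged = train(events, gold, 2, 3, 100, 1e-5);
    System.out.println("training accuracy: " + accuracy(averaged, events, gold));
  }
}
```

Summing one snapshot of the parameters per selected iteration, rather
than maintaining a running average inside the event loop, keeps the
training loop short, which matches the maintainability point made in
the averaging note.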

> unreliable training set accuracy in perceptron
> ----------------------------------------------
>
>                 Key: OPENNLP-155
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-155
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Maxent
>    Affects Versions: maxent-3.0.1-incubating
>            Reporter: Jason Baldridge
>            Assignee: Jason Baldridge
>            Priority: Minor
>             Fix For: maxent-3.0.1-incubating, tools-1.5.1-incubating
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> The training accuracies reported during perceptron training were much higher than final training accuracy, which turned out to be an artifact of the way training examples were ordered.
