Posted to issues@opennlp.apache.org by "Vinh Khuc (JIRA)" <ji...@apache.org> on 2014/06/16 06:08:01 UTC

[jira] [Commented] (OPENNLP-700) Remove the experimental flag from L-BFGS trainer

    [ https://issues.apache.org/jira/browse/OPENNLP-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032107#comment-14032107 ] 

Vinh Khuc commented on OPENNLP-700:
-----------------------------------

I tested the L-BFGS learner on the CONLL 2000, CONLL 2002 and CONLL 2003 corpora with a cutoff of 5 and 100 iterations. I tried several values for L1Cost and L2Cost and found that L1Cost = 0.1, L2Cost = 0.1 gives relatively good accuracy. See the attached file "OPENNLP-700-LBFGS-performance.txt" for details (only the results on the combined corpora are included).

Compared with the results at https://cwiki.apache.org/confluence/display/OPENNLP/TestPlan1.6.0, L-BFGS outperforms GIS on most corpora (GIS beats L-BFGS only on CONLL 2002 Spanish Combined esp.testb).

The same holds for the CONLL 2002 and CONLL 2003 corpora broken down by entity type (per, loc, org, misc); those results are not included here.

The two parameters L1Cost and L2Cost need to be tuned for each corpus for L-BFGS to give its best accuracy. A common method for finding appropriate values is grid search with cross-validation. This feature may be added to the L-BFGS learner in the future.
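
To make the tuning idea concrete, here is a minimal sketch of grid search with k-fold cross-validation over L1Cost and L2Cost. It is not part of OpenNLP and does not use its API: the class GridSearchSketch, the FoldEvaluator callback and the Result holder are hypothetical names introduced only for illustration. The callback is assumed to train a model on the training fold with the given costs and return a score (for example accuracy or F1) measured on the held-out fold.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: exhaustive grid search with k-fold cross-validation. */
final class GridSearchSketch {

    /** Hypothetical callback: train with the given costs, return a score on the held-out fold. */
    interface FoldEvaluator<T> {
        double trainAndScore(List<T> train, List<T> heldOut, double l1Cost, double l2Cost);
    }

    /** Best parameter pair found and its mean score across folds. */
    static final class Result {
        final double l1Cost;
        final double l2Cost;
        final double meanScore;

        Result(double l1Cost, double l2Cost, double meanScore) {
            this.l1Cost = l1Cost;
            this.l2Cost = l2Cost;
            this.meanScore = meanScore;
        }
    }

    static <T> Result search(List<T> samples, int folds,
                             double[] l1Grid, double[] l2Grid,
                             FoldEvaluator<T> evaluator) {
        if (folds < 2 || folds > samples.size()) {
            throw new IllegalArgumentException("folds must be in [2, samples.size()]");
        }
        if (l1Grid.length == 0 || l2Grid.length == 0) {
            throw new IllegalArgumentException("both grids must be non-empty");
        }
        Result best = null;
        for (double l1 : l1Grid) {
            for (double l2 : l2Grid) {
                double total = 0.0;
                for (int fold = 0; fold < folds; fold++) {
                    // Sample i belongs to the held-out set of fold (i mod folds).
                    List<T> train = new ArrayList<>();
                    List<T> heldOut = new ArrayList<>();
                    for (int i = 0; i < samples.size(); i++) {
                        if (i % folds == fold) {
                            heldOut.add(samples.get(i));
                        } else {
                            train.add(samples.get(i));
                        }
                    }
                    total += evaluator.trainAndScore(train, heldOut, l1, l2);
                }
                double mean = total / folds;
                // Keep the first pair that reaches the highest mean score.
                if (best == null || mean > best.meanScore) {
                    best = new Result(l1, l2, mean);
                }
            }
        }
        return best;
    }
}
```

With candidate values such as {0.0, 0.1, 1.0} for each cost, the search trains folds x 9 models per corpus, so the grid should stay small when each training run takes 100 iterations. The modulo split assumes the samples are not ordered in a way that biases the folds; shuffling first would be a reasonable precaution.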




> Remove the experimental flag from L-BFGS trainer
> ------------------------------------------------
>
>                 Key: OPENNLP-700
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-700
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Machine Learning
>    Affects Versions: tools-1.5.3, maxent-3.0.3
>            Reporter: Vinh Khuc
>            Assignee: Vinh Khuc
>            Priority: Minor
>             Fix For: 1.6.0
>
>
> The current L-BFGS trainer is marked with the experimental flag, i.e. the current algorithm name is MAXENT_QN_EXPERIMENTAL. The work on this issue should make sure that the L-BFGS trainer's performance is stable on at least some common NLP corpora so that the experimental flag can be safely removed.


