Posted to issues@opennlp.apache.org by "Vinh Khuc (JIRA)" <ji...@apache.org> on 2014/04/05 07:23:14 UTC

[jira] [Updated] (OPENNLP-671) Add L1-regularization into L-BFGS

     [ https://issues.apache.org/jira/browse/OPENNLP-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinh Khuc updated OPENNLP-671:
------------------------------

    Description: 
L1-regularization is useful when training Maximum Entropy models because it pushes the parameters of irrelevant features to zero. Hence, the parameter vector will be sparse and the trained model will be compact.

When the number of features is much larger than the number of training examples, L1 often gives better accuracy than L2.

The implementation of L1-regularization for L-BFGS will follow the method described in the paper:

http://research.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf
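
For illustration, a minimal sketch (in Java, but not OpenNLP code; the class name, method name, and parameter names below are made up for this example) of the "pseudo-gradient" that the paper's orthant-wise method (OWL-QN) uses so that L-BFGS can cope with the non-differentiable L1 term c * ||w||_1. Here 'lossGradient' is assumed to be the gradient of the unregularized negative log-likelihood and 'c' the L1 penalty weight:

public final class L1PseudoGradient {

    // Returns the OWL-QN pseudo-gradient of loss(w) + c * ||w||_1 at 'weights'.
    static double[] compute(double[] weights, double[] lossGradient, double c) {
        double[] pseudo = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            double g = lossGradient[i];
            if (weights[i] > 0) {
                pseudo[i] = g + c;   // away from zero the L1 term is differentiable
            } else if (weights[i] < 0) {
                pseudo[i] = g - c;
            } else if (g + c < 0) {
                pseudo[i] = g + c;   // at zero: objective decreases by moving w_i to the right
            } else if (g - c > 0) {
                pseudo[i] = g - c;   // at zero: objective decreases by moving w_i to the left
            } else {
                pseudo[i] = 0.0;     // at zero and neither direction helps: stay at zero
            }
        }
        return pseudo;
    }
}

In the full method, each line-search step is additionally projected back onto the orthant of the current point (coordinates that would cross zero are clamped to zero), which is what makes irrelevant weights end up exactly zero and the model sparse.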

  was:
L1-regularization is useful when training Maximum Entropy models because it pushes the parameters of irrelevant features to zero. Hence, the trained model will be sparse and compact.

When the number of features is much larger than the number of training examples, L1 often gives better accuracy than L2.

The implementation of L1-regularization for L-BFGS will follow the method described in the paper:

http://research.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf


> Add L1-regularization into L-BFGS
> ---------------------------------
>
>                 Key: OPENNLP-671
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-671
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Machine Learning
>            Reporter: Vinh Khuc
>
> L1-regularization is useful when training Maximum Entropy models because it pushes the parameters of irrelevant features to zero. Hence, the parameter vector will be sparse and the trained model will be compact.
> When the number of features is much larger than the number of training examples, L1 often gives better accuracy than L2.
> The implementation of L1-regularization for L-BFGS will follow the method described in the paper:
> http://research.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf



--
This message was sent by Atlassian JIRA
(v6.2#6252)