Posted to issues@opennlp.apache.org by "Jörn Kottmann (JIRA)" <ji...@apache.org> on 2011/03/01 12:52:36 UTC

[jira] Created: (OPENNLP-138) The Name Finder always creates/uses two sets of feature generators

The Name Finder always creates/uses two sets of feature generators
------------------------------------------------------------------

                 Key: OPENNLP-138
                 URL: https://issues.apache.org/jira/browse/OPENNLP-138
             Project: OpenNLP
          Issue Type: Bug
          Components: Name Finder
    Affects Versions: tools-1.5.0-sourceforge
            Reporter: Jörn Kottmann
            Assignee: Jörn Kottmann
            Priority: Blocker
             Fix For: tools-1.5.1-incubating


During initialization, NameFinderME either creates the default feature generators or uses a set of feature generators provided by the user. In both cases the NameFinderME code calls the DefaultNameContextGenerator() constructor and then adds the feature generators to the resulting context generator. This ignores the fact that DefaultNameContextGenerator already creates default feature generators on its own. The add call does not replace the existing feature generators; it appends to them, so the name finder ends up with two sets of feature generators.
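
The pattern described above can be illustrated with a small, self-contained sketch. It does not use the OpenNLP classes: SketchContextGenerator, buggyInit, fixedInit and the generator names are hypothetical stand-ins chosen only to show how "construct with defaults, then add" ends up with two sets. The fixedInit variant is just one possible way to avoid the duplication, not necessarily the change that was committed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DuplicateGeneratorsSketch {

    // Hypothetical stand-in for a context generator; NOT the OpenNLP class.
    static class SketchContextGenerator {
        private final List<String> generators = new ArrayList<>();

        // The no-arg constructor installs the defaults on its own.
        SketchContextGenerator() {
            generators.addAll(defaults());
        }

        // Alternative constructor that takes exactly the generators to use.
        SketchContextGenerator(List<String> initial) {
            generators.addAll(initial);
        }

        // Appends to the existing generators; it never replaces them.
        void addFeatureGenerator(String generator) {
            generators.add(generator);
        }

        List<String> generators() {
            return generators;
        }
    }

    // Illustrative default generator names (assumed, not taken from OpenNLP).
    public static List<String> defaults() {
        return Arrays.asList("window", "token-class", "previous-map");
    }

    // Mirrors the reported behaviour: defaults are created by the constructor,
    // then the chosen generators are added on top of them.
    public static List<String> buggyInit(List<String> userGenerators) {
        List<String> chosen = userGenerators != null ? userGenerators : defaults();
        SketchContextGenerator cg = new SketchContextGenerator();
        for (String g : chosen) {
            cg.addFeatureGenerator(g);
        }
        return cg.generators();
    }

    // One possible fix: hand the chosen generators to the constructor instead.
    public static List<String> fixedInit(List<String> userGenerators) {
        List<String> chosen = userGenerators != null ? userGenerators : defaults();
        return new SketchContextGenerator(chosen).generators();
    }

    public static void main(String[] args) {
        System.out.println("buggy, no user generators: " + buggyInit(null));
        System.out.println("fixed, no user generators: " + fixedInit(null));
        System.out.println("buggy, one user generator: " + buggyInit(Arrays.asList("custom")));
        System.out.println("fixed, one user generator: " + fixedInit(Arrays.asList("custom")));
    }
}
```

In the buggy path the defaults appear twice when no user generators are given, and user-supplied generators are mixed with defaults the user never asked for.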

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] Commented: (OPENNLP-138) The Name Finder always creates/uses two sets of feature generators

Posted by "James Kosin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001203#comment-13001203 ] 

James Kosin commented on OPENNLP-138:
-------------------------------------

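The change also gives better results on both test sets (F-Measure rises from about 0.797 to 0.855 on eng.testa and from about 0.757 to 0.805 on eng.testb):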

Before change
---------------------
499:  .. loglikelihood=-2765.77789572618	0.9974904356623334
500:  .. loglikelihood=-2763.104274308045	0.9974904356623334
Writing name finder model ... done (1.841s)

Wrote name finder model to
path: C:\Users\James Kosin\Documents\NetBeansProjects\thesis\DocCompare\enNameFinder.model

Testing on eng.testa
Loading Token Name Finder model ... done (0.359s)
current: 193.2 sent/s avg: 193.2 sent/s total: 226 sent
current: 678.3 sent/s avg: 407.2 sent/s total: 851 sent
current: 624.6 sent/s avg: 476.7 sent/s total: 1465 sent
current: 916.8 sent/s avg: 584.6 sent/s total: 2380 sent
current: 668.7 sent/s avg: 601.2 sent/s total: 3048 sent


Average: 602.4 sent/s 
Total: 3251 sent
Runtime: 5.397s

Precision: 0.799424509140149
Recall: 0.794850218781555
F-Measure: 0.7971308016877637
Testing on eng.testb
Loading Token Name Finder model ... done (0.359s)
current: 149.7 sent/s avg: 149.7 sent/s total: 154 sent
current: 736.7 sent/s avg: 438.9 sent/s total: 890 sent
current: 647.0 sent/s avg: 506.8 sent/s total: 1526 sent
current: 800.6 sent/s avg: 579.9 sent/s total: 2326 sent
current: 1119.2 sent/s avg: 687.6 sent/s total: 3443 sent


Average: 687.6 sent/s 
Total: 3454 sent
Runtime: 5.023s

Precision: 0.75829636202307
Recall: 0.7565509915014165
F-Measure: 0.7574226712753699


After change
------------------
499:  .. loglikelihood=-2765.77789572618	0.9974904356623334
500:  .. loglikelihood=-2763.104274308045	0.9974904356623334
Writing name finder model ... done (1.794s)

Wrote name finder model to
path: C:\Users\James Kosin\Documents\NetBeansProjects\thesis\DocCompare\enNameFinder.model

Testing on eng.testa
Loading Token Name Finder model ... done (0.375s)
current: 291.5 sent/s avg: 291.5 sent/s total: 309 sent
current: 809.9 sent/s avg: 536.8 sent/s total: 1080 sent
current: 1176.4 sent/s avg: 749.2 sent/s total: 2255 sent


Average: 830.4 sent/s 
Total: 3251 sent
Runtime: 3.915s

Precision: 0.8666666666666667
Recall: 0.8444968024234265
F-Measure: 0.8554381179679509
Testing on eng.testb
Loading Token Name Finder model ... done (0.374s)
current: 130.2 sent/s avg: 130.2 sent/s total: 132 sent
current: 917.2 sent/s avg: 525.9 sent/s total: 1075 sent
current: 1042.4 sent/s avg: 691.8 sent/s total: 2083 sent


Average: 907.5 sent/s 
Total: 3454 sent
Runtime: 3.806s

Precision: 0.8114532685035116
Recall: 0.7978045325779037
F-Measure: 0.8045710204446032
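
The F-Measure values in the output above are consistent with the balanced harmonic mean of precision and recall, F = 2PR / (P + R). The following sketch recomputes them from the reported precision and recall; the class and method names are made up for this check, and the assumption that the evaluator uses the balanced F-measure is inferred from the numbers, not stated in the logs.

```java
public class FMeasureCheck {

    // Balanced F-measure: harmonic mean of precision and recall.
    // Returns 0 when both inputs are 0 to avoid dividing by zero.
    public static double fMeasure(double precision, double recall) {
        if (precision + recall == 0.0) {
            return 0.0;
        }
        return 2.0 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Precision and recall copied from the evaluation output above.
        double testaBefore = fMeasure(0.799424509140149, 0.794850218781555);
        double testaAfter = fMeasure(0.8666666666666667, 0.8444968024234265);
        double testbBefore = fMeasure(0.75829636202307, 0.7565509915014165);
        double testbAfter = fMeasure(0.8114532685035116, 0.7978045325779037);

        System.out.println("eng.testa before: " + testaBefore + ", after: " + testaAfter
                + ", gain: " + (testaAfter - testaBefore));
        System.out.println("eng.testb before: " + testbBefore + ", after: " + testbAfter
                + ", gain: " + (testbAfter - testbBefore));
    }
}
```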



       

[jira] Commented: (OPENNLP-138) The Name Finder always creates/uses two sets of feature generators

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001350#comment-13001350 ] 

Jörn Kottmann commented on OPENNLP-138:
---------------------------------------

Thanks for providing these numbers. Would you mind adding them to our Test Plan?
There is a section for publicly available test and training data.

Here is the link:
https://cwiki.apache.org/confluence/display/OPENNLP/TestPlan1.5.1

We will do a second RC soon; it would be nice if you could then also rerun the test
against this new RC.


       

[jira] Closed: (OPENNLP-138) The Name Finder always creates/uses two sets of feature generators

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörn Kottmann closed OPENNLP-138.
---------------------------------

    Resolution: Fixed
