Posted to issues@opennlp.apache.org by "William Colen (Created) (JIRA)" <ji...@apache.org> on 2012/02/04 02:47:53 UTC

[jira] [Created] (OPENNLP-423) Improve Portuguese NameSample and ChunkSample formatters

Improve Portuguese NameSample and ChunkSample formatters
--------------------------------------------------------

                 Key: OPENNLP-423
                 URL: https://issues.apache.org/jira/browse/OPENNLP-423
             Project: OpenNLP
          Issue Type: Improvement
            Reporter: William Colen
            Assignee: William Colen
            Priority: Minor


I found some issues with the Portuguese NameSample and ChunkSample formatters.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (OPENNLP-423) Improve Portuguese NameSample and ChunkSample formatters

Posted by "William Colen (Closed) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Colen closed OPENNLP-423.
---------------------------------

    Resolution: Fixed

Fixed. Now the evaluation results are:

NameFinder (amazonia.ad):

Evaluated 275769 samples with 325409 entities; found: 301643 entities; correct: 245105.
       TOTAL: precision:   81.26%;  recall:   75.32%; F1:   78.18%.
        time: precision:   91.97%;  recall:   90.01%; F1:   90.98%. [target: 18197; tp: 16380; fp: 1430]
     numeric: precision:   89.82%;  recall:   84.15%; F1:   86.89%. [target: 15157; tp: 12755; fp: 1446]
       event: precision:   90.58%;  recall:   78.96%; F1:   84.37%. [target: 49994; tp: 39475; fp: 4107]
       place: precision:   83.88%;  recall:   79.49%; F1:   81.63%. [target: 51669; tp: 41071; fp: 7892]
      person: precision:   77.32%;  recall:   78.85%; F1:   78.07%. [target: 83770; tp: 66050; fp: 19377]
organization: precision:   76.21%;  recall:   72.24%; F1:   74.17%. [target: 69190; tp: 49986; fp: 15604]
       thing: precision:   80.24%;  recall:   55.11%; F1:   65.35%. [target: 9905; tp: 5459; fp: 1344]
    abstract: precision:   74.02%;  recall:   55.89%; F1:   63.69%. [target: 11050; tp: 6176; fp: 2168]
     artprod: precision:   70.98%;  recall:   47.05%; F1:   56.59%. [target: 16477; tp: 7753; fp: 3170]

Chunker (Bosque_CF_8.0.ad.txt):

Evaluated 4212 samples with 48628 entities; found: 48534 entities; correct: 44314.
       TOTAL: precision:   91.31%;  recall:   91.13%; F1:   91.22%.
          PP: precision:   95.20%;  recall:   94.35%; F1:   94.77%. [target: 12645; tp: 11931; fp: 602]
          VP: precision:   92.57%;  recall:   92.93%; F1:   92.75%. [target: 8244; tp: 7661; fp: 615]
          NP: precision:   89.25%;  recall:   89.15%; F1:   89.20%. [target: 25362; tp: 22610; fp: 2723]
        ADVP: precision:   88.29%;  recall:   88.85%; F1:   88.57%. [target: 2377; tp: 2112; fp: 280]
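
The per-row figures above can be cross-checked from the bracketed counts. The following is a minimal sketch in Python, not OpenNLP's own evaluator code. It assumes that "target" is the number of reference entities, "tp" the true positives and "fp" the false positives, and it applies the standard definitions precision = tp / (tp + fp), recall = tp / target, F1 = 2PR / (P + R). The function name prf is hypothetical. For the TOTAL row it further assumes that "found" equals tp + fp and "correct" equals tp.

```python
def prf(target, tp, fp):
    """Return (precision, recall, F1) as percentages rounded to 2 places.

    Assumed meaning of the report fields (not confirmed by the source):
    target = reference entities, tp = true positives, fp = false positives.
    """
    found = tp + fp
    precision = tp / found if found else 0.0
    recall = tp / target if target else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return tuple(round(100 * x, 2) for x in (precision, recall, f1))


# "time" row of the NameFinder report: target 18197, tp 16380, fp 1430
print(prf(18197, 16380, 1430))               # (91.97, 90.01, 90.98)

# NameFinder TOTAL row: 325409 entities, found 301643, correct 245105,
# so fp = found - correct under the assumption stated above
print(prf(325409, 245105, 301643 - 245105))  # (81.26, 75.32, 78.18)
```

Both calls reproduce the percentages printed in the report, which suggests the reported counts and the reported scores are consistent with these definitions.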
                
> Improve Portuguese NameSample and ChunkSample formatters
> --------------------------------------------------------
>
>                 Key: OPENNLP-423
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-423
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Formats
>    Affects Versions: tools-1.5.3-incubating
>            Reporter: William Colen
>            Assignee: William Colen
>            Priority: Minor
>             Fix For: tools-1.5.3-incubating
>
>
> I found some issues with the Portuguese NameSample and ChunkSample formatters.


[jira] [Updated] (OPENNLP-423) Improve Portuguese NameSample and ChunkSample formatters

Posted by "William Colen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Colen updated OPENNLP-423:
----------------------------------

          Component/s: Formats
    Affects Version/s: tools-1.5.3-incubating
        Fix Version/s: tools-1.5.3-incubating
              Summary: Improve Portuguese NameSample and ChunkSample formatters  (was: Improve Portuguese NameSample and ChunkSample formaters)
    
