Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/06/24 12:36:20 UTC

[jira] [Created] (STANBOL-1123) Label Token matching should consider tokens that are marked as "consumed"

Rupert Westenthaler created STANBOL-1123:
--------------------------------------------

             Summary: Label Token matching should consider tokens that are marked as "consumed"
                 Key: STANBOL-1123
                 URL: https://issues.apache.org/jira/browse/STANBOL-1123
             Project: Stanbol
          Issue Type: Sub-task
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


Tokens marked as "consumed" should be considered while matching Labels of Entities with the processed Text.

Marking Tokens as "consumed" aims to reduce the number of required vocabulary lookups. However, considering those Tokens while matching does not hurt performance, while it does increase the quality of the linking process.

Doing so will bring improvements especially for very long noun phrases, where an initial query (typically using the first two nouns) might not suggest the best matching Entity. Person mentions like "{role} {given} {given} {family}" are typical examples of such cases.
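
The effect can be illustrated with a minimal, self-contained Java sketch. It is not the Stanbol implementation: the Token class, the countMatchedLabelTokens method and the greedy in-order matching are assumptions made only for this example, as is the premise that an earlier suggestion already marked the first two Tokens of the phrase as "consumed".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, NOT the Stanbol EntityLinker code.
class ConsumedTokenMatching {

    /** A text token plus the "consumed" flag set by an earlier match (hypothetical type). */
    static final class Token {
        final String text;
        final boolean consumed;

        Token(String text, boolean consumed) {
            this.text = text;
            this.consumed = consumed;
        }
    }

    /**
     * Counts how many label tokens are found, in order, among the text tokens.
     * If includeConsumed is false, consumed tokens are ignored while matching;
     * if true, they are considered like any other token.
     * Simplification: greedy in-order matching, case-insensitive comparison.
     */
    static int countMatchedLabelTokens(List<Token> text, List<String> label,
                                       boolean includeConsumed) {
        int matched = 0;
        int next = 0; // index of the first text token not yet used by a match
        for (String labelToken : label) {
            for (int i = next; i < text.size(); i++) {
                Token t = text.get(i);
                boolean eligible = includeConsumed || !t.consumed;
                if (eligible && t.text.equalsIgnoreCase(labelToken)) {
                    matched++;
                    next = i + 1;
                    break;
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // "{role} {given} {given} {family}"; the first two tokens are assumed
        // to be consumed by an earlier suggestion.
        List<Token> text = Arrays.asList(
                new Token("President", true),
                new Token("Barack", true),
                new Token("Hussein", false),
                new Token("Obama", false));
        List<String> label = Arrays.asList("Barack", "Hussein", "Obama");

        // Ignoring consumed tokens: only "Hussein" and "Obama" match.
        System.out.println(countMatchedLabelTokens(text, label, false)); // prints 2
        // Considering consumed tokens: the whole label matches.
        System.out.println(countMatchedLabelTokens(text, label, true));  // prints 3
    }
}
```

In this sketch the "consumed" flag would still be used to skip vocabulary lookups. Only the comparison of an Entity label against the text looks at consumed Tokens, which is why the full three-token label can be matched.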


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira