Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/11/19 15:37:19 UTC

[jira] [Resolved] (STANBOL-1211) Improve Chunk support for Entitylinking

     [ https://issues.apache.org/jira/browse/STANBOL-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler resolved STANBOL-1211.
------------------------------------------

    Resolution: Fixed

Fixed with:

    http://svn.apache.org/r1543405 in the trunk
    http://svn.apache.org/r1543431 merged this feature back to 0.12
    http://svn.apache.org/r1543439 added documentation for this feature

> Improve Chunk support for Entitylinking
> ---------------------------------------
>
>                 Key: STANBOL-1211
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1211
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Enhancement Engines
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>             Fix For: 0.12.0
>
>
> Neither the EntityLinkingEngine nor the LuceneFstLinkingEngine currently makes good use of Chunk information. For now, Chunks are only used to additionally look up multiple matchable tokens of the same chunk in the Vocabulary, which increases recall when proper-noun linking is enabled.
> However, chunks can also be used to increase precision, by taking the span of the Chunk as the base for calculating the confidence of the linked Entity.
> A typical example is suggestions for person names: if a text mentions the given and family name of a person who is not present in the vocabulary, Entitylinking may suggest Entities that match only one of the two names with 100% confidence. When the span of the Chunk is used, such suggestions would be omitted, as the minimum label match score is typically > 50%.
> Other examples include matches for "US {OrgName}", where "US" is linked when the whole organization is not found; "{OrgName} {Role}", where the {Role} (e.g. president) is linked; and cases like "15. September, 2013", where "September" may be suggested if it is present in the vocabulary.
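
The effect described in the issue can be illustrated with a minimal sketch. This is not the code committed in the revisions above: the class `ChunkSpanScore`, its methods, the token-count based score and the 0.75 threshold are all assumptions chosen for illustration. The idea is only that the match is scored against the larger of the label and the enclosing chunk, so a match on "John" inside the chunk "John Smith" drops from 1.0 to 0.5.

```java
// Hypothetical sketch; names, formula and threshold are assumptions,
// not the Stanbol implementation.
final class ChunkSpanScore {

    /**
     * Scores a match as the fraction of the relevant span it covers.
     *
     * @param matchedTokens tokens of the text matched by the entity label
     * @param labelTokens   tokens in the entity label
     * @param chunkTokens   tokens in the enclosing chunk (0 if no chunk is known)
     */
    static double score(int matchedTokens, int labelTokens, int chunkTokens) {
        if (matchedTokens <= 0 || labelTokens <= 0) {
            return 0.0;
        }
        // Use the chunk span as the base when it is longer than the label.
        int base = Math.max(labelTokens, chunkTokens);
        return Math.min(1.0, (double) matchedTokens / base);
    }

    /** A suggestion is kept only if it reaches the minimum label match score. */
    static boolean accept(double score, double minScore) {
        return score >= minScore;
    }

    public static void main(String[] args) {
        double minScore = 0.75; // assumed threshold, anything above 0.5 behaves the same here

        // Text: "John Smith"; the vocabulary only contains the label "John".
        double withoutChunk = score(1, 1, 0); // chunk ignored
        double withChunk = score(1, 1, 2);    // chunk "John Smith" has two tokens

        System.out.println("without chunk: " + withoutChunk + " accepted=" + accept(withoutChunk, minScore));
        System.out.println("with chunk:    " + withChunk + " accepted=" + accept(withChunk, minScore));
    }
}
```

Under these assumptions the same reasoning covers the other examples: "September" inside the chunk "15. September, 2013" would cover only part of the chunk and fall below the threshold.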



--
This message was sent by Atlassian JIRA
(v6.1#6144)