Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/12/04 08:43:36 UTC

[jira] [Resolved] (STANBOL-1230) Add Lookup Cache to EntityLinking Engine

     [ https://issues.apache.org/jira/browse/STANBOL-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler resolved STANBOL-1230.
------------------------------------------

    Resolution: Fixed

Implemented in trunk with http://svn.apache.org/r1547718
Merged to the 0.12 branch with http://svn.apache.org/r1547721

> Add Lookup Cache to EntityLinking Engine
> ----------------------------------------
>
>                 Key: STANBOL-1230
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1230
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Enhancement Engines
>    Affects Versions: 0.12.0
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>             Fix For: 0.12.0
>
>
> The EntityLinkingEngine should cache results of lookups on the EntitySearchers.
> Entities often recur in analyzed documents. Caching the results for tokens that have already been looked up should therefore improve performance considerably, as statistics show that ~90% of the EntityLinking engine's processing time is spent on entity look-ups.
> So if 20% of all entity mentions refer to recurring entities, the processing time should be reduced by about 18% (0.9 * 0.2 = 0.18, assuming a cache hit costs almost nothing).
> The cache will use the list of search strings as the key and the list of returned Entities as the value. The cache will only hold look-up results for the currently analyzed document.
> EntityLinking statistics will be updated to include the cache hit percentage.
> This issue affects both trunk (1.0.0-SNAPSHOT) and the stable 0.12 release branch.
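
The cache described above could be sketched roughly as follows. This is a minimal illustration, not the code committed in r1547718: the class name LookupCache, its methods, and the searcher callback are hypothetical stand-ins for the engine and the EntitySearcher. It shows only the idea from the description: a map keyed by the list of search strings, holding the returned entities, plus a hit percentage for the statistics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-document lookup cache; not the actual Stanbol class. */
class LookupCache<E> {
    // Key: the list of search strings; value: the entities returned for them.
    private final Map<List<String>, List<E>> results = new HashMap<>();
    private int lookups;
    private int hits;

    /** Returns the cached entities, or asks the searcher and stores its answer. */
    List<E> lookup(List<String> searchStrings,
                   Function<List<String>, List<E>> searcher) {
        lookups++;
        List<E> cached = results.get(searchStrings);
        if (cached != null) {
            hits++;
            return cached;
        }
        // Cache miss: perform the (expensive) lookup once and remember the result.
        // Empty results are cached too, so repeated misses are also avoided.
        List<E> found = Collections.unmodifiableList(
                new ArrayList<>(searcher.apply(searchStrings)));
        // Copy the key so later changes by the caller cannot corrupt the map.
        results.put(Collections.unmodifiableList(
                new ArrayList<>(searchStrings)), found);
        return found;
    }

    /** Share of lookups answered from the cache, in percent. */
    double hitPercentage() {
        return lookups == 0 ? 0.0 : 100.0 * hits / lookups;
    }
}
```

In this sketch a new LookupCache would be created for each analyzed document and discarded afterwards, which matches the requirement that only results for the current document are collected and keeps memory use bounded by the size of that document.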



--
This message was sent by Atlassian JIRA
(v6.1#6144)