Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/12/04 08:43:36 UTC
[jira] [Resolved] (STANBOL-1230) Add Lookup Cache to EntityLinking Engine
[ https://issues.apache.org/jira/browse/STANBOL-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rupert Westenthaler resolved STANBOL-1230.
------------------------------------------
Resolution: Fixed
implemented with http://svn.apache.org/r1547718 in trunk
merged to 0.12 with http://svn.apache.org/r1547721
> Add Lookup Cache to EntityLinking Engine
> ----------------------------------------
>
> Key: STANBOL-1230
> URL: https://issues.apache.org/jira/browse/STANBOL-1230
> Project: Stanbol
> Issue Type: Improvement
> Components: Enhancement Engines
> Affects Versions: 0.12.0
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Fix For: 0.12.0
>
>
> The EntityLinkingEngine should cache results of lookups on the EntitySearchers.
> Entities often recur in analyzed documents. Caching the results for tokens that have already been looked up should therefore provide a considerable performance improvement, as statistics show that ~90% of the EntityLinking engine's processing time is spent on entity look-up.
> So if 20% of all entity mentions refer to recurring entities, the processing time should be reduced by about 18% (20% of the ~90% spent on look-ups).
> The cache will use the list of search strings as key and the list of returned Entities as value. The cache will only hold look-up results for the currently analyzed document.
> EntityLinking statistics will be updated to include the cache hit percentage.
> This issue affects both the trunk (1.0.0-SNAPSHOT) and the stable 0.12 release branch.
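
The caching scheme described in the issue can be illustrated with a minimal sketch. This is not the code committed in r1547718: the class name DocumentLookupCache, the use of a java.util.function.Function as a stand-in for an EntitySearcher, and the method names are all assumptions made for illustration. The sketch keys the cache on the list of search strings, stores the returned list of entities as the value, and counts hits so that a hit percentage can be reported. One instance is meant to be created per analyzed document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-document lookup cache; NOT the actual Stanbol implementation. */
final class DocumentLookupCache<E> {

    // key: list of search strings, value: entities returned for that list
    private final Map<List<String>, List<E>> cache = new HashMap<>();
    // stand-in for the real entity searcher (an assumption of this sketch);
    // it is expected to return an empty list, not null, when nothing matches
    private final Function<List<String>, List<E>> searcher;
    private int lookups;
    private int hits;

    DocumentLookupCache(Function<List<String>, List<E>> searcher) {
        this.searcher = searcher;
    }

    /** Returns the cached result if present, otherwise delegates to the searcher. */
    List<E> lookup(List<String> searchStrings) {
        lookups++;
        List<E> cached = cache.get(searchStrings);
        if (cached != null) {
            hits++;
            return cached;
        }
        List<E> result = searcher.apply(searchStrings);
        // copy the key so later changes to the caller's list cannot corrupt the map
        cache.put(new ArrayList<>(searchStrings), result);
        return result;
    }

    /** Share of lookups answered from the cache, in percent. */
    double hitPercentage() {
        return lookups == 0 ? 0.0 : 100.0 * hits / lookups;
    }
}

final class LookupCacheDemo {
    public static void main(String[] args) {
        // fake searcher that fabricates one entity URI per request
        DocumentLookupCache<String> cache = new DocumentLookupCache<>(
                keys -> Arrays.asList("urn:entity:" + keys.get(0)));
        cache.lookup(Arrays.asList("Paris"));
        cache.lookup(Arrays.asList("Vienna"));
        cache.lookup(Arrays.asList("Paris")); // served from the cache
        cache.lookup(Arrays.asList("Berlin"));
        System.out.println("cache hit rate: " + cache.hitPercentage() + "%");
    }
}
```

In the demo, one of four lookups is a repeat, so the searcher is called three times and the reported hit rate is 25%. Discarding the cache instance after each document matches the statement that only look-up results for the currently analyzed document are collected.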
--
This message was sent by Atlassian JIRA
(v6.1#6144)