Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/06/20 09:09:24 UTC
[jira] [Created] (STANBOL-1116) Filter Literals of suggested Entities based on Languages used for Lookups
Rupert Westenthaler created STANBOL-1116:
--------------------------------------------
Summary: Filter Literals of suggested Entities based on Languages used for Lookups
Key: STANBOL-1116
URL: https://issues.apache.org/jira/browse/STANBOL-1116
Project: Stanbol
Issue Type: Sub-task
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
EntityLinking uses two languages to look up Entities:
(1) the language of the current document (as detected by language detection)
(2) the default matching language (default: null, i.e. labels without a language tag)
In multi-lingual vocabularies (e.g. dbpedia or freebase) entities might define literal values for a lot of languages (for freebase there might be labels for more than 100 languages for some entities).
Currently the EntityLinkingEngine includes labels of all languages in the EnhancementResults. This has two disadvantages:
(1) All values need to be provided by the EntitySearcher. This might require converting all those values to Clerezza RDF (such as in the case of the Solr based EntitySearcher).
(2) If dereferencing is activated, a lot of additional literals (and therefore triples) are added to the Enhancement Results. This has a negative impact on both performance AND the size of the Enhancement Results.
This issue will adapt the EntitySearcher interface to allow specifying
* selected fields
* selected languages
with every request. The languages used for the query will always be added to the passed selected languages, and the label field, type field and redirect field will always be added to the selected fields, as that information is required by the linking process itself.
EntitySearcher implementations may ignore these settings and return all values for the returned entities instead.
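The intended filtering could look roughly like the following sketch. It is not the actual EntitySearcher API: the class LiteralFilterSketch, its methods, and the field names "label", "type" and "redirect" are hypothetical placeholders, and every value is modelled as a plain text literal with an optional language tag (null meaning no language tag).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names are taken from the Stanbol code base. */
public class LiteralFilterSketch {

    /** A text value with an optional language tag (null = no language tag). */
    public static final class Literal {
        public final String value;
        public final String lang;

        public Literal(String value, String lang) {
            this.value = value;
            this.lang = lang;
        }
    }

    /** Placeholder names for the fields the linking process itself needs. */
    public static final List<String> REQUIRED_FIELDS =
            Arrays.asList("label", "type", "redirect");

    /** The required fields are always part of the selection. */
    public static Set<String> effectiveFields(Set<String> selectedFields) {
        Set<String> fields = new LinkedHashSet<>(REQUIRED_FIELDS);
        fields.addAll(selectedFields);
        return fields;
    }

    /** The two lookup languages are always part of the selection. */
    public static Set<String> effectiveLanguages(Set<String> selectedLanguages,
                                                 String documentLanguage,
                                                 String defaultLanguage) {
        Set<String> languages = new LinkedHashSet<>();
        languages.add(documentLanguage);
        languages.add(defaultLanguage); // may be null: values without language tag
        languages.addAll(selectedLanguages);
        return languages;
    }

    /** Keeps only the selected fields and, within those, the selected languages. */
    public static Map<String, List<Literal>> filter(Map<String, List<Literal>> entity,
                                                    Set<String> fields,
                                                    Set<String> languages) {
        Map<String, List<Literal>> result = new LinkedHashMap<>();
        for (String field : fields) {
            List<Literal> values = entity.get(field);
            if (values == null) {
                continue; // the entity has no values for this field
            }
            List<Literal> kept = new ArrayList<>();
            for (Literal literal : values) {
                if (languages.contains(literal.lang)) {
                    kept.add(literal);
                }
            }
            if (!kept.isEmpty()) {
                result.put(field, kept);
            }
        }
        return result;
    }
}
```

For a German document with the default language null and no further selected languages, a label list of "Wien"@de, "Vienna"@en and "Vienna" (no language tag) would be reduced to the German label and the untagged one. A real implementation would also have to decide how to treat non-text values such as type URIs, which this sketch does not cover.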
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira