Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2014/01/23 17:24:38 UTC

[jira] [Created] (STANBOL-1268) Add option to the Lucene FST Linking Engine to use FST models for sub-languages

Rupert Westenthaler created STANBOL-1268:
--------------------------------------------

             Summary: Add option to the Lucene FST Linking Engine to use FST models for sub-languages
                 Key: STANBOL-1268
                 URL: https://issues.apache.org/jira/browse/STANBOL-1268
             Project: Stanbol
          Issue Type: New Feature
          Components: Enhancement Engines
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


Entities in vocabularies might use country-specific languages (e.g. "Organisation"@en-GB and "Organization"@en-US).

When enhancing an English-language text that mentions "Organization", the mention is not linked to the entity: the language detector reports "en", but the entity provides no label for that language.

This feature will allow the FST linking engine to use FST models for sub-languages (languages that start with {lang}-*) for linking.

Note: enabling this feature will have some impact on linking performance, as the engine needs to look up entities in additional FST models.
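
The sub-language matching described above could be sketched roughly as follows. This is only an illustration of the {lang}-* rule, not the engine's actual code: the class name SubLanguageModels, the method selectModels and the includeSubLanguages flag are hypothetical and do not correspond to any existing Stanbol API or configuration property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Stanbol: picks the FST model languages
// to consult for a language reported by the language detector.
public class SubLanguageModels {

    /**
     * Returns the available model languages that match the detected language.
     * With includeSubLanguages == false only the exact language is used;
     * with true, every language starting with "{lang}-" is used as well.
     */
    static List<String> selectModels(String detected, List<String> available,
                                     boolean includeSubLanguages) {
        String base = detected.toLowerCase(Locale.ROOT);
        String prefix = base + "-";
        List<String> selected = new ArrayList<>();
        for (String lang : available) {
            // Language tags are compared case-insensitively.
            String normalized = lang.toLowerCase(Locale.ROOT);
            if (normalized.equals(base)
                    || (includeSubLanguages && normalized.startsWith(prefix))) {
                selected.add(lang);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList("en", "en-GB", "en-US", "de", "de-AT");
        System.out.println(selectModels("en", available, false)); // [en]
        System.out.println(selectModels("en", available, true));  // [en, en-GB, en-US]
    }
}
```

With the option enabled, a text detected as "en" would be matched against three models instead of one in this example, which is where the additional lookup cost mentioned in the note comes from.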



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)