Posted to commits@stanbol.apache.org by "Fabian Christ (JIRA)" <ji...@apache.org> on 2012/05/30 16:30:23 UTC
[jira] [Updated] (STANBOL-613) Define a standard way on how to obtain the extracted language
[ https://issues.apache.org/jira/browse/STANBOL-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fabian Christ updated STANBOL-613:
----------------------------------
Component/s: Engine - LangID
> Define a standard way on how to obtain the extracted language
> -------------------------------------------------------------
>
> Key: STANBOL-613
> URL: https://issues.apache.org/jira/browse/STANBOL-613
> Project: Stanbol
> Issue Type: Sub-task
> Components: Engine - LangID, Enhancer
> Affects Versions: 0.9.0-incubating
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Priority: Minor
> Fix For: enhancer-0.10.0-incubating
>
>
> With the addition of the CELI Language Identification Engine there are now two different engines that support the same feature.
> However, engines that consume the detected language are currently "hard coded" to the LangId Engine (enhancer/engines/langid). This needs to change to allow the adoption of alternatives, such as the CELI-based implementation.
> The suggestion is to use the following pattern to extract the language:
> (1) via Annotations:
> ?x rdf:type fise:TextAnnotation .
> ?x dc:language ?language .
> OPTIONAL {
> ?x dc:creator ?engine
> }
> OPTIONAL {
> ?x fise:confidence ?confidence
> }
> (2) via ContentItem metadata
> ?ci dc:language ?language
> (2) is a fallback if (1) delivers no results.
> Methods that
> * extract the language (with the highest confidence) - including fallback to (2)
> * extract all languages (sorted by confidence) - including fallback to (2)
> * extract all TextAnnotations with dc:language values
> are added to the EnhancementEngineHelper utility of the enhancer.servicesapi module
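
The lookup order described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the EnhancementEngineHelper API: the class LanguageLookupSketch, the type LangAnnotation and both method names are hypothetical, and reading the values from the RDF metadata of the ContentItem is left out. Each LangAnnotation stands for one fise:TextAnnotation with a dc:language value; the confidence may be missing because fise:confidence is OPTIONAL in the pattern.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the lookup order; not part of Stanbol. */
final class LanguageLookupSketch {

    /** Stand-in for one fise:TextAnnotation that has a dc:language value. */
    static final class LangAnnotation {
        final String language;
        final Double confidence; // null when fise:confidence is absent

        LangAnnotation(String language, Double confidence) {
            this.language = language;
            this.confidence = confidence;
        }
    }

    /** Annotations without a confidence sort after all scored ones. */
    private static double score(LangAnnotation a) {
        return a.confidence == null ? -1.0 : a.confidence;
    }

    /**
     * All detected languages, highest confidence first (pattern 1).
     * Falls back to the dc:language of the ContentItem (pattern 2)
     * only if no annotation provides a language.
     */
    static List<String> allLanguages(List<LangAnnotation> annotations,
                                     String contentItemLanguage) {
        List<LangAnnotation> sorted = new ArrayList<>(annotations);
        // Descending by confidence; List.sort is stable, so ties keep input order.
        sorted.sort((a, b) -> Double.compare(score(b), score(a)));
        List<String> result = new ArrayList<>();
        for (LangAnnotation a : sorted) {
            if (!result.contains(a.language)) {
                result.add(a.language); // keep the best-ranked entry per language
            }
        }
        if (result.isEmpty() && contentItemLanguage != null) {
            result.add(contentItemLanguage);
        }
        return result;
    }

    /** The language with the highest confidence, or null if none is known. */
    static String bestLanguage(List<LangAnnotation> annotations,
                               String contentItemLanguage) {
        List<String> all = allLanguages(annotations, contentItemLanguage);
        return all.isEmpty() ? null : all.get(0);
    }
}
```

Treating a missing confidence as the lowest rank is a design choice of this sketch; the issue text does not say how such annotations should be ordered.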