Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2012/05/17 10:49:02 UTC

[jira] [Commented] (STANBOL-617) Define how TopicEnhancements are written to the Enhancement Structure

    [ https://issues.apache.org/jira/browse/STANBOL-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277669#comment-13277669 ] 

Rupert Westenthaler commented on STANBOL-617:
---------------------------------------------

Related to (9): Maybe we should use a more specific dc:type for the fise:TextAnnotation, as skos:Concept is rather generic for describing a fise:TextAnnotation that represents extracted Topics. In addition, fise:TextAnnotations that represent Entities extracted from a Thesaurus currently also use skos:Concept as dc:type.
                
> Define how TopicEnhancements are written to the Enhancement Structure
> ---------------------------------------------------------------------
>
>                 Key: STANBOL-617
>                 URL: https://issues.apache.org/jira/browse/STANBOL-617
>             Project: Stanbol
>          Issue Type: Bug
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>            Priority: Minor
>             Fix For: 0.10.0-incubating
>
>
> In the future, three Enhancement Engines will annotate Topics extracted from analyzed ContentItems:
> * Topic Engine
> * Zemanta Engine
> * CELI Classification Engine (See STANBOL-583)
> While all of them annotate Topics in a very similar way, there are some small variations that need to be aligned to make it easier for users to consume those annotations.
> Topic Annotations are a special type of Annotation that is very similar to a fise:EntityAnnotation. The following listing shows the expected triples:
> (1)    ?ta rdf:type fise:TopicAnnotation
> (2)    ?ta fise:entity-reference ?topic-uri
> (3)    ?ta fise:entity-label ?topic-label
> (4)    ?ta fise:entity-type ?topic-type
> (5)    ?ta dc:relation ?ta
> (6)    ?ta rdf:type fise:TextAnnotation
> (7)    ?ta fise:start ?sectionStartPos
> (8)    ?ta fise:end ?sectionEndPos
> (9)    ?ta dc:type skos:Concept
> (1,3,5,6,9) are required
> (2) defines the URI of the assigned Topic. This might not be available if the Topic has only a label and has not been formally assigned a unique ID.
> (4) defines the type of the Topic. It is strongly suggested to use skos:Concept as type.
> (6,7,8) link the fise:TopicAnnotation with the text. (7,8) are required if a topic needs to be assigned to a sub-section of the analyzed content.
> NOTE: fise:selected-text and fise:selection-context are not used in this example, as those texts could be very large for bigger sections. Here we would need a better way to define the context for TextAnnotations that select whole sections of the parsed content.
> As far as I know the TopicEngine already follows this approach. The ZemantaEngine and the CELI Classification Engine need to be adapted (as part of this Issue) to conform to the defined structure.
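
To make the required/optional split in the quoted listing concrete, here is a minimal sketch of a check for the required patterns (1,3,5,6,9). It is not part of Stanbol: the function name missing_required, the urn:ex: identifiers and the "Politics" label are hypothetical, and triples are modelled as plain tuples of prefixed names rather than a real RDF graph. Because the listing uses the same variable ?ta throughout, the sketch matches on predicate and object only and ignores subjects.

```python
from typing import Iterable, List, Tuple

# (subject, predicate, object) as prefixed names; an assumption of this sketch.
Triple = Tuple[str, str, str]

# Pattern number from the listing -> (predicate, fixed object or None for any value).
PATTERNS = {
    1: ("rdf:type", "fise:TopicAnnotation"),
    2: ("fise:entity-reference", None),
    3: ("fise:entity-label", None),
    4: ("fise:entity-type", None),
    5: ("dc:relation", None),
    6: ("rdf:type", "fise:TextAnnotation"),
    7: ("fise:start", None),
    8: ("fise:end", None),
    9: ("dc:type", "skos:Concept"),
}
REQUIRED = (1, 3, 5, 6, 9)


def missing_required(triples: Iterable[Triple]) -> List[int]:
    """Return the numbers of required patterns that no triple matches."""
    seen = {(p, o) for _, p, o in triples}
    predicates = {p for p, _ in seen}
    missing = []
    for number in REQUIRED:
        predicate, obj = PATTERNS[number]
        if obj is None:
            matched = predicate in predicates
        else:
            matched = (predicate, obj) in seen
        if not matched:
            missing.append(number)
    return missing


# Hypothetical annotation that lacks the dc:type triple (9).
example = [
    ("urn:ex:topic-1", "rdf:type", "fise:TopicAnnotation"),
    ("urn:ex:topic-1", "fise:entity-label", "Politics"),
    ("urn:ex:topic-1", "dc:relation", "urn:ex:text-1"),
    ("urn:ex:text-1", "rdf:type", "fise:TextAnnotation"),
]
print(missing_required(example))  # [9]
```

A check like this could be run against the output of each engine to see which of the required triples still need to be added. A real implementation would query the enhancement graph and take subjects into account.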
