Posted to commits@stanbol.apache.org by "Aingaran Pillai (JIRA)" <ji...@apache.org> on 2015/07/21 10:24:05 UTC

[jira] [Commented] (STANBOL-1121) Event extraction Enhancement Engine

    [ https://issues.apache.org/jira/browse/STANBOL-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634752#comment-14634752 ] 

Aingaran Pillai commented on STANBOL-1121:
------------------------------------------

[~rwesten] any updates on this?

> Event extraction Enhancement Engine
> -----------------------------------
>
>                 Key: STANBOL-1121
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1121
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancement Engines
>            Reporter: Cristian Petroaca
>            Assignee: Rupert Westenthaler
>              Labels: extraction, triple
>
> Functionality
> =========
> Develop an Enhancement Engine which would construct a formal knowledge representation from natural language text. The knowledge extracted from the text would be in the form of Triples (Subject-Verb-Object). This Enhancement Engine will be mainly concerned with representation of real-world events.
> Example : 
> We have the text "Google buys Youtube". Google=Subject, buys=verb, Youtube=object.
> Implementation
> ===========
> Triple Extraction
> -----------------------
> The following will be applied on the natural language text in order to extract the triples:
> + Named entity extraction
> + Co-reference resolution of those named entities
> + POS tagging or dependency parsing to determine which verbs and objects are related to the named entities.
> Based on the last step we would have the set of triples. 
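
A minimal sketch of the last step above, in Python, assuming the earlier steps have already produced POS-tagged tokens with named entities marked. The names `Token`, `Triple` and `extract_triples` are hypothetical and not part of Stanbol; the heuristic (entity, next verb, next entity) is a toy stand-in for a real dependency-based approach and performs no co-reference resolution.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    text: str
    pos: str         # coarse POS tag, e.g. "PROPN" or "VERB" (assumed tag set)
    is_entity: bool  # True if a named-entity step marked this token


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


def extract_triples(tokens: List[Token]) -> List[Triple]:
    """Toy heuristic: pair an entity with the next verb and the next entity."""
    triples: List[Triple] = []
    subject: Optional[str] = None
    verb: Optional[str] = None
    for tok in tokens:
        if tok.is_entity:
            if subject is not None and verb is not None:
                triples.append(Triple(subject, verb, tok.text))
                # The object may act as the subject of a following triple.
                subject, verb = tok.text, None
            else:
                subject = tok.text
        elif tok.pos == "VERB" and subject is not None:
            verb = tok.text
    return triples


sentence = [
    Token("Google", "PROPN", True),
    Token("buys", "VERB", False),
    Token("Youtube", "PROPN", True),
]
print(extract_triples(sentence))
```

For the example sentence from the issue this yields a single triple with Google as subject, buys as verb and Youtube as object.
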
> Formal representation of triples
> ---------------------------------------------
> The formal representation of the triples will be based on the DOLCE foundational ontology. We will have the following data structures :
>  * fise:SettingAnnotation
>     * {fise:Enhancement} metadata
> describes the context of the data
>  * fise:ParticipantAnnotation
>     * {fise:Enhancement} metadata
>     * fise:inSetting {settingAnnotation}
>     * fise:hasMention {textAnnotation}
>     * fise:suggestion {entityAnnotation} (multiple if there are more suggestions)
>     * dc:type one of fise:Agent, fise:Patient, fise:Instrument, fise:Cause
> describes the participants from the context. In our example these would be "Google" and "Youtube". In the DOLCE ontology these would be the Endurants.
>  * fise:OccurrentAnnotation
>     * {fise:Enhancement} metadata
>     * fise:inSetting {settingAnnotation}
>     * fise:hasMention {textAnnotation}
>     * dc:type set to fise:Activity
>     * ??:hasRelations (describes the participants linked to this occurrent - TBD)
> describes the action performed by the participants. In our example this would be "buys". In the DOLCE ontology this would be the Perdurant.
> For further information see also the Mail Thread related to this Issue: http://markmail.org/message/qed6y5avbymvmmgu
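
To make the proposed annotation structure in the quoted description more concrete, here is a rough Python sketch that models the three annotation types and flattens them into subject-predicate-object statements for the "Google buys Youtube" example. The property names (fise:inSetting, fise:hasMention, fise:suggestion, dc:type) come from the description; the class names, the `urn:example:` identifiers and the use of rdf:type to state each annotation's type are assumptions made for illustration. The hasRelations property is left out because its namespace is still TBD.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Statement = Tuple[str, str, str]

# Participant roles listed in the issue description.
ROLES = {"fise:Agent", "fise:Patient", "fise:Instrument", "fise:Cause"}


@dataclass
class SettingAnnotation:
    uri: str


@dataclass
class ParticipantAnnotation:
    uri: str
    setting: SettingAnnotation
    mention: str  # identifier of the text annotation (hypothetical URN)
    role: str     # one of ROLES
    suggestions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.role not in ROLES:
            raise ValueError(f"unknown participant role: {self.role}")


@dataclass
class OccurrentAnnotation:
    uri: str
    setting: SettingAnnotation
    mention: str


def to_statements(setting: SettingAnnotation,
                  participants: List[ParticipantAnnotation],
                  occurrent: OccurrentAnnotation) -> List[Statement]:
    """Flatten the annotations into (subject, predicate, object) statements."""
    out: List[Statement] = [(setting.uri, "rdf:type", "fise:SettingAnnotation")]
    for p in participants:
        out.append((p.uri, "rdf:type", "fise:ParticipantAnnotation"))
        out.append((p.uri, "fise:inSetting", p.setting.uri))
        out.append((p.uri, "fise:hasMention", p.mention))
        out.append((p.uri, "dc:type", p.role))
        out.extend((p.uri, "fise:suggestion", s) for s in p.suggestions)
    out.append((occurrent.uri, "rdf:type", "fise:OccurrentAnnotation"))
    out.append((occurrent.uri, "fise:inSetting", occurrent.setting.uri))
    out.append((occurrent.uri, "fise:hasMention", occurrent.mention))
    out.append((occurrent.uri, "dc:type", "fise:Activity"))
    return out


setting = SettingAnnotation("urn:example:setting-1")
google = ParticipantAnnotation(
    "urn:example:participant-google", setting,
    mention="urn:example:text-google", role="fise:Agent",
    suggestions=["urn:example:entity-google"])
youtube = ParticipantAnnotation(
    "urn:example:participant-youtube", setting,
    mention="urn:example:text-youtube", role="fise:Patient")
buys = OccurrentAnnotation(
    "urn:example:occurrent-buys", setting, mention="urn:example:text-buys")

statements = to_statements(setting, [google, youtube], buys)
for s in statements:
    print(s)
```

Keeping the setting as a shared object mirrors the fise:inSetting links in the proposal: every participant and the occurrent point back to the same context.
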



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)