Posted to commits@stanbol.apache.org by rw...@apache.org on 2012/05/30 13:28:40 UTC

svn commit: r1344200 - in /incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer: enhancementstructure.mdtext hallo-annotate_scrrenshot.png

Author: rwesten
Date: Wed May 30 11:28:40 2012
New Revision: 1344200

URL: http://svn.apache.org/viewvc?rev=1344200&view=rev
Log:
first work on PART 2: Consuming the Stanbol Enhancement Structure

Added:
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png   (with props)
Modified:
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext

Modified: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext?rev=1344200&r1=1344199&r2=1344200&view=diff
==============================================================================
--- incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext (original)
+++ incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext Wed May 30 11:28:40 2012
@@ -135,4 +135,78 @@ TopicAnnotation are used to categorize/c
 
 # Part 2: Using the Stanbol Enhancement Structure
 
-TODO: Work in progress
\ No newline at end of file
+## Entity Tagging
+
+TODO: Work in progress
+
+## Entity Disambiguation
+
+TODO: Work in progress
+
+## Occurrence based Annotation
+
+This describes a user interface similar to that of a spell/grammar checker. Instead of marking misspelled words, it suggests entities recognized within the text to the user. The following figure shows such an interface as implemented by [hallo.js](http://hallojs.org) combined with the [annotate.js](https://github.com/szabyg/annotate.js) plugin.
+
+![Occurrence based Annotation UI](hallo-annotate_scrrenshot.png "hallo.js with the annotate.js plugin used to implement a text occurrence based annotation UI")
+
+To implement such a user interface, one needs to acquire the following information from the enhancements returned by the Stanbol Enhancer.
+
+__Showing the Occurrences within the Text__
+
+This describes how to obtain the information needed to visualize extracted Entities within the text.
+
+1. Query for/iterate over 'fise:TextAnnotation's of the enhancement results.
+    * it is important to only use TextAnnotations that define a 'fise:selected-text' property. TextAnnotations that do not define this property usually select whole sections or even the document as a whole. Those are not of interest for this use case.
+2. Determine the exact occurrence of each TextAnnotation
+    * in case of plain text content this can easily be done by using the values of 'fise:start' and 'fise:end'
+    * in case the content includes additional markup, the character indexes of 'fise:start'/'fise:end' will not match. In such cases the preferred way is to first search for the occurrence of 'fise:selection-context' and then for the occurrence of 'fise:selected-text' within it.
+3. Retrieve suggested Entities for a given TextAnnotation. For that one needs to search for "?suggestion dc:relation {text-annotation}" where '{text-annotation}' refers to the URI of the current TextAnnotation.
+    * Note that there will be TextAnnotations with no suggestions.
+
+The following SPARQL query could be used to select all the required information. However, the use of SPARQL is optional, as the required information can also easily be retrieved by other means (e.g. the filtered iterators typically provided by RDF frameworks).
+
+    :::sparql
+    PREFIX fise: <http://fise.iks-project.eu/ontology/> PREFIX dc: <http://purl.org/dc/terms/>
+    SELECT * WHERE {
+        ?textAnnotation a fise:TextAnnotation .
+        ?textAnnotation fise:selected-text ?selected .
+        ?textAnnotation fise:selection-context ?context .
+        ?textAnnotation fise:start ?startIndex .
+        ?textAnnotation fise:end ?endIndex .
+        ?textAnnotation dc:type ?nature .
+        OPTIONAL { ?suggestions dc:relation ?textAnnotation }
+    }
+
+Additionally:
+
+* The value of the 'dc:type' is well suited to select different style sheets. See the section for [fise:TextAnnotation](#fisetextannotation) for detailed information.
+* The UI might need to differentiate between TextAnnotations with/without suggestions.
+
+__Processing Suggested Entities__
+
+In principle there are three different cases:
+
+1. No suggestion: This indicates that a Named Entity was recognized during natural language processing, but no matching Entity was found within the knowledge base. In this case users might want to
+    * manually search the knowledge base for an Entity. The Stanbol Entityhub Sites Endpoint can be used to implement this feature by sending a "GET http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of your Stanbol instance for the detailed documentation).
+    * Create a new Entity based on the current TextAnnotation. In this case the 'fise:selected-text' should be suggested as 'rdfs:label' and the 'dc:type' value could be used for the 'rdf:type'. New Entities can be added to the knowledge base by sending a "POST http://{host}:{port}/entityhub/entity" with the RDF data of the Entity as content (see the WebUI of your Stanbol instance for the detailed documentation).
+2. Distinct suggestion: This means that there is only a single suggestion with a high 'fise:confidence'. Multiple suggestions where the first one has a high confidence and the additional ones come with low confidence values may also fit this description. In such situations the UI might want to
+    * automatically accept the suggestion
+    * allow users to show additional suggestions on request
+    * allow users to undo the automatic acceptance of the suggestion
+3. Ambiguous suggestions: This is the case if multiple entities are suggested with a medium to high 'fise:confidence'. In those cases the user typically must provide additional input by
+    * selecting the correct entity
+    * rejecting all suggestions
+    * manually searching for and/or creating a new Entity, as described for (1)
+
+The data required for the described interaction patterns is available within the enhancement results as follows:
+
+The following assumes {text-annotation} - the URI of the current 'fise:TextAnnotation' - as context
+
+1. Query for/iterate over all entity suggestions: The suggestions for {text-annotation} can be acquired by using "?entityAnnotation dc:relation {text-annotation}".
+    * only results with the 'rdf:type' 'fise:EntityAnnotation' should be processed. However, typically all results will be of that type anyway.
+    * the 'fise:confidence' property represents the confidence of the suggestion in the range from 0 (very uncertain) to 1 (very certain). Note that the 'fise:confidence' value is optional, so there might be EntityAnnotations without confidence information. However, all [EnhancementEngines managed by the Stanbol community](engines/list.html) do provide confidence information.
+2. Visualize suggestions: EntityAnnotations provide some basic information about the suggested Entity that can be used for visualization. Most important is the URI of the suggested entity, given as the value of 'fise:referenced-entity'. Additionally, the label and the types of the Entity are included.
+3. Retrieving additional information about referenced Entities: While the EntityAnnotation includes some basic information some users might want to retrieve all available information of referenced Entities - to dereference the Entity:
+    * As this is a rather common use case, the [EntityLinkingEngine]() and [KeywordLinkingEngine]() are by default configured to include information about Entities within the enhancement results. Users of those EnhancementEngines therefore do not need to dereference Entities, as that information is already available within the enhancement results.
+    * If a 'fise:EntityAnnotation' has the 'entityhub:site' property Entities can be dereferenced by using the Stanbol Entityhub (see the section for [fise:EntityAnnotation](#fiseentityannotation) for details)
+    * In all other cases the URI of the suggested entity needs to be used for dereferencing. If the referenced Entity is part of the [Linked Data](http://linkeddata.org/) cloud this is often possible by following the [CoolURI](http://www.w3.org/TR/cooluris/) conventions - basically sending a "GET {entity-uri}" request with the header "Accept: application/rdf+json".
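
The occurrence lookup for marked-up content and the three suggestion cases described in the diff above can be sketched in Python. This is a minimal sketch rather than a Stanbol API: the function names `locate_occurrence` and `classify_suggestions` and the thresholds `high=0.8` and `low=0.3` are assumptions chosen for illustration. A single low-confidence suggestion is not covered by the text; the sketch treats it as needing user input.

```python
from typing import List, Optional, Tuple


def locate_occurrence(content: str, context: str, selected: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of `selected` within `content`, or None.

    The value of fise:selection-context is searched first and the value of
    fise:selected-text is then searched inside it, so a word that occurs
    several times resolves to the occurrence that was actually annotated.
    """
    ctx_start = content.find(context)
    if ctx_start < 0:
        return None  # context not found, e.g. interrupted by markup
    offset = context.find(selected)
    if offset < 0:
        return None
    start = ctx_start + offset
    return start, start + len(selected)


def classify_suggestions(confidences: List[float],
                         high: float = 0.8,   # assumed threshold
                         low: float = 0.3) -> str:  # assumed threshold
    """Map the fise:confidence values of all suggestions to a UI case."""
    if not confidences:
        return "none"
    ranked = sorted(confidences, reverse=True)
    # Distinct: the best suggestion is confident and all others are weak.
    if ranked[0] >= high and all(c < low for c in ranked[1:]):
        return "distinct"
    # Everything else is left to the user to decide.
    return "ambiguous"


content = "<p>Paris is nice. I met Paris Hilton in Paris.</p>"
span = locate_occurrence(content, "I met Paris Hilton in", "Paris Hilton")
if span is not None:
    print(content[span[0]:span[1]])
print(classify_suggestions([0.9, 0.1]))
```

Note that `locate_occurrence` returns `None` when markup interrupts the selection context, so a real UI would likely need to skip or strip tags before searching.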

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png?rev=1344200&view=auto
==============================================================================
Binary file - no diff available.

Propchange: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream