Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/06/15 09:11:14 UTC

svn commit: r821830 - in /websites/staging/stanbol/trunk/content: ./ stanbol/docs/trunk/enhancer/enhancementstructure.html stanbol/docs/trunk/enhancer/es_entitydisambiguation.png

Author: buildbot
Date: Fri Jun 15 07:11:13 2012
New Revision: 821830

Log:
Staging update by buildbot for stanbol

Added:
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png   (with props)
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Jun 15 07:11:13 2012
@@ -1 +1 @@
-1350094
+1350479

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html Fri Jun 15 07:11:13 2012
@@ -242,7 +242,38 @@
 <p>The usage of 'fise:TopicAnnotation' is similar to EntityAnnotation. Both use the exact same properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type', 'fise:confidence', 'entityhub:site'). The only difference is that one needs to iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically clients will want to use the exact same code to process {entity-annotation} and {topic-annotation} instances.</p>
 <p>In the next section "Entity Disambiguation" an improved version of Entity Tagging is described that allows users to: (1) accept/decline a spotted Entity and then (2) select one of several suggested Entities.</p>
 <h2 id="entity-disambiguation">Entity Disambiguation</h2>
-<p>TODO: Work in progress</p>
+<p>Entity Disambiguation is required if an entity detected in the analyzed text can refer to different Entities. The following figure shows an example where "Bob Marley" is detected as a person in the text; however, there are two possible matches within the controlled vocabulary.</p>
+<p><img alt="Entity Disambiguation" src="es_entitydisambiguation.png" title="&quot;Bob Marley&quot; as spotted in the text may refer to two different persons in DBpedia.org" /></p>
+<p>The fact that one Entity detected in the text - represented by a 'fise:TextAnnotation' - may have multiple suggested Entities - represented by the two 'fise:EntityAnnotation's - has a negative impact on <a href="#entity-tagging">Entity Tagging</a> interfaces that suggest tags based on 'fise:EntityAnnotation's. This is because such an interface would show two suggestions in the above case: (1) for <a href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2) for <a href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">'dbpedia:Bob_Marley_(comedian)'</a>. So even if the user wants to tag this content with "Bob Marley", they will need to reject at least one of the two suggestions.</p>
+<p>Adding explicit support for Entity Disambiguation to an Entity Tagging user interface can solve this problem by grouping suggested entities by the 'fise:TextAnnotation's they are suggested for.</p>
+<h3 id="grouping-suggested-entities">Grouping suggested Entities</h3>
+<p>The goal of the Entity Tagging UI with disambiguation support is to show only a single tag suggestion for all Entities suggested for the same section in the analyzed text. To achieve this we need to follow the link between 'fise:EntityAnnotation' and 'fise:TextAnnotation'.</p>
+<p>There are several ways to achieve this. The option presented here starts with iterating over 'fise:EntityAnnotation's, because it assumes that one wants to improve an existing <a href="#entity-tagging">Entity Tagging</a> interface.</p>
+<ol>
+<li>Iterate over all 'fise:EntityAnnotation' instances. This refers to all resources matching "{entity-annotation} rdf:type fise:EntityAnnotation". <ul>
+<li>For more information on how to collect information for extracted Entities see the <a href="#process-suggested-entities">corresponding section</a> in the simple <a href="#entity-tagging">Entity Tagging</a> interface.</li>
+</ul>
+</li>
+<li>Retrieve the 'fise:TextAnnotation' referenced by each processed 'fise:EntityAnnotation'. To do so, retrieve the value(s) of the 'dc:relation' property.</li>
+<li>While iterating over the 'fise:EntityAnnotation's establish a mapping 'fise:TextAnnotation' -&gt; 'fise:EntityAnnotation', 'fise:EntityAnnotation', ...<ul>
+<li>The list of 'fise:EntityAnnotation's for each 'fise:TextAnnotation' needs to be sorted by the value of the 'fise:confidence' property of the EntityAnnotation, in descending order, so that the EntityAnnotation with the highest confidence is first in the list. 'fise:confidence' values are in the range 0..1 where higher numbers represent a higher certainty.</li>
+</ul>
+</li>
+<li>Suggest tags based on 'fise:TextAnnotation's - keys in the mapping created in step (3).<ul>
+<li>Allow users to easily accept the Entity with the highest rank - <a href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> in the above example - especially if the confidence of the first suggestion is high (e.g. &gt;= 0.8) and considerably higher than the confidence values of the other options.</li>
+<li>Provide users with the possibility to inspect further suggested options - to disambiguate between different options.</li>
+</ul>
+</li>
+</ol>
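+<p>The four steps above can be sketched in a few lines of client code. The following Python example is only an illustration: it assumes the enhancement results have already been parsed into plain dictionaries, and the URNs, the confidence values, the function names and the 'margin' parameter are hypothetical and not part of the enhancement structure.</p>

```python
from collections import defaultdict

# Hypothetical, already-parsed enhancement results: one dict per
# fise:EntityAnnotation. The URNs and confidence values are invented for
# illustration; only the property names come from the text above.
ENTITY_ANNOTATIONS = [
    {
        "id": "urn:example:entity-annotation-2",
        "dc:relation": ["urn:example:text-annotation-1"],
        "fise:entity-reference": "http://dbpedia.org/resource/Bob_Marley_%28comedian%29",
        "fise:confidence": 0.41,
    },
    {
        "id": "urn:example:entity-annotation-1",
        "dc:relation": ["urn:example:text-annotation-1"],
        "fise:entity-reference": "http://dbpedia.org/resource/Bob_Marley",
        "fise:confidence": 0.92,
    },
]


def group_by_text_annotation(entity_annotations):
    """Steps 1-3: map each fise:TextAnnotation to its suggestions, best first."""
    groups = defaultdict(list)
    for annotation in entity_annotations:
        # dc:relation may hold several values, so register the suggestion
        # under every referenced fise:TextAnnotation.
        for text_annotation in annotation["dc:relation"]:
            groups[text_annotation].append(annotation)
    for suggestions in groups.values():
        # Highest fise:confidence first.
        suggestions.sort(key=lambda a: a["fise:confidence"], reverse=True)
    return dict(groups)


def is_clear_winner(suggestions, threshold=0.8, margin=0.2):
    """Step 4 heuristic: the top suggestion is confident and well ahead.

    The 0.8 threshold follows the example in the text; the margin is an
    assumed value chosen for this sketch.
    """
    best = suggestions[0]["fise:confidence"]
    if best < threshold:
        return False
    return all(best - other["fise:confidence"] >= margin for other in suggestions[1:])


groups = group_by_text_annotation(ENTITY_ANNOTATIONS)
for text_annotation, suggestions in groups.items():
    top = suggestions[0]
    print(text_annotation, "->", top["fise:entity-reference"], is_clear_winner(suggestions))
```

+<p>A user interface could offer the first entry of each list as the default tag when the heuristic returns true, and otherwise show the whole list for manual disambiguation.</p>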
+<h3 id="showing-the-extraction-context">Showing the extraction Context</h3>
+<p>To allow users to more easily disambiguate between the suggested Entities it is important to provide them with information about the extraction context of the suggested entities. This is of special importance if the content is not completely visible to the user (e.g. because it is too long to fit on the screen or the content is of a type that cannot be rendered within the browser).</p>
+<p>Assuming the suggested Entities are grouped by 'fise:TextAnnotation' - as explained in the above section - one can use the information provided by the TextAnnotation to visualize the context and thereby help the user with the disambiguation task.</p>
+<p>The following information of the TextAnnotation can be used for this task:</p>
+<ul>
+<li>'fise:selection-context': This is the text surrounding the extracted Entity. The exact size of this context depends on the configuration and the EnhancementEngine, but typically it is the current sentence or about 50 characters before and after the selection.</li>
+<li>'fise:selected-text': This is the text representing the extracted Entity - the section of the text the Entity was suggested for. The 'fise:selected-text' MUST be contained within the 'fise:selection-context', so user interfaces that want to highlight the selected part of the context can search the selection context for the selected text. In case of multiple matches it is typically sufficient to highlight all occurrences.</li>
+<li>'fise:start' and 'fise:end': These values could also be used to determine the location. However, because those offsets are relative to the start of the content, it is typically easier to use the occurrences of the selected text within the selection context.</li>
+</ul>
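+<p>Highlighting the selected text inside the selection context could look roughly like the following Python sketch. The function name, the 'tag' parameter and the sample sentence are hypothetical; the sketch simply marks every occurrence, as suggested above, and escapes the text so it can be embedded in HTML.</p>

```python
import html


def highlight_selection(selection_context, selected_text, tag="em"):
    """Wrap every occurrence of selected_text in selection_context in <tag>.

    Both arguments are the plain literal values of 'fise:selection-context'
    and 'fise:selected-text'. All text is HTML-escaped before it is returned.
    """
    if not selected_text:
        # str.split("") raises ValueError, so return the escaped context as is.
        return html.escape(selection_context)
    marked = "<{0}>{1}</{0}>".format(tag, html.escape(selected_text))
    # Split on the literal selected text and re-join with the marked-up version.
    parts = selection_context.split(selected_text)
    return marked.join(html.escape(part) for part in parts)


print(highlight_selection("The singer Bob Marley was born in Jamaica.", "Bob Marley"))
```

+<p>Because the match is a plain substring search, no offset arithmetic with 'fise:start' and 'fise:end' is needed.</p>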
 <h2 id="occurrence-based-annotation">Occurrence based Annotation</h2>
 <p>This describes a user interface similar to that of a spell/grammar checker. But instead of marking misspelled words, entities recognized within the text are suggested to the user. The following figure shows such an interface as implemented by <a href="http://hallojs.org">hallo.js</a> combined with the <a href="https://github.com/szabyg/annotate.js">annotate.js</a> plugin (see the <a href="http://hallojs.org/annotate.html">Demo here</a> <small>(last accessed 2012-05-30)</small> - click in the text and press the "annotate" button).</p>
 <p><img alt="Occurrence based Annotation UI" src="hallo-annotate_scrrenshot.png" title="hallo.js with the annotate.js plugin used to implement a text occurrence based annotation UI" /></p>

Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
------------------------------------------------------------------------------
    svn:mime-type = image/png