Posted to commits@uima.apache.org by sc...@apache.org on 2015/11/01 16:17:42 UTC

svn commit: r1711807 - /uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java

Author: schor
Date: Sun Nov  1 15:17:42 2015
New Revision: 1711807

URL: http://svn.apache.org/viewvc?rev=1711807&view=rev
Log:
[UIMA-4681] javadoc clarification for subiterators

Modified:
    uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java

Modified: uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java?rev=1711807&r1=1711806&r2=1711807&view=diff
==============================================================================
--- uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java (original)
+++ uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java Sun Nov  1 15:17:42 2015
@@ -79,13 +79,45 @@ public interface AnnotationIndex<T exten
 
   /**
    * Return a subiterator whose bounds are defined by the input annotation.
+   * 
    * <p>
-   * The subiterator will return annotations <code>b</code> s.t. <code>annot &lt; b</code>,
-   * <code>annot.getBegin() &lt;= b.getBegin()</code> and
-   * <code>annot.getEnd() &gt;= b.getEnd()</code>. For annotations x, y, <code>x &lt; y</code>
+   * The <code>annot</code> is used for 3 purposes:</p>
+   * <ul><li>It is used to compute the position in the index where the iteration starts.</li>
+   * <li>It is used to compute the end point where the iterator stops when moving forward.</li>
+   * <li>It is used to specify which annotations will be skipped while iterating.</li>
+   * </ul>
+   * 
+   * <p>The starting position is computed by first finding a position 
+   * whose annotation compares equal with the <code>annot</code> (this might be one of several), and then
+   * advancing until reaching a position where the annotation there is not equal to the 
+   * <code>annot</code>.
+   * If no item in the index is equal (meaning it has the same begin, the same end, and is the same type
+   * as the <code>annot</code>) 
+   * then the iterator is positioned to the first annotation 
+   * which is greater than the <code>annot</code>, or
+   * if there are no annotations greater than the <code>annot</code>, the iterator is marked invalid.
+   * </p>
+   * <p>The iterator will stop (become invalid) when
+   * <ul><li>it runs out of items in the index going forward or backwards, or</li>
+   * <li>while moving forward, it reaches a point where the annotation at that position has a 
+   * start that is beyond the <code>annot's</code> end position, or</li>
+   * <li>while moving backwards, it reaches a position in front of its original starting position.</li>
+   * </ul>
+   * <p>While iterating, it operates like a <code>strict</code> iterator; 
+   * annotations whose end positions are &gt; the end position of <code>annot</code> are skipped.
+   * </p>
+   * 
+   * <p>This is equivalent to returning annotations <code>b</code> such that</p>
+   * <ul><li><code>annot &lt; b</code>, and</li>
+   * <li><code>annot.getEnd() &gt;= b.getBegin()</code>, skipping <code>b's</code>
+   * whose end position is &gt; annot.getEnd().</li>
+   * </ul>
+   * 
+   * <p>For annotations x, y, <code>x &lt; y</code>
    * here is to be interpreted as "x comes before y in the index", according to the rules defined in
    * the description of {@link AnnotationIndex this class}.
    * </p>
+   * 
    * <p>
    * This definition implies that annotations <code>b</code> that have the same span as
    * <code>annot</code> may or may not be returned by the subiterator. This is determined by the
@@ -95,14 +127,16 @@ public interface AnnotationIndex<T exten
    * <code>b</code> are of the same type, then the behavior is undefined.
    * </p>
    * <p>
-   * For example, if you an annotation <code>s</code> of type <code>Sentence</code> and an
-   * annotation <code>p</code> of type <code>Paragraph</code> that have the same span, and you
+   * <p>
+   * For example, if you have an annotation <code>S</code> of type <code>Sentence</code> and an
+   * annotation <code>P</code> of type <code>Paragraph</code> that have the same span, and you
    * have defined <code>Paragraph</code> before <code>Sentence</code> in your type priorities,
-   * then <code>subiterator(p)</code> will give you an iterator that will return <code>s</code>,
-   * but <code>subiterator(s)</code> will give you an iterator that will NOT return <code>p</code>.
+   * then <code>subiterator(P)</code> will give you an iterator that will return <code>S</code>,
+   * but <code>subiterator(S)</code> will give you an iterator that will NOT return <code>P</code>.
    * The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the
    * type priorities.
    * </p>
+   * 
    * <p>
    * Calling <code>subiterator(a)</code> is equivalent to calling
    * <code>subiterator(a, true, true).</code>. See
@@ -116,19 +150,52 @@ public interface AnnotationIndex<T exten
   FSIterator<T> subiterator(AnnotationFS annot);
 
   /**
-   * Return a subiterator whose bounds are defined by the input annotation.
+   * Return a subiterator whose bounds are defined by the <code>annot</code>.
    * <p>
-   * A <code>strict</code> subiterator is defined as follows: it will return annotations
-   * <code>b</code> s.t. <code>annot &lt; b</code>,
-   * <code>annot.getBegin() &lt;= b.getBegin()</code> and
-   * <code>annot.getEnd() &gt;= b.getEnd()</code>. For annotations x,y, <code>x &lt; y</code>
+   * The <code>annot</code> is used in 2 or 3 ways.</p>
+   * <ul><li>It specifies the left-most position in the index where the iteration starts.</li>
+   * <li>It specifies an end point where the iterator stops.</li>
+   * <li>If <code>strict</code> is specified, the end point also specifies which annotations 
+   * will be skipped while iterating.</li>
+   * </ul>
+   * <p>The starting position is computed by first finding the position 
+   * whose annotation compares equal with the <code>annot</code>, and then
+   * advancing until reaching a position where the annotation there is not equal to the 
+   * <code>annot</code>.
+   * If no item in the index is equal (meaning it has the same begin, the same end, and is the same type
+   * as the <code>annot</code>) 
+   * then the iterator is positioned to the first annotation 
+   * which is greater than the <code>annot</code>, or
+   * if there are no annotations greater than the <code>annot</code>, the iterator is marked invalid.
+   * </p>
+   * <p>The iterator will stop (become invalid) when
+   * <ul><li>it runs out of items in the index going forward or backwards, or</li>
+   * <li>while moving forward, it reaches a point where the annotation at that position has a 
+   * start that is beyond the <code>annot's</code> end position, or</li>
+   * <li>while moving backwards, it reaches a position in front of its original starting position</li>
+   * </ul>
+   * </p>
+   * <p>Ignoring <code>strict</code> and <code>ambiguous</code> for a moment, 
+   * this is equivalent to returning annotations <code>b</code> such that</p>
+   * <ul><li><code>annot &lt; b</code> using the standard annotation comparator, and</li>
+   * <li><code>annot.getEnd() &gt;= b.getBegin()</code>, and also bounded by the index itself.</li>
+   * </ul>
+   * <p>
+   * A <code>strict</code> subiterator skips annotations where 
+   * <code>annot.getEnd() &lt; b.getEnd()</code>.
+   * </p>
+   * <p>
+   * An <code>ambiguous = false</code> specification produces an unambiguous iterator, which
+   * computes a subset of the annotations, going forward, such that annotations whose <code>begin</code>
+   * is contained within the previously returned annotation's span are skipped.
+   * </p>
+   * <p>For annotations x,y, <code>x &lt; y</code>
    * here is to be interpreted as "x comes before y in the index", according to the rules defined in
    * the description of {@link AnnotationIndex this class}.
    * <p>
-   * If <code>strict</code> is set to <code>false</code>, the boundary conditions are relaxed
-   * as follows: return annotations <code>b</code> s.t. <code>annot &lt; b</code> and
-   * <code>annot.getBegin() &lt;= b.getBegin() &lt;= annot.getEnd()</code>. The resulting
-   * iterator may also be disambiguated.
+   * If <code>strict = true</code> then annotations whose end is &gt; <code>annot.getEnd()</code>
+   * are skipped.
+   * </p> 
    * <p>
    * These definitions imply that annotations <code>b</code> that have the same span as
    * <code>annot</code> may or may not be returned by the subiterator. This is determined by the
@@ -138,15 +205,15 @@ public interface AnnotationIndex<T exten
    * <code>b</code> are of the same type, then the behavior is undefined.
    * </p>
    * <p>
-   * For example, if you an annotation <code>s</code> of type <code>Sentence</code> and an
-   * annotation <code>p</code> of type <code>Paragraph</code> that have the same span, and you
+   * For example, if you have an annotation <code>S</code> of type <code>Sentence</code> and an
+   * annotation <code>P</code> of type <code>Paragraph</code> that have the same span, and you
    * have defined <code>Paragraph</code> before <code>Sentence</code> in your type priorities,
-   * then <code>subiterator(p)</code> will give you an iterator that will return <code>s</code>,
-   * but <code>subiterator(s)</code> will give you an iterator that will NOT return <code>p</code>.
+   * then <code>subiterator(P)</code> will give you an iterator that will return <code>S</code>,
+   * but <code>subiterator(S)</code> will give you an iterator that will NOT return <code>P</code>.
    * The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the
    * type priorities.
    * </p>
-   * 
+    * 
    * @param annot
    *          Annotation setting boundary conditions for subiterator.
    * @param ambiguous