Posted to commits@uima.apache.org by sc...@apache.org on 2015/11/01 16:17:42 UTC
svn commit: r1711807 - /uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java
Author: schor
Date: Sun Nov 1 15:17:42 2015
New Revision: 1711807
URL: http://svn.apache.org/viewvc?rev=1711807&view=rev
Log:
[UIMA-4681] javadoc clarification for subiterators
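The subiterator rules that the revised Javadoc in this commit spells out (start just after annot, stop once an annotation begins beyond annot's end, optionally skip per strict and ambiguous) can be illustrated with a small standalone model. This is only a sketch and does not use the UIMA API: the Ann class, indexOrder and subiterator below are hypothetical helpers, and the index order (begin ascending, end descending, then type priority) is an assumption taken from the AnnotationIndex class description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SubiteratorSketch {

    /** Hypothetical stand-in for an annotation: a type name plus begin/end offsets. */
    static final class Ann {
        final String type;
        final int begin;
        final int end;

        Ann(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + "]";
        }
    }

    /** Assumed index order: begin ascending, end descending, then type priority. */
    static Comparator<Ann> indexOrder(List<String> typePriorities) {
        return (x, y) -> {
            if (x.begin != y.begin) return Integer.compare(x.begin, y.begin);
            if (x.end != y.end) return Integer.compare(y.end, x.end); // longer span first
            return Integer.compare(typePriorities.indexOf(x.type), typePriorities.indexOf(y.type));
        };
    }

    /** Collects what a forward pass of the described subiterator would visit. */
    static List<Ann> subiterator(List<Ann> index, Ann annot, boolean ambiguous, boolean strict,
                                 Comparator<Ann> order) {
        List<Ann> sorted = new ArrayList<>(index);
        sorted.sort(order);
        List<Ann> result = new ArrayList<>();
        int lastEnd = Integer.MIN_VALUE; // end of the previously returned annotation
        for (Ann b : sorted) {
            if (order.compare(annot, b) >= 0) continue;     // before or equal to annot: not at start yet
            if (b.begin > annot.end) break;                 // begins beyond annot's end: stop
            if (strict && b.end > annot.end) continue;      // strict: skip annotations ending past annot
            if (!ambiguous && b.begin < lastEnd) continue;  // unambiguous: skip overlap with previous
            result.add(b);
            lastEnd = b.end;
        }
        return result;
    }

    public static void main(String[] args) {
        Comparator<Ann> order = indexOrder(Arrays.asList("Paragraph", "Sentence", "Token"));
        Ann p = new Ann("Paragraph", 0, 20);
        Ann s = new Ann("Sentence", 0, 20);
        List<Ann> index = Arrays.asList(p, s, new Ann("Token", 0, 5), new Ann("Token", 3, 8),
                new Ann("Token", 18, 25), new Ann("Token", 21, 30));
        System.out.println(subiterator(index, p, true, true, order));  // [Sentence[0,20], Token[0,5], Token[3,8]]
        System.out.println(subiterator(index, s, true, true, order));  // [Token[0,5], Token[3,8]]
        System.out.println(subiterator(index, s, false, true, order)); // [Token[0,5]]
    }
}
```

With Paragraph ahead of Sentence in the assumed priorities, the run from P includes S while the run from S never sees P, which matches the Sentence/Paragraph example in the Javadoc.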
Modified:
uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java
Modified: uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java?rev=1711807&r1=1711806&r2=1711807&view=diff
==============================================================================
--- uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java (original)
+++ uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/text/AnnotationIndex.java Sun Nov 1 15:17:42 2015
@@ -79,13 +79,45 @@ public interface AnnotationIndex<T exten
/**
* Return a subiterator whose bounds are defined by the input annotation.
+ *
* <p>
- * The subiterator will return annotations <code>b</code> s.t. <code>annot < b</code>,
- * <code>annot.getBegin() <= b.getBegin()</code> and
- * <code>annot.getEnd() >= b.getEnd()</code>. For annotations x, y, <code>x < y</code>
+ * The <code>annot</code> is used for 3 purposes:</p>
+ * <ul><li>It is used to compute the position in the index where the iteration starts.</li>
+ * <li>It is used to compute the end point where the iterator stops when moving forward.</li>
+ * <li>It is used to specify which annotations will be skipped while iterating.</li>
+ * </ul>
+ *
+ * <p>The starting position is computed by first finding a position
+ * whose annotation compares equal with the <code>annot</code> (this might be one of several), and then
+ * advancing until reaching a position where the annotation there is not equal to the
+ * <code>annot</code>.
+ * If no item in the index is equal (meaning it has the same begin, the same end, and is the same type
+ * as the <code>annot</code>)
+ * then the iterator is positioned to the first annotation
+ * which is greater than the <code>annot</code>, or
+ * if there are no annotations greater than the <code>annot</code>, the iterator is marked invalid.
+ * </p>
+ * <p>The iterator will stop (become invalid) when
+ * <ul><li>it runs out of items in the index going forward or backwards, or</li>
+ * <li>while moving forward, it reaches a point where the annotation at that position has a
+ * start that is beyond the <code>annot's</code> end position, or</li>
+ * <li>while moving backwards, it reaches a position in front of its original starting position.</li>
+ * </ul>
+ * <p>While iterating, it operates like a <code>strict</code> iterator;
+ * annotations whose end positions are > the end position of <code>annot</code> are skipped.
+ * </p>
+ *
+ * <p>This is equivalent to returning annotations <code>b</code> such that</p>
+ * <ul><li><code>annot < b</code>, and</li>
+ * <li><code>annot.getEnd() >= b.getBegin()</code>, skipping <code>b's</code>
+ * whose end position is > annot.getEnd().</li>
+ * </ul>
+ *
+ * <p>For annotations x, y, <code>x < y</code>
* here is to be interpreted as "x comes before y in the index", according to the rules defined in
* the description of {@link AnnotationIndex this class}.
* </p>
+ *
* <p>
* This definition implies that annotations <code>b</code> that have the same span as
* <code>annot</code> may or may not be returned by the subiterator. This is determined by the
@@ -95,14 +127,16 @@ public interface AnnotationIndex<T exten
* <code>b</code> are of the same type, then the behavior is undefined.
* </p>
* <p>
- * For example, if you an annotation <code>s</code> of type <code>Sentence</code> and an
- * annotation <code>p</code> of type <code>Paragraph</code> that have the same span, and you
+ * <p>
+ * For example, if you have an annotation <code>S</code> of type <code>Sentence</code> and an
+ * annotation <code>P</code> of type <code>Paragraph</code> that have the same span, and you
* have defined <code>Paragraph</code> before <code>Sentence</code> in your type priorities,
- * then <code>subiterator(p)</code> will give you an iterator that will return <code>s</code>,
- * but <code>subiterator(s)</code> will give you an iterator that will NOT return <code>p</code>.
+ * then <code>subiterator(P)</code> will give you an iterator that will return <code>S</code>,
+ * but <code>subiterator(S)</code> will give you an iterator that will NOT return <code>P</code>.
* The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the
* type priorities.
* </p>
+ *
* <p>
* Calling <code>subiterator(a)</code> is equivalent to calling
* <code>subiterator(a, true, true)</code>. See
@@ -116,19 +150,52 @@ public interface AnnotationIndex<T exten
FSIterator<T> subiterator(AnnotationFS annot);
/**
- * Return a subiterator whose bounds are defined by the input annotation.
+ * Return a subiterator whose bounds are defined by the <code>annot</code>.
* <p>
- * A <code>strict</code> subiterator is defined as follows: it will return annotations
- * <code>b</code> s.t. <code>annot < b</code>,
- * <code>annot.getBegin() <= b.getBegin()</code> and
- * <code>annot.getEnd() >= b.getEnd()</code>. For annotations x,y, <code>x < y</code>
+ * The <code>annot</code> is used in 2 or 3 ways.</p>
+ * <ul><li>It specifies the left-most position in the index where the iteration starts.</li>
+ * <li>It specifies an end point where the iterator stops.</li>
+ * <li>If <code>strict</code> is specified, the end point also specifies which annotations
+ * will be skipped while iterating.</li>
+ * </ul>
+ * <p>The starting position is computed by first finding the position
+ * whose annotation compares equal with the <code>annot</code>, and then
+ * advancing until reaching a position where the annotation there is not equal to the
+ * <code>annot</code>.
+ * If no item in the index is equal (meaning it has the same begin, the same end, and is the same type
+ * as the <code>annot</code>)
+ * then the iterator is positioned to the first annotation
+ * which is greater than the <code>annot</code>, or
+ * if there are no annotations greater than the <code>annot</code>, the iterator is marked invalid.
+ * </p>
+ * <p>The iterator will stop (become invalid) when
+ * <ul><li>it runs out of items in the index going forward or backwards, or</li>
+ * <li>while moving forward, it reaches a point where the annotation at that position has a
+ * start that is beyond the <code>annot's</code> end position, or</li>
+ * <li>while moving backwards, it reaches a position in front of its original starting position</li>
+ * </ul>
+ * </p>
+ * <p>Ignoring <code>strict</code> and <code>ambiguous</code> for a moment,
+ * this is equivalent to returning annotations <code>b</code> such that</p>
+ * <ul><li><code>annot < b</code> using the standard annotation comparator, and</li>
+ * <li><code>annot.getEnd() >= b.getBegin()</code>, and also bounded by the index itself.</li>
+ * </ul>
+ * <p>
+ * A <code>strict</code> subiterator skips annotations where
+ * <code>annot.getEnd() < b.getEnd()</code>.
+ * </p>
+ * <p>
+ * An <code>ambiguous = false</code> specification produces an unambiguous iterator, which
+ * computes a subset of the annotations, going forward, such that annotations whose <code>begin</code>
+ * is contained within the previously returned annotation's span are skipped.
+ * </p>
+ * <p>For annotations x,y, <code>x < y</code>
* here is to be interpreted as "x comes before y in the index", according to the rules defined in
* the description of {@link AnnotationIndex this class}.
* <p>
- * If <code>strict</code> is set to <code>false</code>, the boundary conditions are relaxed
- * as follows: return annotations <code>b</code> s.t. <code>annot < b</code> and
- * <code>annot.getBegin() <= b.getBegin() <= annot.getEnd()</code>. The resulting
- * iterator may also be disambiguated.
+ * If <code>strict = true</code> then annotations whose end is > <code>annot.getEnd()</code>
+ * are skipped.
+ * </p>
* <p>
* These definitions imply that annotations <code>b</code> that have the same span as
* <code>annot</code> may or may not be returned by the subiterator. This is determined by the
@@ -138,15 +205,15 @@ public interface AnnotationIndex<T exten
* <code>b</code> are of the same type, then the behavior is undefined.
* </p>
* <p>
- * For example, if you an annotation <code>s</code> of type <code>Sentence</code> and an
- * annotation <code>p</code> of type <code>Paragraph</code> that have the same span, and you
+ * For example, if you have an annotation <code>S</code> of type <code>Sentence</code> and an
+ * annotation <code>P</code> of type <code>Paragraph</code> that have the same span, and you
* have defined <code>Paragraph</code> before <code>Sentence</code> in your type priorities,
- * then <code>subiterator(p)</code> will give you an iterator that will return <code>s</code>,
- * but <code>subiterator(s)</code> will give you an iterator that will NOT return <code>p</code>.
+ * then <code>subiterator(P)</code> will give you an iterator that will return <code>S</code>,
+ * but <code>subiterator(S)</code> will give you an iterator that will NOT return <code>P</code>.
* The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the
* type priorities.
* </p>
- *
+ *
* @param annot
* Annotation setting boundary conditions for subiterator.
* @param ambiguous