Posted to user@uima.apache.org by Richard Eckart de Castilho <re...@apache.org> on 2021/01/15 10:00:32 UTC

Heads up on work on SelectFS (recap)

Hey folks, happy new year,

I am back to working on the SelectFS API in UIMAv3 (see e.g. [1] and other PRs/issues).

"Short" recap: 

UIMA so far does not provide a nice set of methods to check
which spatial relationship holds between two annotations
(e.g. overlapping, colocated, covering, etc.), so I set
out to implement such methods.

Given a concrete example, it is in many cases clear whether
two annotations overlap or cover each other. For annotations
whose begin and end are the same (zero-width annotations),
however, this is not always clear. So a lot of work went into
working out a matrix that specifies the relationships, with
examples showing which relationship holds when [2].
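
To make the zero-width problem concrete, here is a minimal sketch.
The class and method names are hypothetical and are not part of UIMA,
and the two definitions are textbook-style assumptions, not the ones
from the matrix in [2]. It only shows how two individually reasonable
definitions can disagree once begin equals end.

```java
// Hypothetical illustration, not UIMA code. Spans are given as
// plain begin/end offsets with begin <= end.
final class NaiveSpanChecks {

    // Assumed definition: strict overlap, the spans share at least
    // one character position.
    static boolean overlapsStrict(int aBegin, int aEnd, int bBegin, int bEnd) {
        return aBegin < bEnd && bBegin < aEnd;
    }

    // Assumed definition: A lies within the bounds of B, with the
    // boundaries included.
    static boolean coveredByInclusive(int aBegin, int aEnd, int bBegin, int bEnd) {
        return bBegin <= aBegin && aEnd <= bEnd;
    }

    public static void main(String[] args) {
        // Ordinary case: [5,15] and [10,20] clearly overlap.
        System.out.println(overlapsStrict(5, 15, 10, 20));      // true

        // Zero-width A:[10,10] at the start of B:[10,20]:
        // covered by B under the inclusive definition ...
        System.out.println(coveredByInclusive(10, 10, 10, 20)); // true
        // ... yet not overlapping B under the strict definition.
        System.out.println(overlapsStrict(10, 10, 10, 20));     // false
    }
}
```

Under these assumed definitions, a zero-width annotation can be
covered by another annotation without overlapping it. Cases like this
need an explicit decision, and a relationship matrix is one place to
record that decision.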

Then many unit tests were implemented to ensure that the
behavior of the SelectFS API (which offers selectors such as
following, covering, coveredBy, etc.) is consistent with
the matrix. This uncovered a number of inconsistencies and bugs,
many of which have since been addressed.

While working on all this, I found that the already complex code
of SelectFS and Subiterator became more and more complex and hard
to understand. So I started searching for ideas on how to reduce
the complexity.

To reduce the complexity, I came up with the idea of expressing
all the different cases (covering, covered by, following,
preceding, etc.) in terms of one case, namely covered-by.

So I have redefined all the different spatial relationships
that annotations can have in terms of covered-by [3]. This yielded yet another
update to the relationship definition matrix, which resolved a particularly annoying
edge case. A result of this update is that two annotations A:[10,10] and B:[10,10]
are simultaneously left and right of each other. I'm afraid I cannot find
a concise explanation for why having this oddity is reasonable.
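
As an illustration of how such a reduction could look, here is a
minimal sketch. It is an assumption made for this example and not
necessarily the definition given in [3]. The class and all method
names are hypothetical. Every relation is written as a covered-by
check against a derived span, and offsets are assumed to be
non-negative.

```java
// Hypothetical illustration, not UIMA code. Every relation below
// is expressed through coveredBy alone.
final class CoveredByReduction {

    // The single primitive: X lies within the bounds of Y, with the
    // boundaries included.
    static boolean coveredBy(int xBegin, int xEnd, int yBegin, int yEnd) {
        return yBegin <= xBegin && xEnd <= yEnd;
    }

    // X covers Y if Y is covered by X.
    static boolean covering(int xBegin, int xEnd, int yBegin, int yEnd) {
        return coveredBy(yBegin, yEnd, xBegin, xEnd);
    }

    // X and Y are colocated if each is covered by the other.
    static boolean colocated(int xBegin, int xEnd, int yBegin, int yEnd) {
        return coveredBy(xBegin, xEnd, yBegin, yEnd)
                && coveredBy(yBegin, yEnd, xBegin, xEnd);
    }

    // Assumption: X is left of Y if X is covered by the span from
    // offset 0 up to the begin of Y.
    static boolean leftOf(int xBegin, int xEnd, int yBegin, int yEnd) {
        return coveredBy(xBegin, xEnd, 0, yBegin);
    }

    // Assumption: X is right of Y if X is covered by the span from
    // the end of Y up to the largest possible offset.
    static boolean rightOf(int xBegin, int xEnd, int yBegin, int yEnd) {
        return coveredBy(xBegin, xEnd, yEnd, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // Ordinary case: [0,5] is left of [10,20] and not right of it.
        System.out.println(leftOf(0, 5, 10, 20));    // true
        System.out.println(rightOf(0, 5, 10, 20));   // false

        // Zero-width case from the text: A:[10,10] and B:[10,10].
        System.out.println(leftOf(10, 10, 10, 10));  // true
        System.out.println(rightOf(10, 10, 10, 10)); // true
    }
}
```

Under these assumed definitions, the same oddity shows up: the
zero-width span [10,10] fits both into [0,10] and into
[10,Integer.MAX_VALUE], so A ends up both left and right of B.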

For the moment, I am implementing additional unit tests to check the behavior
of SelectFS and to ensure we have a comprehensive test suite which might later
facilitate a refactoring/simplification of the SelectFS/Subiterator code.

Cheers,

-- Richard

[1] https://github.com/apache/uima-uimaj/pull/85
[2] https://github.com/apache/uima-uimaj/blob/b1170818ecd8b02f5619cd25c060d25b2c971ed1/uima-docbook-v3-users-guide/src/docbook/images/version_3_users_guide/annotation_predicates/annotation-relations-table.png
[3] https://github.com/apache/uima-uimaj/blob/b1170818ecd8b02f5619cd25c060d25b2c971ed1/uima-docbook-v3-users-guide/src/docbook/images/version_3_users_guide/annotation_predicates/annotation-relations-definition.png