Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2015/05/05 17:26:12 UTC

Some details about the recent changes to the UIMA core iterators/indexes

Hi,

The latest delivery of changes into the core includes several refactorings and
cleanups in the indexes and iterators, mainly to reduce code duplication (which
in some cases had resulted in fixes being applied to some but not all code
"paths").

This delivery also includes the flattened indexes (which are built automatically
when it is detected that an index is probably not being updated, but is being
iterated over frequently). At the termination of the JVM, there is (for now) a
shutdown hook which writes a few statistics about the flattened indexes to
standard out.  An example:

Time to flatten was 12,543,446 microseconds
Flatten tuning, threshold: 50, creations: 402,309 uses: 5583888, discards: 2063

The threshold is a constant: the minimum number of iterator reorderings done
while iterating over a type and its subtypes before a flattened version is
created.  The creations are the number of times a flattened index was created;
the discards are the number of times an existing flattened index was
discarded because a) the available heap became too small, or b) an index update
occurred.  The uses are the number of times an iterator was created that made
use of the flattened index.
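
To make the bookkeeping behind these four numbers concrete, here is a minimal
sketch in Java. It is not the actual UIMA code: the class and method names are
made up for illustration, and it assumes that reorderings are counted
cumulatively and reset whenever the flattened form is invalidated.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: hypothetical names, not the actual UIMA implementation.
class FlattenSketch {
    static final int THRESHOLD = 50;            // same value as in the sample output

    final AtomicLong creations = new AtomicLong();
    final AtomicLong uses = new AtomicLong();
    final AtomicLong discards = new AtomicLong();

    private int reorderings;                    // assumed: counted since the last reset
    private boolean flattened;

    // Record one iterator reordering; flatten once the threshold is reached.
    void recordReordering() {
        if (flattened) {
            return;
        }
        reorderings++;
        if (reorderings >= THRESHOLD) {
            flattened = true;
            creations.incrementAndGet();
        }
    }

    // Called whenever an iterator is created; true if it used the flattened form.
    boolean createIterator() {
        if (flattened) {
            uses.incrementAndGet();
            return true;
        }
        return false;
    }

    // Called on an index update or when heap space runs low.
    void invalidate() {
        if (flattened) {
            flattened = false;
            discards.incrementAndGet();
        }
        reorderings = 0;
    }
}
```

In this sketch a discard also resets the reordering count, so a later run of
read-only iteration has to reach the threshold again before a new flattened
form is created.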

I'll put this output under the control of a JVM property such as
-Duima.flattenedIndex.statistics, or something similar.
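
As a rough idea of what such a switch could look like, here is a small Java
sketch that registers the shutdown hook only when the property is present. The
property name is just the tentative one mentioned above, and the class, the
method names, and the exact output format are assumptions for illustration, not
the committed implementation.

```java
import java.util.Locale;

// Illustrative only: the property name and all method names are assumptions.
class FlattenStatsReport {
    static final String PROP = "uima.flattenedIndex.statistics";

    // A bare -Duima.flattenedIndex.statistics sets the property to "",
    // so test for presence rather than for the value "true".
    static boolean enabled() {
        return System.getProperty(PROP) != null;
    }

    // Formats one summary line with grouped digits for the counters.
    static String tuningLine(int threshold, long creations, long uses, long discards) {
        return String.format(Locale.US,
            "Flatten tuning, threshold: %d, creations: %,d, uses: %,d, discards: %,d",
            threshold, creations, uses, discards);
    }

    // Registers the shutdown hook only when the property is present;
    // returns whether a hook was registered.
    static boolean installIfEnabled(Runnable report) {
        if (!enabled()) {
            return false;
        }
        Runtime.getRuntime().addShutdownHook(new Thread(report));
        return true;
    }
}
```

A caller would pass a Runnable that prints tuningLine(...) with the current
counter values; without the -D flag nothing is registered and nothing is
printed at JVM exit.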

I'd be curious to hear from users what, if any, performance change you might
observe; this could be small if the annotator analytics use most of the time,
but could be significant for very fast annotators.

-Marshall