Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2017/01/06 21:16:42 UTC

Question on some internals of Ruta engine operation

I think I'm getting a handle on what might be the issue with another test
difference running with UIMA v3 and Ruta, but would like to confirm something
about how Ruta operates.


I see the RutaStream checkAnchor code "splitting" a RutaBasic annotation into
two.  It carefully first removes the original from the index, changes its "end",
and creates a new annotation, and adds both of these back to the index.
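
As a rough illustration of that remove / modify / re-add sequence, here is a minimal sketch in plain Java. It does not use the UIMA or Ruta APIs: Span, SplitSketch, and the TreeSet "index" are hypothetical stand-ins, and the sort order (begin ascending, end descending) is an assumption made for the example. It shows why the original has to leave the sorted index before its "end" changes: the end offset is part of the sort key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Toy stand-in for an annotation; NOT a UIMA or Ruta class. */
final class Span {
    final int begin;
    int end; // mutable, and part of the sort key used below

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    @Override
    public String toString() {
        return "[" + begin + "," + end + ")";
    }
}

final class SplitSketch {
    // Assumed ordering for this sketch: begin ascending, then end descending.
    static final Comparator<Span> ORDER =
        Comparator.comparingInt((Span s) -> s.begin)
                  .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    /** Splits 'original' at 'anchor' and returns the newly created tail span. */
    static Span split(TreeSet<Span> index, Span original, int anchor) {
        index.remove(original);               // 1. remove while the key is still valid
        int oldEnd = original.end;
        original.end = anchor;                // 2. change "end" only while un-indexed
        Span tail = new Span(anchor, oldEnd); // 3. create a span for the remainder
        index.add(original);                  // 4. add both back to the index
        index.add(tail);
        return tail;
    }

    /** Builds a two-span index, splits the first span, and lists the result. */
    static List<String> demo() {
        TreeSet<Span> index = new TreeSet<>(ORDER);
        Span first = new Span(0, 10);
        index.add(first);
        index.add(new Span(10, 15));
        split(index, first, 4);
        List<String> out = new ArrayList<>();
        for (Span s : index) {
            out.add(s.toString());
        }
        return out;
    }
}
```

Calling SplitSketch.demo() splits the span [0,10) at offset 4 and returns the index contents in sorted order.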

It does this while the stream has iterator(s), e.g. the currentIt field. 

Normally, this is not allowed (updating an index while iterating over it), but
UIMA v2 made an exception: this was allowed if the first operation on the
iterator was a moveTo first/last/some-specific-FS.  These moveTo operations
"reset" the iterator state to a known position, using the then-current values of
the indexes.

In version 3, we added a copy-on-write style for iterators to avoid throwing
ConcurrentModificationExceptions, and that changed this behavior (I need to fix
that).  It needs to be altered so that iterators that do the special moveTo
operations that formerly "reset" the state acquire the new current state of the
index if a copy-on-write has occurred, so they can "see" the changed index.
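
The intended behavior can be sketched with a toy copy-on-write index in plain Java. This is not the UIMA v3 implementation: CowIndex, CowIterator, and CowDemo are hypothetical names, and only the method names moveToFirst/isValid/get/moveToNext are borrowed from the iterator vocabulary used above. An iterator keeps reading its own snapshot after the index is modified, so no ConcurrentModificationException arises, and a later moveToFirst picks up the then-current state of the index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy sorted index with copy-on-write; a stand-in, NOT UIMA code. */
final class CowIndex {
    private List<Integer> items = new ArrayList<>();
    private boolean shared = false; // true once an iterator holds 'items'

    void add(int value) {
        if (shared) {
            // Copy on write: existing iterators keep reading the old list.
            items = new ArrayList<>(items);
            shared = false;
        }
        items.add(value);
        Collections.sort(items);
    }

    /** Hands out the current list and marks it as shared with an iterator. */
    List<Integer> acquire() {
        shared = true;
        return items;
    }

    CowIterator iterator() {
        return new CowIterator(this);
    }
}

final class CowIterator {
    private final CowIndex index;
    private List<Integer> view;
    private int pos;

    CowIterator(CowIndex index) {
        this.index = index;
        moveToFirst();
    }

    /** Repositioning "resets" the iterator onto the then-current index state. */
    void moveToFirst() {
        view = index.acquire();
        pos = 0;
    }

    boolean isValid() {
        return pos < view.size();
    }

    int get() {
        return view.get(pos);
    }

    void moveToNext() {
        pos++;
    }
}

final class CowDemo {
    /** Collects everything from the iterator's current position onward. */
    static List<Integer> drain(CowIterator it) {
        List<Integer> out = new ArrayList<>();
        for (; it.isValid(); it.moveToNext()) {
            out.add(it.get());
        }
        return out;
    }
}
```

In this sketch an iterator created before an add() does not see the new element until moveToFirst is called on it again; a moveToLast or moveTo(fs) variant would re-acquire the list in the same way.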

Before I embark on this fix, I'd feel better if I could get some confirmation
that Ruta is operating in this manner (at least for this test case), i.e.:

1) adding Annotations to indexes
2) getting iterator(s) over those in RutaStream
3) removing and adding Annotations to the indexes while holding on to these
iterators
4) avoiding any ConcurrentModificationExceptions by always doing one of the 3
repositioning iterator operations: moveTo First/Last/a-Feature_structure, before
doing any other operation on the iterator.

Thanks. -Marshall


Re: Question on some internals of Ruta engine operation

Posted by Marshall Schor <ms...@schor.com>.
ruta-core-ext tests ran without error.

The other Ruta projects are Eclipse plugins and some Maven things.

Now going to work on reviewing the changes and checking them into svn...

-M


On 1/10/2017 4:41 PM, Marshall Schor wrote:
> hi, just a progress report.
>
> Running Ruta (actually, only the ruta-core test cases) with UIMA v3 exposed some
> bugs in UIMA v3, undetected due to lack of test case coverage for many varieties
> of inserts/removes into sorted indexes, and many varieties of subiterator use.
> So testing with Ruta was a valuable exercise!
>
> The ruta-core tests now appear to work. :-)
>
> Now starting to look at other ruta projects (e.g. ruta-core-ext, etc.) that have
> tests.
>
> -Marshall
>
>
>


Re: Question on some internals of Ruta engine operation

Posted by Peter Klügl <pe...@averbis.com>.
:-D

On 10.01.2017 at 22:41, Marshall Schor wrote:
> hi, just a progress report.
>
> Running Ruta (actually, only the ruta-core test cases) with UIMA v3 exposed some
> bugs in UIMA v3, undetected due to lack of test case coverage for many varieties
> of inserts/removes into sorted indexes, and many varieties of subiterator use.
> So testing with Ruta was a valuable exercise!
>
> The ruta-core tests now appear to work. :-)
>
> Now starting to look at other ruta projects (e.g. ruta-core-ext, etc.) that have
> tests.
>
> -Marshall
>
>


Re: Question on some internals of Ruta engine operation

Posted by Marshall Schor <ms...@schor.com>.
hi, just a progress report.

Running Ruta (actually, only the ruta-core test cases) with UIMA v3 exposed some
bugs in UIMA v3, undetected due to lack of test case coverage for many varieties
of inserts/removes into sorted indexes, and many varieties of subiterator use.
So testing with Ruta was a valuable exercise!

The ruta-core tests now appear to work. :-)

Now starting to look at other ruta projects (e.g. ruta-core-ext, etc.) that have
tests.

-Marshall



Re: Question on some internals of Ruta engine operation

Posted by Marshall Schor <ms...@schor.com>.
This test showed up a bug in an edge case of the sorted index algorithm. 
Investigating now. -M


On 1/6/2017 4:45 PM, Marshall Schor wrote:
> some investigation shows this was already implemented in v3. More investigation
> needed.
>
> -M


Re: Question on some internals of Ruta engine operation

Posted by Marshall Schor <ms...@schor.com>.
some investigation shows this was already implemented in v3. More investigation
needed.

-M




Re: Question on some internals of Ruta engine operation

Posted by Marshall Schor <ms...@schor.com>.
Thanks!

Re: no need to adapt UIMA v3 ... I can simply adapt/fix Ruta

That's nice to know!  I still think I want to adapt v3 to be more backwards compatible where it doesn't really "cost" anything, so that other users, with their highly varied coding styles, have an easier migration path.

So, I'm thankful we have significant users of UIMA (like Ruta, uima-as, uimaFIT) to test with :-)

-Marshall

On 1/9/2017 3:41 AM, Peter Klügl wrote:
> Hi,
>
>
> On 06.01.2017 at 22:16, Marshall Schor wrote:
>> ...
>>
>> Before I embark on this fix, I'd feel better if I could get some confirmation
>> that Ruta is operating in this manner (at least for this test case) (i.e.,
>>
>> 1) adding Annotations to indexes
>> 2) getting iterator(s) over those in RutaStream
>> 3) removing and adding Annotations to the indexes while holding on to these
>> iterators
>> 4) avoiding any ConcurrentModificationExceptions by always doing one of the 3
>> repositioning iterator operations: moveTo First/Last/a-Feature_structure, before
>> doing any other operation on the iterator.
>>
> 1) - 3) yes
>
> 4) Not on purpose, but accidentally yes. It could happen, but I have not
> found a use case where it directly accesses the other methods.
>
> Let me mention that the design of RutaStream and the approach with
> RutaBasic annotations originate in the implementation of TextMarker before
> it was ported to UIMA. I incrementally changed the implementation, added
> new functionality, and refactored a bit over the years, but overall I did
> not change it completely because the implementations of
> conditions/actions depended on it. The iterators are still there,
> but it is no longer necessary that RutaStream holds them. The complete
> RutaStream/RutaBasic stuff is on my TODO list for 2017: creating a
> stable/clean interface for RutaStream with different implementations,
> e.g., not requiring RutaBasic.
>
> There is no need to adapt UIMA v3 to allow stuff that should be
> supported. I can simply adapt/fix Ruta.
>
>
> Peter
>
>


Re: Question on some internals of Ruta engine operation

Posted by Peter Klügl <pe...@averbis.com>.
Hi,


On 06.01.2017 at 22:16, Marshall Schor wrote:
> ...
>
> Before I embark on this fix, I'd feel better if I could get some confirmation
> that Ruta is operating in this manner (at least for this test case) (i.e.,
>
> 1) adding Annotations to indexes
> 2) getting iterator(s) over those in RutaStream
> 3) removing and adding Annotations to the indexes while holding on to these
> iterators
> 4) avoiding any ConcurrentModificationExceptions by always doing one of the 3
> repositioning iterator operations: moveTo First/Last/a-Feature_structure, before
> doing any other operation on the iterator.
>

1) - 3) yes

4) Not on purpose, but accidentally yes. It could happen, but I have not
found a use case where it directly accesses the other methods.

Let me mention that the design of RutaStream and the approach with
RutaBasic annotations originate in the implementation of TextMarker before
it was ported to UIMA. I incrementally changed the implementation, added
new functionality, and refactored a bit over the years, but overall I did
not change it completely because the implementations of
conditions/actions depended on it. The iterators are still there,
but it is no longer necessary that RutaStream holds them. The complete
RutaStream/RutaBasic stuff is on my TODO list for 2017: creating a
stable/clean interface for RutaStream with different implementations,
e.g., not requiring RutaBasic.

There is no need to adapt UIMA v3 to allow stuff that should be
supported. I can simply adapt/fix Ruta.


Peter