Posted to user@uima.apache.org by Timo Boehme <ti...@ontochem.com> on 2009/10/07 11:17:50 UTC
iterator positioning on same region annotations
Hi,
in my scenario there may be multiple annotations of the same type for the
same region. Before I add an annotation, I would like to check whether such
an annotation already exists.

To accomplish this I use FSIndex.iterator( newAnnotation ) to get an
iterator that starts at the position of my new (but not yet added)
annotation. According to the method description, the iterator should be
positioned so that all previous annotations compare as less than newAnnotation.

However, sometimes when I call moveToPrevious() directly after creating the
iterator, get() returns an annotation of the same type with the same
region as newAnnotation, which in my opinion is not less.

Thus I would like to know whether annotations of the same type for the same
region trigger some 'unspecified' behavior, whether my understanding of the
iterator is wrong, or whether I stumbled upon a bug.
Kind regards
Timo
--
Timo Boehme
OntoChem GmbH
H.-Damerow-Str. 4
06120 Halle/Saale
T: +49 345 4780472
F: +49 345 4780471
timo.boehme@ontochem.com
_____________________________________________________________________
OntoChem GmbH
Geschäftsführer: Dr. Lutz Weber
Sitz: Halle / Saale
Registergericht: Stendal
Registernummer: HRB 215461
_____________________________________________________________________
Re: iterator positioning on same region annotations
Posted by Timo Boehme <ti...@ontochem.com>.
Thanks Thilo and Matthias for the explanations which confirmed my
assumptions.
Thilo Goetz wrote:
> Just to clarify: the moveTo() method tries to move the
> iterator to a position such that the annotation at the
> position is equal to the one you're looking for. If
> there is more than one such annotation, it is undefined
> which one the iterator will point to.
>
> Is this consistent with the API docs? I would say it
> isn't. We say "move the iterator to the first FS...".
> Intuitively, that should mean to the "leftmost" FS, but
> in fact it is implemented to mean the first one that
> the algorithm finds, which will generally _not_ be the
> leftmost one.
I think the API docs for FSIndex.iterator(FeatureStructure fs) are even
more inconsistent with the implementation:
"...The position of the iterator will be set such that the feature
structure returned by a call to the iterator's get() method is greater
than or equal to fs, and any previous FS is less than FS..."
Thus here it really means the "leftmost" FS, which, as I pointed out, is
not what the iterator actually returns.
I will open a bug report on this.
--Timo
Re: iterator positioning on same region annotations
Posted by Timo Boehme <ti...@ontochem.com>.
Thilo Goetz wrote:
> As a workaround, you can moveToPrevious() until you hit
> an annotation that is not equal to the one you're looking
> for.
I tried this workaround but found it to be very slow (moveToPrevious()
seems to be a costly operation). Since I didn't want to add dummy types
and type priorities as Matthias suggested, I found a simple and fast
workaround:
- create a dummy annotation D of the same type as the annotation A to be
  tested for
- set D.begin=A.begin and D.end=A.end+1
  (this ensures that D sorts before all annotations we are interested in)
- call moveTo(D); in case it really found an annotation with D's range,
  use next() until the iterator reaches the desired range (cheap operation)
Now the iterator points to the first of the annotations with boundaries
identical to A.
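
The steps above can be sketched on a plain sorted list. This is only a model, not UIMA code: the class DummyKeyModel, the Span type and the method firstEqual are hypothetical names, Collections.binarySearch stands in for moveTo(), and the order (begin ascending, then end descending) is assumed to match the built-in annotation index.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of the dummy-annotation workaround. Not the UIMA API. */
final class DummyKeyModel {

    /** Hypothetical stand-in for an annotation: only begin and end offsets. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Assumed index order: begin ascending, then end descending. */
    static final Comparator<Span> ORDER = (a, b) ->
        a.begin != b.begin ? Integer.compare(a.begin, b.begin)
                           : Integer.compare(b.end, a.end);

    /** Returns the position of the leftmost span equal to a, or -1 if none exists. */
    static int firstEqual(List<Span> sortedIndex, Span a) {
        // D has the same begin and an end one larger, so it sorts directly before a.
        Span d = new Span(a.begin, a.end + 1);
        int pos = Collections.binarySearch(sortedIndex, d, ORDER);
        if (pos < 0) {
            // Nothing equals D: the insertion point is the first span greater than D.
            pos = -pos - 1;
        } else {
            // Some span equals D: step forward past every span with D's range.
            while (pos < sortedIndex.size()
                    && ORDER.compare(sortedIndex.get(pos), d) == 0) {
                pos++;
            }
        }
        boolean found = pos < sortedIndex.size()
                && ORDER.compare(sortedIndex.get(pos), a) == 0;
        return found ? pos : -1;
    }

    public static void main(String[] args) {
        List<Span> index = List.of(
            new Span(0, 5),
            new Span(3, 8), new Span(3, 8),
            new Span(3, 7), new Span(3, 7), new Span(3, 7),
            new Span(9, 12));
        System.out.println(firstEqual(index, new Span(3, 7)));
        System.out.println(firstEqual(index, new Span(3, 6)));
    }
}
```

Because no span with integer offsets can sort strictly between D and A, the first position after D's range is either the leftmost span equal to A or proof that no such span exists.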
Timo
Re: iterator positioning on same region annotations
Posted by Thilo Goetz <tw...@gmx.de>.
Just to clarify: the moveTo() method tries to move the
iterator to a position such that the annotation at the
position is equal to the one you're looking for. If
there is more than one such annotation, it is undefined
which one the iterator will point to.
Is this consistent with the API docs? I would say it
isn't. We say "move the iterator to the first FS...".
Intuitively, that should mean to the "leftmost" FS, but
in fact it is implemented to mean the first one that
the algorithm finds, which will generally _not_ be the
leftmost one.
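
To illustrate "the first one that the algorithm finds", here is a plain-Java model, not the UIMA implementation. It assumes the lookup behaves like a binary search over a sorted list; MoveToModel, Span, moveTo and leftmost are hypothetical names. Collections.binarySearch likewise gives no guarantee about which of several equal elements it returns.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of a sorted index. Hypothetical names; not the UIMA API. */
final class MoveToModel {

    /** Stand-in for an annotation that only carries its begin and end offsets. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Assumed index order: begin ascending, then end descending. */
    static final Comparator<Span> ORDER = (a, b) ->
        a.begin != b.begin ? Integer.compare(a.begin, b.begin)
                           : Integer.compare(b.end, a.end);

    /** Models moveTo(): position of some span equal to key, or -1 if there is none. */
    static int moveTo(List<Span> sortedIndex, Span key) {
        int pos = Collections.binarySearch(sortedIndex, key, ORDER);
        return pos >= 0 ? pos : -1;
    }

    /** Linear scan for the leftmost span equal to key, or -1 if there is none. */
    static int leftmost(List<Span> sortedIndex, Span key) {
        for (int i = 0; i < sortedIndex.size(); i++) {
            if (ORDER.compare(sortedIndex.get(i), key) == 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Span> index = List.of(
            new Span(0, 5),
            new Span(3, 7), new Span(3, 7), new Span(3, 7),
            new Span(3, 7), new Span(3, 7),
            new Span(9, 12));
        Span key = new Span(3, 7);
        System.out.println("binary search landed on " + moveTo(index, key));
        System.out.println("leftmost equal span is at " + leftmost(index, key));
    }
}
```

With five equal spans in the middle of the list, the search may stop on any of them, so the two printed positions need not agree.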
As a workaround, you can moveToPrevious() until you hit
an annotation that is not equal to the one you're looking
for.
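
A sketch of this step-back loop on a plain sorted list, again only a model with hypothetical names (StepBackModel, Span, rewind) rather than UIMA calls. Decrementing pos plays the role of moveToPrevious(), and the comparator assumes begin ascending, then end descending.

```java
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of stepping back to the leftmost equal entry. Not the UIMA API. */
final class StepBackModel {

    /** Hypothetical stand-in for an annotation: only begin and end offsets. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Assumed index order: begin ascending, then end descending. */
    static final Comparator<Span> ORDER = (a, b) ->
        a.begin != b.begin ? Integer.compare(a.begin, b.begin)
                           : Integer.compare(b.end, a.end);

    /**
     * Starting at pos (which must hold a span equal to key), moves left
     * while the predecessor still compares equal to key.
     */
    static int rewind(List<Span> sortedIndex, int pos, Span key) {
        while (pos > 0 && ORDER.compare(sortedIndex.get(pos - 1), key) == 0) {
            pos--;
        }
        return pos;
    }

    public static void main(String[] args) {
        List<Span> index = List.of(
            new Span(0, 5),
            new Span(3, 7), new Span(3, 7), new Span(3, 7),
            new Span(9, 12));
        // Suppose the lookup landed on position 3, the last of the equal spans.
        System.out.println(rewind(index, 3, new Span(3, 7)));
    }
}
```

In this model the loop takes one step per equal entry to the left of the starting position, so its cost grows with the number of duplicates.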
I think we should fix this; you're not the first one to
complain. Please open a Jira issue, and we can look
into it after the upcoming release.
--Thilo
Re: iterator positioning on same region annotations
Posted by Matthias Wendt <ma...@neofonie.de>.
Hi Timo,
the order relation of the feature structures is defined by the index
definition. Have a look at the index definition of the (built-in)
annotation index. You can see it if you open any annotator descriptor
using the component editor in Eclipse. This helped me a lot in
understanding the behaviour of the iterators.
In short, if two annotations of the same type have exactly the
same boundaries, the behaviour is indeed unspecified. However, you can
avoid this indeterminism by adding a second type and assigning it a
higher priority. If you don't otherwise need a second type, you can use
it as a helper, shifting an instance across the CAS as needed ;) - at
least, I don't know of any more elegant method.
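
A plain-Java sketch of why a higher-priority helper type removes the ambiguity. It is a model, not UIMA code. It assumes the annotation index orders by begin ascending, then end descending, then type priority, with types earlier in the priority list sorting first. TypePriorityModel, Ann, the type names "Helper" and "Token", and positionAfterHelper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of an index order with type priorities. Not the UIMA API. */
final class TypePriorityModel {

    /** Hypothetical stand-in for an annotation: a type name plus offsets. */
    static final class Ann {
        final String type;
        final int begin;
        final int end;
        Ann(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Hypothetical priority list: types earlier in the list sort first. */
    static final List<String> PRIORITY = List.of("Helper", "Token");

    /** Assumed order: begin ascending, end descending, then type priority. */
    static final Comparator<Ann> ORDER = (a, b) -> {
        if (a.begin != b.begin) return Integer.compare(a.begin, b.begin);
        if (a.end != b.end) return Integer.compare(b.end, a.end);
        return Integer.compare(PRIORITY.indexOf(a.type), PRIORITY.indexOf(b.type));
    };

    /**
     * Adds the single helper to a copy of the annotations, sorts, and returns
     * the position right after the helper in that sorted copy.
     */
    static int positionAfterHelper(List<Ann> annotations, Ann helper) {
        List<Ann> index = new ArrayList<>(annotations);
        index.add(helper);
        index.sort(ORDER);
        // The helper is the only entry of its type, so the search result is unique.
        return Collections.binarySearch(index, helper, ORDER) + 1;
    }

    public static void main(String[] args) {
        List<Ann> tokens = List.of(
            new Ann("Token", 0, 5),
            new Ann("Token", 3, 7), new Ann("Token", 3, 7), new Ann("Token", 3, 7),
            new Ann("Token", 9, 12));
        Ann helper = new Ann("Helper", 3, 7);
        // Two same-type, same-span annotations compare as equal.
        System.out.println(ORDER.compare(tokens.get(1), tokens.get(2)));
        // The helper sorts strictly before them, so the next entry is the leftmost Token.
        System.out.println(positionAfterHelper(tokens, helper));
    }
}
```

The helper works as a unique landmark: it compares unequal to every other entry, so locating it is deterministic, and the entry that follows it is the first annotation with the same boundaries.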
-- Hope this helps
Matthias