Posted to dev@uima.apache.org by Hannes Korte <ha...@iais.fraunhofer.de> on 2010/04/07 13:57:30 UTC
Possibly a bug in subiterator
Hi,
I noticed a strange behavior of the annotation index subiterator in
uimaj 2.2.2 and 2.3.0.
Consider the sentence: 'Testing the UIMA-Framework'
with tokens: 'Testing' 'the' 'UIMA-Framework'
and the named entity: 'UIMA'
The type priorities list NamedEntity above Token.
If I call the Token subiterator for the NamedEntity 'UIMA' with
strict=false, I get an empty result. According to the docs, with
strict=false a Token b counts as contained in the NamedEntity annot if:
annot.getBegin() <= b.getBegin() <= annot.getEnd()
This holds for 'UIMA' and 'UIMA-Framework', but the subiterator is
empty.
If I change the NamedEntity to ' UIMA' (including the preceding space),
then it works correctly, and the Token 'UIMA-Framework' is contained in
the subiterator.
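To make the expectation concrete, here is a minimal sketch in plain
Java that applies the documented strict=false condition to the example
offsets. It does not call the UIMA API; the class SubiteratorSketch,
the Span helper and containedTokens are hypothetical names used only
for illustration, and the offsets assume 0-based character positions
in 'Testing the UIMA-Framework'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubiteratorSketch {

    // Hypothetical stand-in for an annotation: covered text plus offsets.
    static final class Span {
        final String text;
        final int begin;
        final int end;

        Span(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }
    }

    // Tokens of 'Testing the UIMA-Framework' (assumed 0-based offsets).
    static final List<Span> TOKENS = Arrays.asList(
            new Span("Testing", 0, 7),
            new Span("the", 8, 11),
            new Span("UIMA-Framework", 12, 26));

    // Applies only the documented non-strict condition:
    // annot.getBegin() <= b.getBegin() <= annot.getEnd()
    static List<String> containedTokens(int annotBegin, int annotEnd) {
        List<String> result = new ArrayList<>();
        for (Span b : TOKENS) {
            if (annotBegin <= b.begin && b.begin <= annotEnd) {
                result.add(b.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // NamedEntity 'UIMA' covers [12, 16)
        System.out.println(containedTokens(12, 16)); // prints [UIMA-Framework]
        // NamedEntity ' UIMA' covers [11, 16)
        System.out.println(containedTokens(11, 16)); // prints [UIMA-Framework]
    }
}
```

Under the documented condition alone, both variants of the NamedEntity
should yield the Token 'UIMA-Framework', which is why the empty result
in the first case looks like a bug.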
I attached a simple Java class with all needed files to demonstrate the
problem. Any ideas?
Best regards,
Hannes
Re: Possibly a bug in subiterator
Posted by Thilo Goetz <tw...@gmx.de>.
Hi,
thanks for reporting this. Please open a JIRA issue and
attach the files, and I'll take a look (just paste the text
from your email as the issue description). Thanks.
--Thilo