Posted to user@uima.apache.org by Philip Ogren <ph...@ogren.info> on 2007/06/21 17:04:23 UTC
ClassCastException thrown when using subiterator and moveTo()
I am having difficulty with using the FSIterator returned by the
AnnotationIndex.subiterator(AnnotationFS) method.
The following is a code fragment:
AnnotationIndex annotationIndex = jCas.getAnnotationIndex(tokenType);
FSIterator tokenIterator = annotationIndex.subiterator(sentenceAnnotation);
tokenIterator.moveTo(tokenAnnotation);
Here is the relevant portion of the stack trace:
java.lang.ClassCastException: edu.colorado.cslr.dessert.types.Token
at java.util.Collections.indexedBinarySearch(Unknown Source)
at java.util.Collections.binarySearch(Unknown Source)
at org.apache.uima.cas.impl.Subiterator.moveTo(Subiterator.java:224)
If I change the second line to the following, then I do not have any
problems with an exception being thrown.
FSIterator tokenIterator = annotationIndex.iterator();
Is this a bug or some misunderstanding on my part of how subiterator
should work?
Thanks,
Philip
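The trace above ends in Collections.binarySearch, which throws a ClassCastException when it is called without a Comparator on elements that do not implement Comparable. Whether that is exactly what Subiterator.moveTo did is an assumption. The plain-Java sketch below only reproduces the same kind of failure, with a hypothetical Span class standing in for an annotation type and a hypothetical ordering (begin ascending, then end descending):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class BinarySearchCastDemo {

    // Hypothetical stand-in for an annotation type; note it is NOT Comparable.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumed ordering: begin ascending, then end descending.
    static final Comparator<Span> SPAN_ORDER = (a, b) -> {
        if (a.begin != b.begin) {
            return Integer.compare(a.begin, b.begin);
        }
        return Integer.compare(b.end, a.end);
    };

    // Three spans, already sorted according to SPAN_ORDER.
    static List<Span> sampleSpans() {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(0, 4));
        spans.add(new Span(5, 7));
        spans.add(new Span(8, 12));
        return spans;
    }

    // Without a Comparator, binarySearch casts each element to Comparable.
    // The raw List lets this compile; the cast fails at run time.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean searchWithoutComparatorThrows() {
        List raw = sampleSpans();
        Object key = new Span(5, 7);
        try {
            Collections.binarySearch(raw, key);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // With an explicit Comparator no cast to Comparable is needed.
    static int searchWithComparator(int begin, int end) {
        return Collections.binarySearch(sampleSpans(), new Span(begin, end), SPAN_ORDER);
    }

    public static void main(String[] args) {
        System.out.println("throws without comparator: " + searchWithoutComparatorThrows());
        System.out.println("index with comparator: " + searchWithComparator(5, 7));
    }
}
```

Passing the comparator explicitly is the usual way to search a sorted list of objects that have no natural ordering.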
Unit testing with Groovy
Posted by Philip Ogren <ph...@ogren.info>.
Great!
I have found that unit testing is a great way to get started with Groovy.
I borrowed a few lines of Java code for obtaining a JCas from an earlier
post (from Marshall, I think) and used them to create a little util class
with a single method:
static JCas process(String descriptorFileName, String textFileName)
{
    // Treat the second argument as a file name if such a file exists,
    // otherwise use it directly as the document text.
    File textFile = new File(textFileName)
    String text
    if (textFile.exists())
        text = textFile.text
    else
        text = textFileName

    // Build the analysis engine from its XML descriptor.
    XMLInputSource xmlInput = new XMLInputSource(new File(descriptorFileName))
    ResourceSpecifier specifier = UIMAFramework.getXMLParser().parseResourceSpecifier(xmlInput)
    AnalysisEngine analysisEngine = UIMAFramework.produceAnalysisEngine(specifier)

    // Run the engine over the text and hand back the populated JCas.
    JCas jCas = analysisEngine.newJCas()
    jCas.setDocumentText(text)
    analysisEngine.process(jCas)
    return jCas
}
With this I can make a unit test very easily:
void testTokenizer()
{
    println "running token tests ..."
    JCas jCas = TestsUtil.process(tokenDescriptorFile, "data/test/docs/sampletext.txt")
    FSIndex tokenIndex = jCas.getAnnotationIndex(Token.type)
    assert tokenIndex.size() == 132
    ...
}
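The process helper above accepts either a file name or the document text itself. For anyone who wants the same fallback in plain Java, a minimal sketch follows. The names TextOrFile and resolveText are hypothetical, UTF-8 is an assumed encoding (Groovy's File.text uses the platform default), and the sketch checks for a regular file rather than mere existence:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TextOrFile {

    // Returns the file's contents if the argument names a regular file,
    // otherwise returns the argument itself as the document text.
    static String resolveText(String textOrFileName) {
        Path candidate;
        try {
            candidate = Paths.get(textOrFileName);
        } catch (InvalidPathException e) {
            // Not even a syntactically valid path, so it must be literal text.
            return textOrFileName;
        }
        if (!Files.isRegularFile(candidate)) {
            return textOrFileName;
        }
        try {
            // Assumption: test documents are UTF-8 encoded.
            return new String(Files.readAllBytes(candidate), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveText("This is not a file name, just text."));
    }
}
```

This keeps short inline test strings and larger sample documents behind one entry point, which is what makes the one-line call in the test above possible.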
I have also started exploring writing UIMA components with Groovy. I
recently put together a file collection reader that uses Groovy's Ant
builder, which allows one to define includes and excludes for a FileSet
in a descriptor file the same way you would in a build.xml file. I
haven't made it available yet because it needs a README and it is also
dog slow on large corpora. However, it is <100 lines of code and
provides nifty functionality for smaller data sets. My point is simply
to give an example of what is possible. When I understand the CAS
better, it might be cool to think about creating a Groovy builder that
would be a more dynamic solution than JCas, but that is just a dream at
the moment given my time and (lack of) expertise with these two
technologies.
Re: ClassCastException thrown when using subiterator and moveTo()
Posted by Philip Ogren <ph...@ogren.info>.
By the way, the workaround I posted is *much* slower than the fixed
subiterator method, both for creating the iterator and for iterating
through it: more than an order of magnitude (roughly 15x) slower.
Thanks for the fix!
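One plausible contributor to that gap (an assumption, not something checked against the UIMA sources): a filtered iterator has to test every annotation in the index against the constraint, while an iterator that can seek on a sorted index only touches the annotations near the window. A plain-Java sketch of the two strategies, using hypothetical {begin, end} int pairs rather than UIMA types:

```java
import java.util.ArrayList;
import java.util.List;

public class WindowScanSketch {

    // Builds `count` non-overlapping spans {begin, end}, sorted by begin.
    static List<int[]> tokens(int count) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add(new int[] {i * 5, i * 5 + 4});
        }
        return out;
    }

    // Filter style: every span in the index is tested against the window.
    static List<int[]> filterAll(List<int[]> sorted, int winBegin, int winEnd) {
        List<int[]> hits = new ArrayList<>();
        for (int[] span : sorted) {
            if (span[0] >= winBegin && span[1] <= winEnd) {
                hits.add(span);
            }
        }
        return hits;
    }

    // Seek style: binary-search for the first span with begin >= winBegin,
    // then walk forward only until a span begins past the window end.
    static List<int[]> seekThenScan(List<int[]> sorted, int winBegin, int winEnd) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid)[0] < winBegin) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        List<int[]> hits = new ArrayList<>();
        for (int i = lo; i < sorted.size() && sorted.get(i)[0] <= winEnd; i++) {
            if (sorted.get(i)[1] <= winEnd) {
                hits.add(sorted.get(i));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<int[]> index = tokens(1000);
        System.out.println("filter hits: " + filterAll(index, 50, 99).size());
        System.out.println("seek hits:   " + seekThenScan(index, 50, 99).size());
    }
}
```

Both methods return the same spans; the difference is that the first inspects all 1000 entries while the second inspects only a handful after the binary search.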
Re: ClassCastException thrown when using subiterator and moveTo()
Posted by Thilo Goetz <tw...@gmx.de>.
Fixed for 2.2, thanks for reporting and providing the test case.
This Groovy stuff looks pretty cool, I'll have to check it out
some more...
--Thilo
Re: ClassCastException thrown when using subiterator and moveTo()
Posted by Thilo Goetz <tw...@gmx.de>.
The easiest place to create a test case is on Jira. Open a Jira
bug for UIMA, and you can attach the test case. When you do that,
please check the box that says something like "OK to include in
Apache code" (so we can check it in and use it as a regression test).
Groovy, hm. Never used it before. If it doesn't take me more than
5 min to set up in Eclipse, and I can still debug, not a problem ;-)
--Thilo
Re: ClassCastException thrown when using subiterator and moveTo()
Posted by Philip Ogren <ph...@ogren.info>.
Yes, I will throw one together. Would you mind if I used Groovy for
this, or is that going to be annoying? Let me know. Also, from an
earlier email I saw on this list, it seems that attachments are a
problem. Is there some place where I could directly upload a test case?
In the meantime, here is a workaround I just put together (I'm still
unit testing the code that uses this, so I'm not certain it is bug-free):
public static FSIterator getWindowIterator(JCas jCas, Annotation windowAnnotation, Type type)
{
    ConstraintFactory constraintFactory = jCas.getConstraintFactory();

    // Constraint 1: annotation.begin >= window.begin
    FeaturePath beginFeaturePath = jCas.createFeaturePath();
    beginFeaturePath.addFeature(type.getFeatureByBaseName("begin"));
    FSIntConstraint intConstraint = constraintFactory.createIntConstraint();
    intConstraint.geq(windowAnnotation.getBegin());
    FSMatchConstraint beginConstraint =
        constraintFactory.embedConstraint(beginFeaturePath, intConstraint);

    // Constraint 2: annotation.end <= window.end
    FeaturePath endFeaturePath = jCas.createFeaturePath();
    endFeaturePath.addFeature(type.getFeatureByBaseName("end"));
    intConstraint = constraintFactory.createIntConstraint();
    intConstraint.leq(windowAnnotation.getEnd());
    FSMatchConstraint endConstraint =
        constraintFactory.embedConstraint(endFeaturePath, intConstraint);

    // Filter the full index of the requested type with both constraints.
    FSMatchConstraint windowConstraint = constraintFactory.and(beginConstraint, endConstraint);
    FSIndex windowIndex = jCas.getAnnotationIndex(type);
    return jCas.createFilteredIterator(windowIndex.iterator(), windowConstraint);
}
Re: ClassCastException thrown when using subiterator and moveTo()
Posted by Thilo Goetz <tw...@gmx.de>.
That's a bug. The underlying implementation of the two
iterator types you mention is totally different, hence
you see this only in one of them. Any chance you could
provide a self-contained test case that exhibits this?
--Thilo