Posted to user@uima.apache.org by Philip Ogren <ph...@ogren.info> on 2007/06/21 17:04:23 UTC

ClassCastException thrown when using subiterator and moveTo()

I am having difficulty using the FSIterator returned by the
AnnotationIndex.subiterator(AnnotationFS) method.

The following is a code fragment:

AnnotationIndex annotationIndex = jCas.getAnnotationIndex(tokenType);
FSIterator tokenIterator = annotationIndex.subiterator(sentenceAnnotation);
tokenIterator.moveTo(tokenAnnotation);


Here is the relevant portion of the stack trace:

	java.lang.ClassCastException: edu.colorado.cslr.dessert.types.Token
	at java.util.Collections.indexedBinarySearch(Unknown Source)
	at java.util.Collections.binarySearch(Unknown Source)
	at org.apache.uima.cas.impl.Subiterator.moveTo(Subiterator.java:224)


If I change the second line to the following, then I do not have any 
problems with an exception being thrown.

FSIterator tokenIterator = annotationIndex.iterator();


Is this a bug or some misunderstanding on my part of how subiterator 
should work?

Thanks,
Philip

Unit testing with Groovy

Posted by Philip Ogren <ph...@ogren.info>.
Great!

I have found that unit testing is a great way to get started with
Groovy.  I borrowed a few lines of Java code for obtaining a JCas from
an earlier post (from Marshall, I think) and used them to create a
little util class that has a single method:

    static JCas process(String descriptorFileName, String textFileName)
    {
        // Use the file's contents if textFileName names an existing file;
        // otherwise treat the argument itself as the document text.
        File textFile = new File(textFileName)
        String text
        if(textFile.exists())
            text = textFile.text
        else
            text = textFileName

        // Build an analysis engine from the XML descriptor.
        XMLInputSource xmlInput = new XMLInputSource(new File(descriptorFileName))
        ResourceSpecifier specifier = UIMAFramework.getXMLParser().parseResourceSpecifier(xmlInput)
        AnalysisEngine analysisEngine = UIMAFramework.produceAnalysisEngine(specifier)
        JCas jCas = analysisEngine.newJCas()

        // Run the engine over the text and hand back the populated JCas.
        jCas.setDocumentText(text)
        analysisEngine.process(jCas)
        return jCas
    }
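
The "file name or literal text" fallback at the top of that method can be tried without UIMA. The following is a minimal Java sketch of the same idea; TextSourceSketch and resolve are hypothetical names, UTF-8 is an assumed encoding, and it checks for a regular file rather than mere existence, which differs slightly from the Groovy code above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

class TextSourceSketch {
    // Hypothetical helper: returns the file's contents when the argument names
    // an existing regular file, otherwise returns the argument unchanged.
    static String resolve(String pathOrText) {
        Path path;
        try {
            path = Paths.get(pathOrText);
        } catch (InvalidPathException e) {
            // Not a syntactically valid path, so treat it as literal text.
            return pathOrText;
        }
        if (!Files.isRegularFile(path)) {
            return pathOrText;
        }
        try {
            // UTF-8 is an assumption of this sketch.
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // No file has this name, so the string itself comes back.
        System.out.println(resolve("Some literal sentence to analyze."));
    }
}
```

The convenience is that a test can pass either a document path or a short inline string; the cost is that a mistyped path is silently analyzed as text.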

With this I can make a unit test very easily:

    void testTokenizer()
    {
        println "running token tests ..."
        JCas jCas = TestsUtil.process(tokenDescriptorFile, "data/test/docs/sampletext.txt")
        FSIndex tokenIndex = jCas.getAnnotationIndex(Token.type)
        assert tokenIndex.size() == 132
        ...
    }

I have also started exploring writing UIMA components with Groovy.  I
recently put together a file collection reader that uses Groovy's Ant
builder, which allows one to define includes and excludes for a FileSet
in a descriptor file the same way you would in a build.xml file.  I
haven't made it available yet because it needs a README and it is also
dog slow on large corpora.  However, it is <100 lines of code and
provides nifty functionality for smaller data sets.  My point is simply
to give an example of what is possible.  When I understand the CAS
better, it might be cool to think about creating a Groovy builder that
would be a more dynamic solution than JCas, but that is just a dream at
the moment given my time and (lack of) expertise with these two
technologies.




Re: ClassCastException thrown when using subiterator and moveTo()

Posted by Philip Ogren <ph...@ogren.info>.
By the way, the workaround I had posted is *much* slower than the fixed
subiterator method, both for creating the iterator and for iterating
through it: more than an order of magnitude slower (roughly 15x).
Thanks for the fix!
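
The thread does not say why the filtered iterator is slower. One plausible contributor, stated here purely as an assumption, is that a filter has to test every annotation in the index, while an iterator over a sorted index can seek to the start of the window and stop at its end. The sketch below is not UIMA code and not a benchmark; WindowScanSketch and its methods are hypothetical, and it only counts how many spans each approach examines.

```java
class WindowScanSketch {
    // How many spans matched and how many had to be examined to find them.
    static final class Result {
        final int matches;
        final int examined;
        Result(int matches, int examined) {
            this.matches = matches;
            this.examined = examined;
        }
    }

    // Approach 1: test every span against the window, as a filter would.
    static Result filterAll(int[] begins, int[] ends, int winBegin, int winEnd) {
        int matches = 0;
        for (int i = 0; i < begins.length; i++) {
            if (begins[i] >= winBegin && ends[i] <= winEnd) {
                matches++;
            }
        }
        return new Result(matches, begins.length);
    }

    // Approach 2: begins[] is sorted ascending, so binary-search the first
    // span with begin >= winBegin and stop once begin passes winEnd.
    // "examined" counts only the scanned spans, not the binary-search probes.
    static Result seekThenScan(int[] begins, int[] ends, int winBegin, int winEnd) {
        int lo = 0;
        int hi = begins.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (begins[mid] < winBegin) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        int matches = 0;
        int examined = 0;
        for (int i = lo; i < begins.length && begins[i] <= winEnd; i++) {
            examined++;
            if (ends[i] <= winEnd) {
                matches++;
            }
        }
        return new Result(matches, examined);
    }

    public static void main(String[] args) {
        // 1000 made-up tokens of length 4, one every 5 characters.
        int n = 1000;
        int[] begins = new int[n];
        int[] ends = new int[n];
        for (int i = 0; i < n; i++) {
            begins[i] = i * 5;
            ends[i] = i * 5 + 4;
        }
        Result a = filterAll(begins, ends, 100, 149);
        Result b = seekThenScan(begins, ends, 100, 149);
        System.out.println("filter: " + a.matches + " matches, " + a.examined + " examined");
        System.out.println("seek:   " + b.matches + " matches, " + b.examined + " examined");
    }
}
```

Both approaches find the same spans; the difference is only in how much of the index they touch, which grows with document size for the filter but not for the seek.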



Re: ClassCastException thrown when using subiterator and moveTo()

Posted by Thilo Goetz <tw...@gmx.de>.
Fixed for 2.2, thanks for reporting and providing the test case.
This Groovy stuff looks pretty cool, I'll have to check it out
some more...

--Thilo


Re: ClassCastException thrown when using subiterator and moveTo()

Posted by Thilo Goetz <tw...@gmx.de>.
The easiest place to create a test case is on Jira.  Open a Jira
bug for UIMA, and you can attach the test case.  When you do that,
please check the box that says something like "OK to include in
Apache code" (so we can check it in and use it as a regression test).

Groovy, hm.  Never used it before.  If it doesn't take me more than
5 min to set up in Eclipse, and I can still debug, not a problem ;-)

--Thilo


Re: ClassCastException thrown when using subiterator and moveTo()

Posted by Philip Ogren <ph...@ogren.info>.
Yes.  I will throw one together.  Would you mind if I used Groovy for
this, or is that going to be annoying?  Let me know.  Also, from an
earlier email I saw on this list, it seems that attachments are a
problem.  Is there some place where I could upload a test case directly?

In the meantime, here is a workaround I just put together (I'm still
unit testing the code that uses it, so I'm not certain it is bug free):

    public static FSIterator getWindowIterator(JCas jCas, Annotation windowAnnotation, Type type)
    {
        ConstraintFactory constraintFactory = jCas.getConstraintFactory();

        // Constraint 1: annotation.begin >= window.begin
        FeaturePath beginFeaturePath = jCas.createFeaturePath();
        beginFeaturePath.addFeature(type.getFeatureByBaseName("begin"));
        FSIntConstraint intConstraint = constraintFactory.createIntConstraint();
        intConstraint.geq(windowAnnotation.getBegin());
        FSMatchConstraint beginConstraint = constraintFactory.embedConstraint(beginFeaturePath, intConstraint);

        // Constraint 2: annotation.end <= window.end
        FeaturePath endFeaturePath = jCas.createFeaturePath();
        endFeaturePath.addFeature(type.getFeatureByBaseName("end"));
        intConstraint = constraintFactory.createIntConstraint();
        intConstraint.leq(windowAnnotation.getEnd());
        FSMatchConstraint endConstraint = constraintFactory.embedConstraint(endFeaturePath, intConstraint);

        // Both constraints must hold; filter the full index for this type.
        FSMatchConstraint windowConstraint = constraintFactory.and(beginConstraint, endConstraint);
        FSIndex windowIndex = jCas.getAnnotationIndex(type);
        FSIterator windowIterator = jCas.createFilteredIterator(windowIndex.iterator(), windowConstraint);

        return windowIterator;
    }
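
Stripped of the UIMA constraint objects, the workaround keeps every annotation whose begin is at or after the window's begin and whose end is at or before the window's end. The following is a minimal stand-alone sketch of that predicate; WindowFilterSketch, Span and within are hypothetical names, and the offsets are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WindowFilterSketch {
    // Hypothetical minimal annotation: just a pair of character offsets.
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Keep the spans that lie entirely inside the window:
    // span.begin >= window.begin AND span.end <= window.end.
    static List<Span> within(List<Span> spans, Span window) {
        List<Span> result = new ArrayList<>();
        for (Span s : spans) {
            if (s.begin >= window.begin && s.end <= window.end) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Span> tokens = Arrays.asList(
            new Span(0, 3), new Span(4, 9), new Span(10, 12), new Span(11, 20));
        Span sentence = new Span(4, 12);
        // Prints 4-9 and 10-12; 0-3 starts too early and 11-20 ends too late.
        for (Span s : within(tokens, sentence)) {
            System.out.println(s.begin + "-" + s.end);
        }
    }
}
```

Note that a span overlapping the window edge, such as 11-20 here, is excluded, because both conditions must hold.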




Re: ClassCastException thrown when using subiterator and moveTo()

Posted by Thilo Goetz <tw...@gmx.de>.
That's a bug.  The underlying implementation of the two
iterator types you mention is totally different, hence
you see this only in one of them.  Any chance you could
provide a self-contained test case that exhibits this?

--Thilo
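
The reported stack trace ends in java.util.Collections.binarySearch. The following is a minimal sketch, independent of UIMA, of one way that method can raise a ClassCastException: calling the overload that takes no Comparator on a list whose elements do not implement Comparable. Whether this is exactly what happened inside Subiterator.moveTo is an assumption, since the thread does not show that source; BinarySearchSketch and Span are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class BinarySearchSketch {
    // Hypothetical stand-in for an annotation type.
    // It deliberately does NOT implement Comparable.
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Ordering assumed for this sketch: by begin offset, then by end offset.
    static final Comparator<Span> BY_OFFSETS =
        Comparator.comparingInt((Span s) -> s.begin).thenComparingInt(s -> s.end);

    // The comparator-less overload only compiles here through a raw type,
    // and fails at run time when it casts a list element to Comparable.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean naturalOrderSearchThrows(List<Span> spans, Span key) {
        try {
            Collections.binarySearch((List) spans, key);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Passing an explicit comparator avoids the cast to Comparable.
    static int searchWithComparator(List<Span> spans, Span key) {
        return Collections.binarySearch(spans, key, BY_OFFSETS);
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(0, 5));
        spans.add(new Span(6, 9));
        spans.add(new Span(10, 14));
        System.out.println(naturalOrderSearchThrows(spans, new Span(6, 9)));
        System.out.println(searchWithComparator(spans, new Span(6, 9)));
    }
}
```

With the three spans above, the first call reports that the exception was thrown and the second finds the key at index 1.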
