Posted to user@uima.apache.org by Chengmin Ding <ch...@gmail.com> on 2009/06/24 14:20:49 UTC

Re: question about InterOP between Apache UIMA and Omnifind Annotators (CAS2JDBC)

Thanks Thilo!  I didn't mean to cross-post to the other list but I didn't
see my question posted in my gmail account so just tried again. Sorry about
it.

A couple of years ago, when we used the IBM UIMA framework, we could run
CAS2JDBC outside of Omnifind by including the Omnifind base annotators in the
aggregate analysis engine (following the Omnifind handbook and suggestions
from Sebastian, cf.
http://www.ibm.com/developerworks/forums/thread.jspa?threadID=157872&tstart=0&messageID=13941628
).

I guess my question would be better phrased this way: we tried to use the
IBM UIMA Adapter to wrap the Omnifind base annotator
(of_tokenization.xml). Is this supposed to work? In our pipeline, we
used the Adapter twice: once for the Omnifind base annotator (at the
beginning) and once for the CAS2JDBC consumer (at the end).

I appreciate any suggestions/comments on this.

-Chengmin

On Wed, Jun 24, 2009 at 3:05 AM, Thilo Goetz <tw...@gmx.de> wrote:

> Hi Chengmin,
>
> please don't cross post.  Answers below.
>
> Chengmin Ding wrote:
> > Hello,
> >
> > We have used the UIMA Adapter for IBM annotators and it worked for some of
> > our test annotators.  However, when we tried it on cas2jdbc, we got the
> > error shown below.
> >
> > We have a CPE pipeline and CAS2JDBC is the only consumer/engine based on
> > the IBM UIMA framework. We are using Apache UIMA 2.2 for the entire pipeline.
> > We thought this was caused by a missing Omnifind-specific annotator which
> > fills out the DocumentAnnotation or the Omnifind-specific
> > com.ibm.es.tt.DocumentMetaData feature structure (which contains documentid
> > and other features). We then added the base annotator from Omnifind
> > (OF_Tokenization.xml etc.) and also wrapped it with the adapter. But we
> > still got the same error. Our questions are:
> >
> > 1) Is the error indeed caused by missing some Omnifind specific annotator
> > that fills out the DocumentAnnotation feature structure?
>
> Not quite sure from the error message, but very likely yes.  I suppose
> that cas2jdbc was never intended to be run outside the OF UIMA pipeline.
> OF has an internal document model that is shared between its annotators,
> and I assume that cas2jdbc relies on that model.  Seems reasonable, given
> that you will later need to identify documents in the DB based on some ID
> or other.
>
> > 2) Is there any way to further isolate the problem with any tools, considering
> > we do not have the source code for cas2jdbc?
>
> I can't think of any.  A better place to ask would be the IBM OF
> support forum.
>
> > 3) Can the IBM UIMA Adapter be used the same way to wrap regular annotators,
> > aggregate analysis engines and consumers?
>
> Yes for primitive and aggregate AEs.  Consumers I actually don't know,
> they used to have a special status in IBM UIMA.  It doesn't look like
> that's your problem, though.
>
> > 4) Does Apache UIMA have any plan to come up with a CAS2JDBC-compatible db
> > consumer?
>
> If there is one, I don't know of it.
>
> --Thilo
>
> >
> > Thanks a lot!
> > ================================================
> > org.apache.uima.analysis_engine.AnalysisEngineProcessException
> >     at com.ibm.uima.adapter.ibm.IBMAnalysisEngineWrapper.processAndOutputNewCASes(Unknown Source)
> >     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:218)
> >     at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:892)
> >     at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
> > Caused by: com.ibm.uima.analysis_engine.AnalysisEngineProcessException: The common analysis structure cannot be processed. See the previous exception for details.
> >     at com.ibm.uima.reference_impl.analysis_engine.compatibility.CasConsumerAdapter.process(CasConsumerAdapter.java:93)
> >     at com.ibm.uima.reference_impl.analysis_engine.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:392)
> >     at com.ibm.uima.reference_impl.analysis_engine.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:297)
> >     at com.ibm.uima.reference_impl.analysis_engine.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:218)
> >     ... 4 more
> > Caused by: com.ibm.uima.resource.ResourceProcessException: The common analysis structure cannot be processed. See the previous exception for details.
> >     at com.ibm.uima.consumer.cas2jdbc.utils.Cas2JdbcLogger.log_PROCESS_CAS__SEVERE(Unknown Source)
> >     at com.ibm.uima.consumer.cas2jdbc.Cas2Jdbc.processCas(Unknown Source)
> >     at com.ibm.uima.reference_impl.analysis_engine.compatibility.CasConsumerAdapter.process(CasConsumerAdapter.java:89)
> >     ... 7 more
> > Caused by: com.ibm.uima.resource.ResourceProcessException: The document's ID cannot be parsed. See the previous exception for details.
> >     at com.ibm.uima.consumer.cas2jdbc.utils.Cas2JdbcLogger.log_BAD_DOCID__SEVERE(Unknown Source)
> >     at com.ibm.uima.consumer.cas2jdbc.Cas2Jdbc.parseDocID(Unknown Source)
> >     ... 9 more
> > Caused by: java.lang.NullPointerException
> >     ... 10 more
> >
> > -Chengmin
> >
>
>
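
On question 2 (isolating the problem without the cas2jdbc source): the trace
above is a chain of wrapped exceptions, and the innermost "Caused by" entry
(a NullPointerException under Cas2Jdbc.parseDocID) is the most specific hint
it contains. If the exception object is available in your own code, for
example in whatever callback or catch block reports processing failures,
the chain can be walked programmatically. The following is only a minimal
sketch in plain Java. The class name CauseChain and its methods are
hypothetical helpers, not part of UIMA or cas2jdbc, and the exceptions built
in main are stand-ins that only mimic the nesting seen in the trace.

```java
import java.util.ArrayList;
import java.util.List;

public class CauseChain {

    /** Returns the throwable followed by each nested cause, outermost first. */
    public static List<Throwable> chain(Throwable t) {
        List<Throwable> out = new ArrayList<>();
        // Stop at the end of the chain, or if a cause repeats (guards against cycles).
        while (t != null && !out.contains(t)) {
            out.add(t);
            t = t.getCause();
        }
        return out;
    }

    /** Returns the innermost cause, the throwable itself if it has none, or null for null input. */
    public static Throwable root(Throwable t) {
        List<Throwable> c = chain(t);
        return c.isEmpty() ? null : c.get(c.size() - 1);
    }

    public static void main(String[] args) {
        // Stand-in exceptions mimicking the nesting in the trace; not the real UIMA types.
        Throwable nested = new RuntimeException(
            "The common analysis structure cannot be processed.",
            new IllegalStateException(
                "The document's ID cannot be parsed.",
                new NullPointerException()));

        // Print one line per level, outermost first.
        for (Throwable t : chain(nested)) {
            System.out.println(t.getClass().getName() + ": " + t.getMessage());
        }
        System.out.println("root = " + root(nested).getClass().getName());
    }
}
```

Logging the root cause together with the identifier of the document being
processed would at least show whether every document fails in parseDocID or
only some of them. It does not show which feature structure cas2jdbc expects.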

Re: question about InterOP between Apache UIMA and Omnifind Annotators (CAS2JDBC)

Posted by Thilo Goetz <tw...@gmx.de>.

Chengmin Ding wrote:
> Thanks Thilo!  I didn't mean to cross-post to the other list but I didn't
> see my question posted in my gmail account so just tried again. Sorry about
> it.
> 
> A couple of years ago, when we used the IBM UIMA framework, we could run
> CAS2JDBC outside of Omnifind by including the Omnifind base annotators in the
> aggregate analysis engine (following the Omnifind handbook and suggestions
> from Sebastian, cf.
> http://www.ibm.com/developerworks/forums/thread.jspa?threadID=157872&tstart=0&messageID=13941628
> ).
> 
> I guess my question would be better phrased this way: we tried to use the
> IBM UIMA Adapter to wrap the Omnifind base annotator
> (of_tokenization.xml). Is this supposed to work? In our pipeline, we
> used the Adapter twice: once for the Omnifind base annotator (at the
> beginning) and once for the CAS2JDBC consumer (at the end).
> 
> I appreciate any suggestions/comments on this.

Sorry, I told you everything I could dredge up from the depths
of my memory.  Please try the OF forum on developerworks (not
the UIMA forum): http://www.ibm.com/developerworks/forums/forum.jspa?forumID=757
You may have more luck there.

--Thilo
