Posted to user@ctakes.apache.org by "Khanna, Ritu" <ri...@med.umich.edu> on 2016/08/12 15:55:07 UTC

ctakes 3.2.2/ytex exception when processing large number of documents

Hello All,

We are processing a large number of medical documents in chunks. We ran the CPE with ytex DBCollectionReader/AggregateUMLSPlainFastProcessor/DBConsumer.
Our first batch has about 145,000 documents. However, after running for more than 48 hours and processing 78,823 documents, the process terminated with the exception pasted below:

>>>>
15 Jul 2016 17:55:53  INFO DBCollectionReader - loading document with id = {INST
ANCE_ID=720494072, INSTANCE_KEY=677503859}
Jul 15, 2016 5:55:53 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEn
gine_impl callAnalysisComponentProcess(407)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAG
E LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource f
or bundle java.util.PropertyResourceBundle, key text is null for docId=null
        at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmen
tAnnotator.java:59)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCa
sAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.cal
lAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.pro
cessAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.ja
va:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.pro
cessAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(A
nalysisEngineImplBase.java:267)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext
(ProcessingUnit.java:897)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(Process
ingUnit.java:577)

Jul 15, 2016 5:55:53 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEn
gine_impl processAndOutputNewCASes(275)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAG
E LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource f
or bundle java.util.PropertyResourceBundle, key text is null for docId=null
        at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmen
tAnnotator.java:59)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCa
sAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.cal
lAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.pro
cessAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.ja
va:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.pro
cessAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(A
nalysisEngineImplBase.java:267)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext
(ProcessingUnit.java:897)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(Process
ingUnit.java:577)

org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAG
E LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource f
or bundle java.util.PropertyResourceBundle, key text is null for docId=null
        at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmen
tAnnotator.java:59)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCa
sAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.cal
lAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.pro
cessAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.ja
va:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.pro
cessAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(A
nalysisEngineImplBase.java:267)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext
(ProcessingUnit.java:897)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(Process
ingUnit.java:577)
Jul 15, 2016 5:55:53 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUni
t process
SEVERE: The container AggregatePlaintextFastUMLSProcessor returned the following
 error message: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResource
Exception: Can't find resource for bundle java.util.PropertyResourceBundle, key
text is null for docId=null (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 15, 2016 5:55:53 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUni
t maybeLogSevereException(2502)
SEVERE: Thread: [Procesing Pipeline#1 Thread]::, message: EXCEPTION MESSAGE LOCA
LIZATION FAILED: java.util.MissingResourceException: Can't find resource for bun
dle java.util.PropertyResourceBundle, key text is null for docId=null
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAG
E LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource f
or bundle java.util.PropertyResourceBundle, key text is null for docId=null
        at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmen
tAnnotator.java:59)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCa
sAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.cal
lAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.pro
cessAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterato
r.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.ja
va:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.pro
cessAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(A
nalysisEngineImplBase.java:267)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext
(ProcessingUnit.java:897)
        at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(Process
ingUnit.java:577)
>>>>

Any help in resolving this will be appreciated.
cTAKES version: 3.2.2

Thanks,
Ritu

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 

RE: ctakes 3.2.2/ytex exception when processing large number of documents

Posted by "Khanna, Ritu" <ri...@med.umich.edu>.
Hello,

It turned out to be a data issue. In DBCollectionReader, some of the document keys returned by the ‘Query Get Documents Key’ query did not have a corresponding row in the table with document text (‘Query Get Document’). Fixing those rows got rid of the exception, and the job ran fine.
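
For anyone hitting the same error, the check described above can be sketched as follows. This is only an illustration under assumptions: the class name MissingTextCheck, the method findKeysWithoutText, and the sample keys are hypothetical, and the in-memory list and map stand in for the results of the two configured queries (no database access is shown).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MissingTextCheck {

    /** Returns the keys for which no usable document text is available. */
    static List<Long> findKeysWithoutText(List<Long> keys, Map<Long, String> textByKey) {
        List<Long> missing = new ArrayList<>();
        for (Long key : keys) {
            // A key is a problem if the text lookup has no row for it,
            // or if the row exists but its text is null.
            if (textByKey.get(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the results of the two queries.
        List<Long> keys = Arrays.asList(101L, 102L, 103L);
        Map<Long, String> textByKey = new HashMap<>();
        textByKey.put(101L, "Patient presents with ...");
        textByKey.put(103L, "Follow-up visit ...");

        System.out.println("Keys without text: " + findKeysWithoutText(keys, textByKey));
    }
}
```

Running a check like this before starting a long CPE job would flag key 102 in the sample data, so the offending rows can be fixed up front rather than after many hours of processing.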

Thanks,
Ritu


Re: ctakes 3.2.2/ytex exception when processing large number of documents

Posted by buddha <bu...@yahoo.com>.
My apologies, I didn’t look deep enough. It looks like org.apache.uima.analysis_engine.AnalysisEngineProcessException is the error being thrown (new font!)

I’ll try to check the source today unless someone knows this package and can get back faster.

b

~~~~~
May All Your Sequences Converge



RE: ctakes 3.2.2/ytex exception when processing large number of documents

Posted by "Khanna, Ritu" <ri...@med.umich.edu>.
We are using ytex DBCollectionReader in the pipeline which reads the document ids and document text from database tables.
I’m not sure what f refers to here.
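
A note on the wording of the message: the console output is hard-wrapped at 80 columns, so "resource f" followed by "or bundle" on the next line reads "resource for bundle"; "f" is not a file. The text matches what java.util.ResourceBundle reports when it is asked for a key it does not contain. That suggests, as an assumption not confirmed in this thread, that the annotator's own message ("text is null for docId=null") was handed to UIMA as a resource-bundle key, which would explain the "EXCEPTION MESSAGE LOCALIZATION FAILED" prefix. The small sketch below reproduces the same wording with plain JDK classes; the class name BundleKeyDemo, the method lookup, and the entry known_key are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class BundleKeyDemo {

    /** Looks up a key in a small in-memory bundle; returns the error text if the key is absent. */
    static String lookup(String key) {
        try {
            // A stand-in bundle with a single entry; real bundles are loaded from .properties files.
            ResourceBundle bundle =
                    new PropertyResourceBundle(new StringReader("known_key=A known message"));
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            // The key is not in the bundle: hand back the JDK's own error text.
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A free-form sentence used as a key is not found in the bundle.
        System.out.println(lookup("text is null for docId=null"));
    }
}
```

The useful part of the logged message is therefore the tail, "text is null for docId=null": the document text reaching the segment annotator was null.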


Re: ctakes 3.2.2/ytex exception when processing large number of documents

Posted by buddha <bu...@yahoo.com>.
Is “f” a file?

"Can't find resource f
or bundle java.util.PropertyResourceBundle, key text is null for docId=null"

~~~~~
May All Your Sequences Converge



RE: ctakes 3.2.2/ytex exception when processing large number of documents

Posted by "Khanna, Ritu" <ri...@med.umich.edu>.
Yes, we verified that. I also checked the database table with the instance_key that was in the exception and both document and document text exist.


Re: ctakes 3.2.2/ytex exception when processing large number of documents

Posted by buddha <bu...@yahoo.com>.
Are you able to verify you have a docId for every document?  This seems to be the trigger:

" key text is null for docId=null "
~~~~~
May All Your Sequences Converge
