Posted to user@ctakes.apache.org by "Khanna, Ritu" <ri...@med.umich.edu> on 2016/08/12 15:55:07 UTC
ctakes 3.2.2/ytex exception when processing large number of documents
Hello All,
We are processing a large number of medical documents in chunks. We ran the CPE with ytex DBCollectionReader/AggregateUMLSPlainFastProcessor/DBConsumer.
Our first batch has about 145,000 documents. However, after running for more than 48 hours and processing 78,823 documents, the process terminated with the exception pasted below:
>>>>
15 Jul 2016 17:55:53 INFO DBCollectionReader - loading document with id = {INSTANCE_ID=720494072, INSTANCE_KEY=677503859}
Jul 15, 2016 5:55:53 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(407)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key text is null for docId=null
    at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmentAnnotator.java:59)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:897)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
Jul 15, 2016 5:55:53 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(275)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key text is null for docId=null
    at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmentAnnotator.java:59)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:897)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key text is null for docId=null
    at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmentAnnotator.java:59)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:897)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
Jul 15, 2016 5:55:53 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
SEVERE: The container AggregatePlaintextFastUMLSProcessor returned the following error message: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key text is null for docId=null (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 15, 2016 5:55:53 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit maybeLogSevereException(2502)
SEVERE: Thread: [Procesing Pipeline#1 Thread]::, message: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key text is null for docId=null
org.apache.uima.analysis_engine.AnalysisEngineProcessException: EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key text is null for docId=null
    at org.apache.ctakes.core.ae.SimpleSegmentAnnotator.process(SimpleSegmentAnnotator.java:59)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:897)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
>>>>
Any help in resolving this will be appreciated.
cTAKES version: 3.2.2
Thanks,
Ritu
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
RE: ctakes 3.2.2/ytex exception when processing large number of documents
Posted by "Khanna, Ritu" <ri...@med.umich.edu>.
Hello,
It turned out to be a data issue. In DBCollectionReader, some of the document keys returned by the ‘Query Get Documents Key’ query had no corresponding row in the table with the document text (read by ‘Query Get Document’). Fixing those rows got rid of the exception, and the job ran fine.
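For anyone hitting the same exception, a check along these lines can flag such keys before a long run starts. This is only a minimal sketch: the table names (doc_keys, doc_text) and column names (instance_id, doc_text) are hypothetical stand-ins for whatever the two DBCollectionReader queries read in your setup, and an in-memory SQLite database is used purely so the example is self-contained. It also flags rows whose text is NULL, which is an assumption beyond the missing-row case described above.

```python
import sqlite3


def find_keys_without_text(conn):
    """Return document keys that have no usable row in the text table.

    Table and column names are hypothetical; substitute the tables that the
    'Query Get Documents Key' and 'Query Get Document' queries actually use.
    """
    rows = conn.execute(
        """
        SELECT k.instance_id
        FROM doc_keys AS k
        LEFT JOIN doc_text AS t ON t.instance_id = k.instance_id
        WHERE t.instance_id IS NULL   -- key has no text row at all
           OR t.doc_text IS NULL      -- row exists but the text is NULL
        ORDER BY k.instance_id
        """
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create a tiny in-memory database with one missing and one NULL text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE doc_keys (instance_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE doc_text (instance_id INTEGER PRIMARY KEY, doc_text TEXT)"
    )
    conn.executemany("INSERT INTO doc_keys VALUES (?)", [(1,), (2,), (3,)])
    # Key 2 has no text row; key 3 has a row with NULL text.
    conn.executemany(
        "INSERT INTO doc_text VALUES (?, ?)", [(1, "note one"), (3, None)]
    )
    return conn


orphans = find_keys_without_text(build_demo_db())
print(orphans)
```

Any key this returns would reach the pipeline with null document text, so it is worth fixing or excluding those rows before starting the CPE.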
Thanks,
Ritu
Re: ctakes 3.2.2/ytex exception when processing large number of documents
Posted by buddha <bu...@yahoo.com>.
My apologies, I didn’t look deep enough. It looks like org.apache.uima.analysis_engine.AnalysisEngineProcessException: is throwing the error (new font!)
I’ll try to check the source today unless someone knows this package and can get back faster.
b
~~~~~
May All Your Sequences Converge
RE: ctakes 3.2.2/ytex exception when processing large number of documents
Posted by "Khanna, Ritu" <ri...@med.umich.edu>.
We are using ytex DBCollectionReader in the pipeline which reads the document ids and document text from database tables.
I’m not sure what f refers to here.
Re: ctakes 3.2.2/ytex exception when processing large number of documents
Posted by buddha <bu...@yahoo.com>.
Is “f” a file?
"Can't find resource f
or bundle java.util.PropertyResourceBundle, key text is null for docId=null"
~~~~~
May All Your Sequences Converge
RE: ctakes 3.2.2/ytex exception when processing large number of documents
Posted by "Khanna, Ritu" <ri...@med.umich.edu>.
Yes, we verified that. I also checked the database table with the instance_key that was in the exception and both document and document text exist.
Re: ctakes 3.2.2/ytex exception when processing large number of documents
Posted by buddha <bu...@yahoo.com>.
Are you able to verify you have a docId for every document? This seems to be the trigger:
" key text is null for docId=null "
~~~~~
May All Your Sequences Converge