Posted to dev@ctakes.apache.org by samir chabou <sa...@yahoo.com.INVALID> on 2015/04/11 06:54:02 UTC

iterate on the features of CAS consumer (FileWriterCasConsumer)

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?
Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.
Thanks

Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Posted by samir chabou <sa...@yahoo.com>.
Hi Tim, I was able to use the CasIOUtil class to iterate over the CAS features. First, I needed to create a new CAS, and I used JCasFactory for that. Below are the two lines of code. Thanks for your help.

JCas jcas = JCasFactory.createJCas(); // create a new CAS
CasIOUtil.readJCas(jcas, new File("C:\\temp\\uima\\xcas\\xCasAbstrct.xcas")); // load the existing CAS into the new one
Samir
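
For readers who only want a quick look at what a serialized CAS file contains, here is a minimal sketch that uses nothing but the JDK's XML parser. It relies on the fact that both XCAS and XMI files are XML, with feature structures written as elements and primitive feature values as attributes. The class name SerializedCasPeek, the method describeElements, and the sample data are made up for illustration and are not part of cTAKES or uimaFIT. CasIOUtil remains the better route when typed annotations are needed, since this sketch only lists the direct children of the root element and ignores nested elements.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SerializedCasPeek {

    /** Returns one line per top-level XML element: the element name followed by its attributes. */
    public static List<String> describeElements(InputStream xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked-exception handling.
            throw new IllegalArgumentException("Could not parse the serialized CAS as XML", e);
        }
        List<String> lines = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and other non-element nodes
            }
            Element element = (Element) node;
            StringBuilder line = new StringBuilder(element.getTagName());
            NamedNodeMap attributes = element.getAttributes();
            for (int j = 0; j < attributes.getLength(); j++) {
                Node attribute = attributes.item(j);
                line.append(' ').append(attribute.getNodeName())
                    .append('=').append(attribute.getNodeValue());
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample input, not real cTAKES output.
        String sample = "<CAS>"
            + "<ExampleToken begin=\"0\" end=\"5\"/>"
            + "<ExampleToken begin=\"6\" end=\"11\"/>"
            + "</CAS>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        for (String line : describeElements(in)) {
            System.out.println(line);
        }
    }
}
```

To inspect a real file, pass a stream opened on it (for example with java.nio.file.Files.newInputStream) instead of the in-memory sample.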



Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Posted by samir chabou <sa...@yahoo.com>.
Thanks, Tim, for your suggestion. I'll experiment with the CasIOUtil method and keep the user/dev lists posted.



RE: iterate on the features of CAS consumer (FileWriterCasConsumer)

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
The standard way we avoid redundant processing is to write the CAS for each file to an XMI file after one pass over the data that runs all the analysis engines.

For example, when we are running experiments, we have one pipeline that does all the NLP feature generation (POS tags, dependency parsing, dictionary lookup, etc.) and writes each document to an XMI file in a directory using uimaFIT's CasIOUtil class:
https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/CasIOUtil.html

Then, in a second machine learning pipeline, we read the XMI files (using a different CasIOUtil method) and vary any machine learning parameters we want while using the same standard annotations.

Hope this helps.
Tim
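
The two-pass idea described above can be sketched without any UIMA dependency. This is only an illustration of the pattern, not the cTAKES implementation: the class TwoPassCache, its methods, the ".cache" file suffix, and the placeholder processing step are all hypothetical, and plain text files stand in for the XMI files that a real pipeline would write and read with CasIOUtil.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

public class TwoPassCache {

    /** Pass 1: run the expensive step once per document and store the result on disk. */
    public static void writeCache(Path cacheDir, String docId, String text,
                                  Function<String, String> expensiveStep) {
        try {
            Files.createDirectories(cacheDir);
            Path cached = cacheDir.resolve(docId + ".cache");
            if (Files.exists(cached)) {
                return; // already processed in an earlier run, so skip the expensive step
            }
            Files.write(cached, expensiveStep.apply(text).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Pass 2: read the stored result instead of re-running the expensive step. */
    public static String readCache(Path cacheDir, String docId) {
        try {
            byte[] bytes = Files.readAllBytes(cacheDir.resolve(docId + ".cache"));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cas-cache");
        // String::toUpperCase is a placeholder for the slow annotation pipeline.
        writeCache(dir, "note1", "patient denies chest pain", String::toUpperCase);
        System.out.println(readCache(dir, "note1"));
    }
}
```

The point of the design is that the second pass never touches the expensive step, so experiments that only change downstream parameters can be rerun cheaply.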


RE: iterate on the features of CAS consumer (FileWriterCasConsumer)

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Samir, 
Like Tim, I'm a little bewildered and not sure that I understand your questions.  

If you want to run a collection of files, check the code in org.apache.ctakes.core.cpe.CmdLineCpeRunner.

If you want to iterate through the CAS to find its contents, I'd look at the code in org.apache.ctakes.core.cc.SentenceTokensPrinter and org.apache.ctakes.core.cc.JdbcWriterTemplate or even org.apache.ctakes.core.cc.TokenOffsetsCasConsumer. You probably don't want to use them directly, but you can use the code as examples.

If you want a dense XMI file to be saved for each note processed, check out org.apache.ctakes.core.cc.FilesInDirectoryCasConsumer.

If you want to read a directory of files then you probably want to look at org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader

Sean


RE: iterate on the features of CAS consumer (FileWriterCasConsumer)

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
Samir,
I'm not sure I understand your question. Are you saying you want to be able to look at/process annotations in a file more than once without re-processing the note?
Tim


Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Posted by samir chabou <sa...@yahoo.com.INVALID>.
Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?
Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.
Please advise. Thanks


