Posted to dev@ctakes.apache.org by Alaa al Barari <al...@gmail.com> on 2017/02/13 08:39:30 UTC

passing information through pipeline.process

Hi,

I have around 16 different schemes. As far as I can tell, cTAKES retrieves
the information for all schemes and then filters out the unwanted ones
afterwards. This is a performance concern for me, so I am trying to modify
cTAKES to suit my needs.

Can I override process(final JCas jcas)? If yes, where and how? It is
from the UIMA library, right? If not, how can I pass information such as a
final ConceptCode from pipeline.process to createConcepts? My createConcepts
now looks like this:
createConcepts( final Collection<Long> cuiCodes, final ConceptCode
conceptCode)

-- 
Eng Alaa Al-Barari
phone 0599297470

RE: passing information through pipeline.process

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Alaa,
Have you actually tested and noticed a serious time problem?  Any given cas will probably not have enough concepts stored to make this an issue.
Sean


Re: passing information through pipeline.process

Posted by shahid ashraf <sh...@trialx.com>.
Hi,

What I can suggest is that your web app initializes both pipelines and,
based on the request (e.g. icd9), calls the process method of the matching
pipeline, and likewise for icd10. (It can be memory intensive...)


P.S.: this is sort of a hack.
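
A minimal sketch of this two-pipeline idea, using plain Java only. SchemePipeline and PipelineRouter are hypothetical names invented for illustration, not cTAKES or UIMA types; in a real application each SchemePipeline would wrap a pipeline built from a dictionary configuration that lists only one scheme.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one fully initialized pipeline; not a cTAKES or UIMA type. */
interface SchemePipeline {
    String process(String documentText);
}

/** Holds one pre-built pipeline per coding scheme and dispatches each request to it. */
final class PipelineRouter {
    private final Map<String, SchemePipeline> pipelinesByScheme;

    PipelineRouter(Map<String, SchemePipeline> pipelinesByScheme) {
        // Defensive copy so later changes to the caller's map do not affect routing.
        this.pipelinesByScheme = new HashMap<>(pipelinesByScheme);
    }

    /** Runs the pipeline registered for the requested scheme, e.g. "icd9" or "icd10". */
    String process(String scheme, String documentText) {
        SchemePipeline pipeline = pipelinesByScheme.get(scheme);
        if (pipeline == null) {
            throw new IllegalArgumentException("No pipeline configured for scheme: " + scheme);
        }
        return pipeline.process(documentText);
    }
}
```

Every pipeline stays resident for the lifetime of the router, which is where the memory cost mentioned above comes from. If the real pipelines are not safe for concurrent use, the web app would also need to synchronize or pool them.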


-- 
with Regards
Shahid Ashraf

Re: passing information through pipeline.process

Posted by Alaa al Barari <al...@gmail.com>.
Not really. I want this to be dynamic: sometimes I need icd9 and other
times I need icd10. Is there a way to specify exactly which scheme it
should work with?

For example, I have a web service that should sometimes return icd9 and
other times icd10, based on the request. Modifying the XML won't do the trick.


-- 
Eng Alaa Al-Barari
phone 0599297470

RE: passing information through pipeline.process

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Go into resources/org/apache/ctakes/dictionary/lookup/fast/ and edit your XML configuration file.  Comment out the 2 lines that have icd9 listed as desired schemes.  That should do it.

The configuration file basically tells cTAKES what is in your dictionary database.  If you comment out those lines, cTAKES will think that there is no icd9 table in your database.  It won't bother to look up icd9 codes or store them in the CAS.  So icd9 codes will not be in the array: no filtering is necessary and the iteration is shorter.

Sean


Re: passing information through pipeline.process

Posted by Alaa al Barari <al...@gmail.com>.
Thanks for your answer!

> Do you want ctakes to never find codes from certain schemes during
> processing?

Yes, exactly, this is what I am shooting for: if I don't need them at the
moment, I don't want them found. Suppose I have icd10 and icd9; I want it
not to find icd9 when I am looking for icd10, and vice versa.


-- 
Eng Alaa Al-Barari
phone 0599297470

RE: passing information through pipeline.process

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Alaa,

I have a question for you.  Do you want ctakes to never find codes from certain schemes during processing?  Or do you want it to find them, but only return them when you request them explicitly after processing?

> can I override process(final JCas jcas)
That depends.  If you are subclassing an AE (analysis engine) that declares process final, you cannot override it.  If it is declared final then there may be a reason.  In that case use composition.  If you need help, web search "composition over inheritance".  You may want to use composition anyway ...
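
A minimal sketch of composition for readers who have not used it, in plain Java. LookupAnnotator and SchemeAwareAnnotator are hypothetical classes made up for this illustration and are not cTAKES or UIMA types; how a wrapper like this would be registered with the framework is outside the sketch.

```java
/** Hypothetical annotator whose process method is final, so a subclass cannot override it. */
class LookupAnnotator {
    public final String process(String text) {
        return "concepts(" + text + ")";
    }
}

/** Composition: keep the annotator as a field and add behavior around the delegated call. */
final class SchemeAwareAnnotator {
    private final LookupAnnotator delegate;
    private final String scheme;

    SchemeAwareAnnotator(LookupAnnotator delegate, String scheme) {
        this.delegate = delegate;
        this.scheme = scheme;
    }

    /** Not an override: this class owns its own method and forwards to the delegate. */
    public String process(String text) {
        String result = delegate.process(text);
        return scheme + ":" + result;
    }
}
```

Because the wrapper does not extend LookupAnnotator, the final modifier on process is no obstacle, and extra state such as the requested scheme lives in the wrapper.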

> its from UIMA library right
Very close.  process( JCas ) is from uimaFIT while process( CAS ) is from UIMA.  Web search uimaFIT or go to uima.apache.org for more information.

> how can I pass an information like final ConceptCode ...
> createConcepts( final Collection<Long> cuiCodes, final ConceptCode conceptCode)
Now you are getting into the fast-dictionary-lookup code that is in ctakes.  You probably don't want to override this.  Look at my question above.  What I am getting at is, if you only want ctakes to maintain a small list of schemes then you should just change the dictionary configuration file instead of delving into coding.  The configuration file is xml and lists the schemes in your database, one per line.  Just comment out the schemes that you don't want.

As for a faster way to get codes of a different scheme, you are pretty much out of luck.  This is because the UIMA CAS stores everything in an array.  To get items of interest cTAKES has to go through the array and filter out whatever you don't want.  We try to do that as efficiently as possible, but we are tied to this array storage mechanism.
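
The "go through the array and filter" step can be pictured with a small plain-Java sketch. CodedConcept and SchemeFilter are hypothetical names for illustration only; the actual cTAKES types and accessors differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical (scheme, code) pair standing in for a concept stored in the CAS. */
final class CodedConcept {
    final String scheme;
    final String code;

    CodedConcept(String scheme, String code) {
        this.scheme = scheme;
        this.code = code;
    }
}

final class SchemeFilter {
    private SchemeFilter() {
    }

    /** One linear pass: keeps the codes of the requested scheme, in their original order. */
    static List<String> codesForScheme(List<CodedConcept> concepts, String scheme) {
        List<String> codes = new ArrayList<>();
        for (CodedConcept concept : concepts) {
            if (concept.scheme.equalsIgnoreCase(scheme)) {
                codes.add(concept.code);
            }
        }
        return codes;
    }
}
```

The cost grows with the number of stored concepts, so whether it matters in practice depends on how many concepts a single document produces.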

Sean


