Posted to dev@ctakes.apache.org by Harpreet Khanduja <hs...@rit.edu> on 2014/07/17 23:06:42 UTC

Lucene for UMLS2014

Hello,
    I would be grateful if someone could help.

    I created a Lucene index for UMLS 2014, but only for the SNOMED vocabulary.
    I did this because I thought it would reduce the dictionary lookup time,
    but it is still almost the same. Is there any other way to improve the
    dictionary lookup time?

Thank you,
Harpreet

Re: Lucene for UMLS2014

Posted by Harpreet Khanduja <hs...@rit.edu>.
Hello,
 Thanks for your help.

It works, but it does not give me the code value associated with a CUI for
the SNOMED vocabulary.
How can I get the code value for SNOMED or any other vocabulary?

"codingScheme" : "CTakes",  "cui" : "C0085580",  "tui" : "T047",  "code" : ""

Thank you,

Harpreet
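
For readers with the same question, one possible workaround outside of cTAKES is to look the codes up directly in the UMLS MRCONSO.RRF file, which is pipe-delimited and carries the CUI in column 0, the source vocabulary abbreviation (SAB) in column 11, and the source code (CODE) in column 13. The sketch below is only an illustration under those assumptions: it is not part of cTAKES, the class and method names are hypothetical, the sample rows are made up, and the SAB value (for example SNOMEDCT_US) should be checked against your own UMLS release.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: builds a CUI -> source-code index from MRCONSO.RRF-style rows.
// Not cTAKES code; the class and method names are hypothetical.
class CuiCodeIndex {
    // Zero-based column positions in a pipe-delimited MRCONSO.RRF row.
    private static final int CUI = 0;
    private static final int SAB = 11;
    private static final int CODE = 13;

    private final Map<String, Set<String>> codesByCui = new HashMap<>();

    /** Adds one row if it belongs to the wanted source vocabulary. */
    void addRow(String row, String wantedSab) {
        // The -1 limit keeps trailing empty fields so column indexes stay stable.
        String[] fields = row.split("\\|", -1);
        if (fields.length <= CODE || !fields[SAB].equals(wantedSab)) {
            return;
        }
        codesByCui.computeIfAbsent(fields[CUI], k -> new TreeSet<>()).add(fields[CODE]);
    }

    /** Returns the codes recorded for a CUI, or an empty set if none. */
    Set<String> codesFor(String cui) {
        return codesByCui.getOrDefault(cui, Collections.emptySet());
    }

    public static void main(String[] args) {
        // Made-up rows in MRCONSO.RRF layout, for illustration only.
        String[] rows = {
            "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||1234||SNOMEDCT_US|PT|1234|example term|0|N||",
            "C0000001|ENG|S|L0000002|PF|S0000002|Y|A0000002||||MSH|MH|D000001|example term|0|N||"
        };
        CuiCodeIndex index = new CuiCodeIndex();
        for (String row : rows) {
            index.addRow(row, "SNOMEDCT_US");
        }
        System.out.println(index.codesFor("C0000001"));
    }
}
```

With a real UMLS installation, the rows could be streamed with java.nio.file.Files.lines on MRCONSO.RRF instead of the hard-coded array, and the resulting map consulted for each CUI that the pipeline reports with an empty code.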




On Tue, Jul 22, 2014 at 4:19 PM, Harpreet Khanduja <hs...@g.rit.edu>
wrote:

> I will try to do the same.
>
> Thank you,
>
> Harpreet

Re: Lucene for UMLS2014

Posted by Harpreet Khanduja <hs...@rit.edu>.
I will try to do the same.

Thank you,

Harpreet


On Tue, Jul 22, 2014 at 4:11 PM, Masanz, James J. <Ma...@mayo.edu>
wrote:

>
> I'm not an svn guru, but you can use Team->Update to get the latest of all
> the things you have not customized, plus SVN will tell you of the
> conflicts, and you can merge your customizations into the latest. I've done
> it when I haven't had many customizations to preserve.
>
> To get the new dictionary lookup (sub)project, you might have to do
> something to get it imported, such as going into the SVN repository
> exploring view and use Check out as Maven Project menu option on that
> (sub)project.

RE: Lucene for UMLS2014

Posted by "Masanz, James J." <Ma...@mayo.edu>.
I'm not an SVN guru, but you can use Team->Update to get the latest version of everything you have not customized. SVN will also tell you about any conflicts, so you can merge your customizations into the latest code. I've done this when I haven't had many customizations to preserve.

To get the new dictionary lookup (sub)project, you might have to import it explicitly, for example by going into the SVN repository exploring view and using the "Check out as Maven Project" menu option on that (sub)project.

-----Original Message-----
From: Harpreet Khanduja [mailto:hsk5004@rit.edu] 
Sent: Tuesday, July 22, 2014 2:32 PM
To: dev@ctakes.apache.org
Subject: Re: Lucene for UMLS2014

Hello,

I checked out 3.1.1 from trunk SVN.

Thank you




Re: Lucene for UMLS2014

Posted by Harpreet Khanduja <hs...@rit.edu>.
Hello,

I checked out 3.1.1 from trunk SVN.

Thank you



On Tue, Jul 22, 2014 at 2:29 PM, Masanz, James J. <Ma...@mayo.edu>
wrote:

> Did you download the source and import it into Eclipse, or did you check out
> 3.1.1 from SVN?
> If you checked it out from SVN, did you check it out from trunk or from
> the tag for 3.1.1?
>
> -- James

RE: Lucene for UMLS2014

Posted by "Masanz, James J." <Ma...@mayo.edu>.
Did you download the source and import it into Eclipse, or did you check out 3.1.1 from SVN?
If you checked it out from SVN, did you check it out from trunk or from the tag for 3.1.1?

-- James

-----Original Message-----
From: Harpreet Khanduja [mailto:hsk5004@rit.edu] 
Sent: Tuesday, July 22, 2014 12:49 PM
To: dev@ctakes.apache.org
Subject: Re: Lucene for UMLS2014

Hello,
I am using cTAKES 3.1.1 in Eclipse and I have added my customizations to
the project, but now I want to update it to 3.2 so that I can use
ctakes-dictionary-lookup-fast.
Is there any way to update the whole cTAKES project to 3.2 without my
customizations getting removed?

  It would be a great help.

Thank you,

Harpreet






Re: Lucene for UMLS2014

Posted by Harpreet Khanduja <hs...@rit.edu>.
Hello,
I am using cTAKES 3.1.1 in Eclipse and I have added my customizations to
the project, but now I want to update it to 3.2 so that I can use
ctakes-dictionary-lookup-fast.
Is there any way to update the whole cTAKES project to 3.2 without my
customizations getting removed?

  It would be a great help.

Thank you,

Harpreet





On Tue, Jul 22, 2014 at 10:53 AM, Harpreet Khanduja <hs...@g.rit.edu>
wrote:

> Thank you so much for your help.
>
> Harpreet.

Re: Lucene for UMLS2014

Posted by Harpreet Khanduja <hs...@rit.edu>.
Thank you so much for your help.

Harpreet.



On Mon, Jul 21, 2014 at 6:28 PM, Finan, Sean <
Sean.Finan@childrens.harvard.edu> wrote:

> Hi Harpreet,
>
> If you are willing to use cTAKES 3.2, try the dictionary-lookup-fast
> module as a replacement for the default dictionary-lookup.  That module has
> a new dictionary resource (hsql, not lucene) and slightly different methods
> for lookup and matching.  In time trials it has been faster than the
> default module (hence the name).  Accuracy depends upon the parameter
> settings, but in the tests performed so far the results are comparable or
> better.  The new dictionary is much leaner than the current default
> dictionary, small enough to port from the hsql cached version to a hsql
> in-memory version.  Using the in-memory version makes dictionary lookup
> practically instantaneous (hundredths of a second).  Limited documentation
> is available in the module's doc/ directory.
>
> I will be on vacation for a week, but please don't hesitate to write if
> you have any questions.
>
> Sean

RE: Lucene for UMLS2014

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Harpreet,

If you are willing to use cTAKES 3.2, try the dictionary-lookup-fast module as a replacement for the default dictionary-lookup.  That module has a new dictionary resource (HSQLDB, not Lucene) and slightly different methods for lookup and matching.  In time trials it has been faster than the default module (hence the name).  Accuracy depends upon the parameter settings, but in the tests performed so far the results are comparable or better.  The new dictionary is much leaner than the current default dictionary, small enough to port from the HSQLDB cached version to an HSQLDB in-memory version.  Using the in-memory version makes dictionary lookup practically instantaneous (hundredths of a second).  Limited documentation is available in the module's doc/ directory.
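
To make the in-memory idea concrete, here is a rough sketch of why a dictionary held entirely in memory can be searched so quickly. It is not the dictionary-lookup-fast code and does not use HSQLDB. It simply indexes each term by its first token in a hash map, so that only a handful of candidate terms are compared at each position in the text. The class name, method names, and concept identifiers are all hypothetical, and tokenization is reduced to whitespace splitting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a tiny in-memory term dictionary indexed by first token.
// Not the cTAKES implementation; all names here are hypothetical.
class InMemoryDictionary {
    private static final class Entry {
        final String[] tokens;
        final String conceptId;

        Entry(String[] tokens, String conceptId) {
            this.tokens = tokens;
            this.conceptId = conceptId;
        }
    }

    // First token of a term -> every entry whose term starts with that token.
    private final Map<String, List<Entry>> entriesByFirstToken = new HashMap<>();

    void addTerm(String term, String conceptId) {
        String[] tokens = term.toLowerCase().trim().split("\\s+");
        entriesByFirstToken
                .computeIfAbsent(tokens[0], k -> new ArrayList<>())
                .add(new Entry(tokens, conceptId));
    }

    /** Returns the concept ids of all dictionary terms found in the text. */
    List<String> lookup(String text) {
        // Whitespace tokenization only; punctuation is not handled in this sketch.
        String[] words = text.toLowerCase().trim().split("\\s+");
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            List<Entry> candidates = entriesByFirstToken.get(words[i]);
            if (candidates == null) {
                continue; // no term starts with this word
            }
            for (Entry entry : candidates) {
                int end = i + entry.tokens.length;
                if (end <= words.length
                        && Arrays.equals(entry.tokens, Arrays.copyOfRange(words, i, end))) {
                    hits.add(entry.conceptId);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        InMemoryDictionary dictionary = new InMemoryDictionary();
        dictionary.addTerm("heart attack", "X1"); // placeholder concept ids
        dictionary.addTerm("heart", "X2");
        System.out.println(dictionary.lookup("Patient had a heart attack yesterday"));
    }
}
```

Each word in the text costs one hash lookup plus a comparison against the few terms that share its first token, which is the kind of access pattern that becomes cheap once the whole dictionary fits in memory.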

I will be on vacation for a week, but please don't hesitate to write if you have any questions.

Sean
________________________________________
From: Harpreet Khanduja [hsk5004@rit.edu]
Sent: Thursday, July 17, 2014 5:07 PM
To: dev@ctakes.apache.org
Subject: Lucene for UMLS2014

Hello,
    I would be grateful if someone could help.

    I created a Lucene index for UMLS 2014, but only for the SNOMED vocabulary.
    I did this because I thought it would reduce the dictionary lookup time,
    but it is still almost the same. Is there any other way to improve the
    dictionary lookup time?

Thank you,
Harpreet