Posted to dev@ctakes.apache.org by Razu Sharif <ra...@gmail.com> on 2018/02/14 09:22:30 UTC

using umls dictionary lookup offline

Dear,

Every time I run cTAKES, it calls out to the internet to check our credentials.
What is required to make it work without internet access or a credential check?

Thanks
Razu Sharif

Re: using umls dictionary lookup offline

Posted by Razu Sharif <ra...@gmail.com>.
Hello,

I am running cTAKES from the command line and installed it in user mode. Now I
really need to make sure it also works offline. I found instructions for it,
such as loading the UMLS data into MySQL and using JdbcConceptFactory instead of
UmlsJdbcConceptFactory. I went through the XML configuration file and couldn't
figure out how to use JdbcConceptFactory. I imported the cTAKES project into
Eclipse, still no luck. For ytex, what can I do to run cTAKES offline? A little
help would be appreciated.

Thanks


RE: using umls dictionary lookup offline [EXTERNAL]

Posted by Gandhi Rajan Natarajan <Ga...@arisglobal.com>.
Hi Sean, Thanks for the additional info.

Regards,
Gandhi


-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, February 14, 2018 9:42 PM
To: dev@ctakes.apache.org
Subject: RE: using umls dictionary lookup offline [EXTERNAL]

Hi Gandhi, all,

JdbcDictionary and JdbcConceptFactory are probably the way to go.  Just to add, you don't need to load the dictionary into another database (mysql) to use Jdbc* classes.  They will work out-of-box with the default dictionary.
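
As a rough illustration of the class swap discussed here, the following Python sketch rewrites a dictionary descriptor so that it names JdbcConceptFactory instead of UmlsJdbcConceptFactory. Only those two class names come from this thread; the sample XML, its element names, the org.example package, and the helper swap_class_names are hypothetical, so check them against the descriptor shipped with your cTAKES install. The licensing points raised in this thread still apply to whichever dictionary the descriptor points at.

```python
import re

# Hypothetical descriptor fragment, for illustration only. The element names
# and the package path are assumptions, not copied from a cTAKES release.
SAMPLE_DESCRIPTOR = """\
<conceptFactory>
   <name>exampleConcepts</name>
   <implementationName>org.example.lookup.UmlsJdbcConceptFactory</implementationName>
</conceptFactory>
"""

# Simple class names taken from the thread: old name -> offline replacement.
DEFAULT_SWAPS = {"UmlsJdbcConceptFactory": "JdbcConceptFactory"}


def swap_class_names(descriptor_text, swaps=DEFAULT_SWAPS):
    """Return descriptor_text with each old simple class name replaced."""
    result = descriptor_text
    for old_name, new_name in swaps.items():
        # \b restricts the match to the whole simple name, so a longer
        # identifier such as "UmlsJdbcConceptFactoryTest" is left alone.
        pattern = r"\b" + re.escape(old_name) + r"\b"
        result = re.sub(pattern, new_name, result)
    return result


if __name__ == "__main__":
    print(swap_class_names(SAMPLE_DESCRIPTOR))
```

Passing a larger mapping, for example one that also renames the dictionary class, works the same way; the word-boundary match leaves longer identifiers untouched.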

Sean

-----Original Message-----
From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
Sent: Wednesday, February 14, 2018 11:08 AM
To: dev@ctakes.apache.org
Subject: RE: using umls dictionary lookup offline [EXTERNAL]

Hi Razu,

You can load the UMLS data into a database such as MySQL and use JdbcConceptFactory instead of UmlsJdbcConceptFactory.

Regards,
Gandhi


This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender or system manager by email immediately if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited and against the law.

RE: using umls dictionary lookup offline [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

Posted by Devikiran <de...@gmail.com>.
@Tim, @Sean,

Makes sense .. well put too.

I think I was overwhelmed with that Interface for the extractor. ;)

An example would be the case where UMLS has blocked my account for not
filling in the annual survey and I still happily use their dump. Now I need to
revisit the UMLS contract.

When we use these tools for corporate usage it's better to be safe than
sorry.

Regards,
Devi


RE: using umls dictionary lookup offline [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Well put,
Thanks Tim!


Re: using umls dictionary lookup offline [EXTERNAL] [SUSPICIOUS]

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
Again, not legal advice, but this is my rule of thumb:
- If you had to enter your UMLS credentials to download the copy of the
UMLS you're using with cTAKES, then you don't need to have the online
credentials check. (As Sean said, you are responsible for following
licenses in terms of redistribution.)
- If you did _not_ enter your UMLS credentials to download the copy of
the UMLS you're using with cTAKES (e.g., from our sourceforge mirror),
then you DO need to have the online credentials check. It is very
beneficial to the cTAKES project that we are allowed to redistribute
the UMLS in a format that's convenient for users getting started, so it
is really important not to abuse this.

Tim


> -----Original Message-----
> From: Devikiran [mailto:devikiranr@gmail.com] 
> Sent: Thursday, February 15, 2018 6:43 AM
> To: dev@ctakes.apache.org
> Subject: RE: using umls dictionary lookup offline [EXTERNAL]
> 
> @Sean,
> Is it not a license obligation to validate UMLS credentials on every
> load of the pipeline?
> 
> I have offline UMLS data in Elasticsearch and a customized concept
> extractor. But I still stayed true to the interface of
> umlsdictionarylookupannotator and validated the UMLS credentials, as
> the data was dumped from UMLS.
> 
> Regards,
> Devi
> 

RE: using umls dictionary lookup offline [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Devi,

There is a lot to say on this topic, and I can't possibly cover it all.  Disclaimer: the following is not meant to be complete.  It is the rambling of a layman, not a lawyer, who hasn't slept.  I did not draft the UMLS license, nor have I thoroughly read it since ... I want to say October.  If anybody notices that I state something inaccurate please correct me.  Also, apologies for shouting TAKES.

!!!  Please visit the UMLS license start page [1] for complete information on what you should do regarding its use.  Apache has no affiliation that I know of and this is not the best forum for legal matters.

In short, as things apply to Apache cTAKES:
1) There is available on sourceforge a prebuilt database containing a subset of the UMLS that is usable by Apache cTAKES.  
2) That database is not distributed or supported by Apache.  The licenses are incompatible.
3) The Apache cTAKES website downloads page [2] provides a link to it as a courtesy.

4) Just like help on anything else 3rd party [3], information on using the dictionary [4] in the Apache cTAKES wiki, Apache cTAKES mailing lists, etc. is provided for assistance.
5) There are inherent expectations that those utilizing said help abide by all laws and restrictions of the third party.

6) The "default" Apache cTAKES dictionary lookup uses a "rare word index" schema. [5]
7) While the database on SourceForge adheres to the rare word index schema,
8) any number of other databases can be created that conform to that schema and can be used by Apache cTAKES.  [6]

9) There is also code in Apache cTAKES that can use other database schemas or bar-separated value flat files.

10) While the "default clinical pipeline" [7] is possibly the most commonly run configuration,
11) it is far from being the only way to use Apache cTAKES.

12) While the default dictionary lookup does require a check of the end user's UMLS license during initialization,
13) it is possible that the end user may want to run Apache cTAKES without the herein mentioned sourceforge database.
14) For that reason, there are configurations of the dictionary lookup that do not require a UMLS credential check.
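
To make the "rare word index" idea from item 6 a little more concrete, here is a minimal Python sketch. It is not cTAKES code and does not reproduce the cTAKES schema; it only illustrates the general technique as I understand it: each dictionary term is filed under its least common token, so a lookup only has to examine terms whose rare word actually occurs in the text. The function names and the concept ids are made up for illustration.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; real tokenizers are more careful."""
    return text.lower().split()


def build_rare_word_index(terms):
    """Map each term's rarest token to the terms that contain it.

    `terms` is a list of (concept_id, term_text) pairs.
    """
    # Count how many terms each token appears in.
    counts = Counter()
    for _, text in terms:
        counts.update(set(tokenize(text)))

    index = defaultdict(list)
    for concept_id, text in terms:
        tokens = tokenize(text)
        # Pick the token seen in the fewest terms; ties go to the first one.
        rare_pos = min(range(len(tokens)), key=lambda i: counts[tokens[i]])
        index[tokens[rare_pos]].append((concept_id, tokens, rare_pos))
    return index


def lookup(index, text):
    """Return (concept_id, start, end) token spans of terms found in `text`."""
    words = tokenize(text)
    hits = []
    for pos, word in enumerate(words):
        for concept_id, tokens, rare_pos in index.get(word, []):
            start = pos - rare_pos
            end = start + len(tokens)
            # Only accept the candidate if every token of the term lines up.
            if start >= 0 and words[start:end] == tokens:
                hits.append((concept_id, start, end))
    return hits


# Toy entries with made-up concept ids, not real UMLS content.
TERMS = [
    ("C0000001", "chest pain"),
    ("C0000002", "back pain"),
    ("C0000003", "pain"),
]
INDEX = build_rare_word_index(TERMS)
HITS = lookup(INDEX, "Patient reports chest pain today")
```

Here "chest pain" is filed under "chest" rather than "pain", because "pain" appears in every term and would be a poor key. Any dictionary built this way, from any source, follows the same pattern.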

I have run out of steam.  So,
1) If you use the subset of the UMLS that exists on SourceForge, PLEASE keep the UMLS credential check enabled.
2) If you use another database of your own making, you can do what you want.
3) I should also say that if you create your own dictionary using the UMLS, I am pretty certain that you are not allowed to distribute it without express permission from the NLM.  Please consult the UMLS license. [1]
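
As a rough illustration of the bar-separated value flat files mentioned in item 9, the Python sketch below parses a small dictionary of one's own making. The column layout (concept id first, term text last), the comment marker, and all names are assumptions made for this example; check the cTAKES wiki pages [4] [5] for the layout that cTAKES actually expects.

```python
import io


def read_bsv_terms(lines):
    """Parse bar-separated lines into (concept_id, term_text) pairs.

    Assumed layout: concept_id|term text. Blank lines and lines starting
    with '#' are skipped. This layout is an assumption for illustration.
    """
    terms = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 2 or not fields[0] or not fields[-1]:
            raise ValueError(f"malformed line: {raw!r}")
        # First column is the id, last column is the text; any middle
        # columns are ignored in this sketch.
        terms.append((fields[0], fields[-1].lower()))
    return terms


# A made-up local dictionary, deliberately not UMLS content.
SAMPLE = io.StringIO(
    "# made-up local dictionary, not UMLS content\n"
    "L0001|Widget Syndrome\n"
    "L0002|acute widget syndrome\n"
    "\n"
)
LOCAL_TERMS = read_bsv_terms(SAMPLE)
```

A file like this contains only locally authored terms, which is the situation item 2 describes.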


[1] https://www.nlm.nih.gov/databases/umls.html
[2] http://ctakes.apache.org/downloads.cgi
[3] https://cwiki.apache.org/confluence/display/CTAKES/External+Tools+and+Applications
[4] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Dictionaries+and+Models
[5] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+-+Fast+Dictionary+Lookup
[6] https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
[7] https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline

Sean



-----Original Message-----
From: Devikiran [mailto:devikiranr@gmail.com] 
Sent: Thursday, February 15, 2018 6:43 AM
To: dev@ctakes.apache.org
Subject: RE: using umls dictionary lookup offline [EXTERNAL]

@Sean,
Is it not a license obligation to validate UMLS credentials on every load of the pipeline?

I have offline UMLS data on Elasticsearch and a customized concept extractor. But I still stayed true to the interface of umlsdictionarylookupannotator and validated the UMLS credentials, as the data was dumped from UMLS.

Regards,
Devi

On 14-Feb-2018 21:42, "Finan, Sean" <Se...@childrens.harvard.edu>
wrote:

> Hi Gandhi, all,
>
> JdbcDictionary and JdbcConceptFactory are probably the way to go.  
> Just to add, you don't need to load the dictionary into another 
> database (mysql) to use Jdbc* classes.  They will work out-of-box with the default dictionary.
>
> Sean
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Wednesday, February 14, 2018 11:08 AM
> To: dev@ctakes.apache.org
> Subject: RE: using umls dictionary lookup offline [EXTERNAL]
>
> Hi Razu,
>
> You can load the UMLS data in database like MySQL and use 
> JdbcConceptFactory instead of UmlsJdbcConceptFactory.
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Razu Sharif [mailto:razu.cse10.ruet@gmail.com]
> Sent: Wednesday, February 14, 2018 2:53 PM
> To: dev@ctakes.apache.org
> Subject: using umls dictionary lookup offline
>
> Dear,
>
> Every time I run cTakes it calls out internet to check our credentials.
> Whats required to make it work without internet or credential check.
>
> Thanks
> Razu Sharif
>

RE: using umls dictionary lookup offline [EXTERNAL]

Posted by Devikiran <de...@gmail.com>.
@Sean,
Is it not a license obligation to validate UMLS credentials on every load
of the pipeline?

I have offline UMLS data on Elasticsearch and a customized concept
extractor. But I still stayed true to the interface of
umlsdictionarylookupannotator and validated the UMLS credentials, as the
data was dumped from UMLS.

Regards,
Devi

On 14-Feb-2018 21:42, "Finan, Sean" <Se...@childrens.harvard.edu>
wrote:

> Hi Gandhi, all,
>
> JdbcDictionary and JdbcConceptFactory are probably the way to go.  Just to
> add, you don't need to load the dictionary into another database (mysql) to
> use Jdbc* classes.  They will work out-of-box with the default dictionary.
>
> Sean
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Wednesday, February 14, 2018 11:08 AM
> To: dev@ctakes.apache.org
> Subject: RE: using umls dictionary lookup offline [EXTERNAL]
>
> Hi Razu,
>
> You can load the UMLS data in database like MySQL and use
> JdbcConceptFactory instead of UmlsJdbcConceptFactory.
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Razu Sharif [mailto:razu.cse10.ruet@gmail.com]
> Sent: Wednesday, February 14, 2018 2:53 PM
> To: dev@ctakes.apache.org
> Subject: using umls dictionary lookup offline
>
> Dear,
>
> Every time I run cTakes it calls out internet to check our credentials.
> Whats required to make it work without internet or credential check.
>
> Thanks
> Razu Sharif
>

RE: using umls dictionary lookup offline [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Gandhi, all,

JdbcDictionary and JdbcConceptFactory are probably the way to go.  Just to add, you don't need to load the dictionary into another database (MySQL) to use the Jdbc* classes.  They will work out of the box with the default dictionary.

Sean

-----Original Message-----
From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com] 
Sent: Wednesday, February 14, 2018 11:08 AM
To: dev@ctakes.apache.org
Subject: RE: using umls dictionary lookup offline [EXTERNAL]

Hi Razu,

You can load the UMLS data in database like MySQL and use JdbcConceptFactory instead of UmlsJdbcConceptFactory.

Regards,
Gandhi


-----Original Message-----
From: Razu Sharif [mailto:razu.cse10.ruet@gmail.com]
Sent: Wednesday, February 14, 2018 2:53 PM
To: dev@ctakes.apache.org
Subject: using umls dictionary lookup offline

Dear,

Every time I run cTakes it calls out internet to check our credentials.
Whats required to make it work without internet or credential check.

Thanks
Razu Sharif

Re: using umls dictionary lookup offline

Posted by Jessica Glover <gl...@gmail.com>.
Razu,

I'm not totally sure if this is up-to-date, but here's a wiki page that I
hope will be helpful.
https://cwiki.apache.org/confluence/display/CTAKES/UMLS+MS+SQL+Server+Installation

Note that you will have to have a UMLS license account to be able to
download UMLS data. You can sign up for it here:
https://uts.nlm.nih.gov/home.html

If you run into any problems, I've found the mailing list archives to be an
invaluable resource, as most of the time I have questions that many before
me have already asked. Searchable archives for Apache project mailing lists
can be found here: http://apache.markmail.org/

- Jessica

On Wed, Feb 14, 2018 at 11:07 AM, Gandhi Rajan Natarajan <
Gandhi.Natarajan@arisglobal.com> wrote:

> Hi Razu,
>
> You can load the UMLS data in database like MySQL and use
> JdbcConceptFactory instead of UmlsJdbcConceptFactory.
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Razu Sharif [mailto:razu.cse10.ruet@gmail.com]
> Sent: Wednesday, February 14, 2018 2:53 PM
> To: dev@ctakes.apache.org
> Subject: using umls dictionary lookup offline
>
> Dear,
>
> Every time I run cTakes it calls out internet to check our credentials.
> Whats required to make it work without internet or credential check.
>
> Thanks
> Razu Sharif
>

RE: using umls dictionary lookup offline

Posted by Gandhi Rajan Natarajan <Ga...@arisglobal.com>.
Hi Razu,

You can load the UMLS data into a database such as MySQL and use JdbcConceptFactory instead of UmlsJdbcConceptFactory.

Regards,
Gandhi
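
For readers wondering what "load the data into a database" might look like in practice, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the table name, column names, and rows are assumptions chosen for illustration rather than the confirmed cTAKES schema. It shows only the general shape: terms stored with their rare word, and candidates fetched by that word.

```python
import sqlite3

# Hypothetical rows: (cui, rindex, tcount, text, rword), where rword is the
# term's rare word, rindex its token position, and tcount the token count.
ROWS = [
    (1, 0, 2, "chest pain", "chest"),
    (2, 0, 2, "back pain", "back"),
    (3, 0, 1, "pain", "pain"),
]


def create_dictionary(conn, rows):
    """Create and fill a small rare-word table (assumed column layout)."""
    conn.execute(
        "CREATE TABLE cui_terms ("
        "cui INTEGER, rindex INTEGER, tcount INTEGER, text TEXT, rword TEXT)"
    )
    conn.executemany("INSERT INTO cui_terms VALUES (?, ?, ?, ?, ?)", rows)
    # An index on the rare word keeps per-token lookups cheap.
    conn.execute("CREATE INDEX idx_rword ON cui_terms (rword)")
    conn.commit()


def terms_for_word(conn, word):
    """Fetch candidate terms whose rare word equals `word`."""
    cursor = conn.execute(
        "SELECT cui, rindex, tcount, text FROM cui_terms "
        "WHERE rword = ? ORDER BY cui",
        (word,),
    )
    return cursor.fetchall()


# In-memory database, so nothing is written to disk.
CONN = sqlite3.connect(":memory:")
create_dictionary(CONN, ROWS)
CANDIDATES = terms_for_word(CONN, "chest")
```

Everything here runs locally, with no network access. Whether the UMLS credential check should stay enabled depends on where the dictionary data came from, as discussed elsewhere in this thread.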


-----Original Message-----
From: Razu Sharif [mailto:razu.cse10.ruet@gmail.com]
Sent: Wednesday, February 14, 2018 2:53 PM
To: dev@ctakes.apache.org
Subject: using umls dictionary lookup offline

Dear,

Every time I run cTakes it calls out internet to check our credentials.
Whats required to make it work without internet or credential check.

Thanks
Razu Sharif