Posted to dev@ctakes.apache.org by John Travis Green <jo...@gmail.com> on 2016/05/14 14:12:51 UTC

Offline access

I have a DoD use case that requires offline UMLS verification. Has anyone accomplished this yet? I recall some chatter a while back, but my initial searches on Google were unsuccessful. Thanks! John




RE: Offline access

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi John,

There is a dictionary database builder in the sandbox: http://svn.apache.org/repos/asf/ctakes/sandbox/dictionary-gui/
Basically, import it into your IDE and launch the main class.  You will see the GUI.  Point it to your UMLS root directory and your cTAKES installation root.  Select your vocabularies.  Select your TUIs.  Enter a name for the dictionary and click "Go".

I would love to have some formal online documentation for this, but haven't had the time.  Any help with that would be appreciated.

Sean

-----Original Message-----
From: John Travis Green [mailto:john.travis.green@gmail.com] 
Sent: Monday, May 16, 2016 11:36 AM
Cc: dev@ctakes.apache.org
Subject: Re: Offline access

It seems odd that they would require repeated checks when you can download the whole thing for input into a DB with mmsys.  How is the documentation on building the fast lookup from a local copy of the UMLS? A quick glance at the website didn't reveal much.  YTEX reportedly works this way, but I've been getting an error with the section annotator that others raised back in early 2015, and no resolution was posted to the listserv.  Thanks all for your help on this. I have an active IRB here in the Army using cTAKES, but DoD security requirements are so strict that the entire server is offline.  Best, John






Re: Offline access

Posted by John Travis Green <jo...@gmail.com>.
It seems odd that they would require repeated checks when you can download the whole thing for input into a DB with mmsys.  How is the documentation on building the fast lookup from a local copy of the UMLS? A quick glance at the website didn't reveal much.  YTEX reportedly works this way, but I've been getting an error with the section annotator that others raised back in early 2015, and no resolution was posted to the listserv.  Thanks all for your help on this. I have an active IRB here in the Army using cTAKES, but DoD security requirements are so strict that the entire server is offline.  Best, John







Re: Offline access

Posted by Pei Chen <pe...@wiredinformatics.com>.
Having a hosted website that serves UMLS resources and checks for a license before they are downloaded locally (via the NLM/UMLS license validation web service) is indeed the norm and the more common approach.  (YTEX originally did this, and UMLS itself does it when one downloads its resources.)  Responsibility for UMLS license adherence afterwards essentially falls on whoever downloaded the resources, since the check is done at that time.  The online, upon-initialization license check and bundled solution for cTAKES was really done for the convenience of end users and for historical reasons.  It just boils down to who wants to build, manage, maintain, and host a site that distributes these specially formatted resources and ensures the license check before downloading.

—Pei

> On May 16, 2016, at 9:12 AM, Finan, Sean <Se...@childrens.harvard.edu> wrote:
> 
> The agreement that the ctakes core group was able to achieve with the NLM (distributor of UMLS) was that ctakes would check a user's access rights upon every use of any database derived from the UMLS.  The reason for this was that the NLM did not want one valid UMLS user to download the database and then distribute it for use by unaccredited third parties.  We have stuck to that agreement.  Upon every initial load of either of the distributed ctakes dictionary modules the user's entered password is checked online with the NLM user registry.
> 
> I think that a great compromise would be if somebody could create a "remote checkout" tool, something that checks-out a virtual license for use while you are on the road.  Maybe coordinate with the NLM on getting such a thing approved.  As ctakes is open source software you could start toying with such a client first.  To start, delegate to JdbcRareWordDictionary as does the UmlsJdbcRareWordDictionary, and delegate to JdbcConceptFactory as does the UmlsJdbcConceptFactory (for the -fast module).  Then point to the new trial "remote checkout" classes in your .xml setup file (the default being cTakesHsql.xml).  However, do NOT use these classes directly or in any production scenario as that would not abide by our agreement with the NLM.  Do not even check them into sandbox without us getting a new agreement to use such a system with the NLM.  I must emphasize that publicly doing so could cause us to lose our privileges to distribute a default dictionary.  You would still be able to download your own UMLS database and create your own dictionary for use with ctakes, but not every user can do that.  And when creating your client code favor composition over inheritance as the remote checkout client should not have IS-A, not that I can enforce anything that you do.
> 
> I repeat, NEVER use the JdbcRareWordDictionary and/or JdbcConceptFactory directly unless you are pointing to a database that was not created using the UMLS as a source.
> 
> Sean
> 


RE: Offline access

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
The agreement that the cTAKES core group was able to reach with the NLM (distributor of the UMLS) was that cTAKES would check a user's access rights upon every use of any database derived from the UMLS.  The reason for this was that the NLM did not want one valid UMLS user to download the database and then distribute it for use by unaccredited third parties.  We have stuck to that agreement.  Upon every initial load of either of the distributed cTAKES dictionary modules, the user's entered password is checked online with the NLM user registry.

I think that a great compromise would be if somebody could create a "remote checkout" tool, something that checks out a virtual license for use while you are on the road.  Maybe coordinate with the NLM on getting such a thing approved.  As cTAKES is open source software, you could start toying with such a client first.  To start, delegate to JdbcRareWordDictionary, as UmlsJdbcRareWordDictionary does, and delegate to JdbcConceptFactory, as UmlsJdbcConceptFactory does (for the -fast module).  Then point to the new trial "remote checkout" classes in your .xml setup file (the default being cTakesHsql.xml).  However, do NOT use these classes directly or in any production scenario, as that would not abide by our agreement with the NLM.  Do not even check them into sandbox without us getting a new agreement with the NLM to use such a system.  I must emphasize that publicly doing so could cause us to lose our privileges to distribute a default dictionary.  You would still be able to download your own UMLS database and create your own dictionary for use with cTAKES, but not every user can do that.  And when creating your client code, favor composition over inheritance, as the remote checkout client should not have an IS-A relationship; not that I can enforce anything that you do.

I repeat, NEVER use the JdbcRareWordDictionary and/or JdbcConceptFactory directly unless you are pointing to a database that was not created using the UMLS as a source.
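
For anyone experimenting along these lines, here is a minimal sketch in Java of the composition approach described above. TermDictionary, LicenseCheck and CheckedOutDictionary are hypothetical stand-ins invented for illustration; they are not the real cTAKES classes or the NLM validation service. The same caveat applies: this is for private experimentation only, not for production use.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for a dictionary lookup; NOT the real cTAKES interface. */
interface TermDictionary {
    List<String> lookup(String term);
}

/** Hypothetical license check, e.g. a time-limited "checkout" approved by the NLM. */
interface LicenseCheck {
    boolean isValid();
}

/**
 * Wraps a delegate dictionary (HAS-A) instead of extending it (IS-A),
 * so callers only ever reach the delegate through the license check.
 */
final class CheckedOutDictionary implements TermDictionary {
    private final TermDictionary delegate;
    private final LicenseCheck license;

    CheckedOutDictionary(TermDictionary delegate, LicenseCheck license) {
        this.delegate = Objects.requireNonNull(delegate, "delegate");
        this.license = Objects.requireNonNull(license, "license");
        // Check once at load time, mirroring the initial-load check described above.
        if (!license.isValid()) {
            throw new IllegalStateException("No valid UMLS license checkout");
        }
    }

    @Override
    public List<String> lookup(String term) {
        // Re-check on every use so that an expired checkout stops working.
        if (!license.isValid()) {
            throw new IllegalStateException("UMLS license checkout has expired");
        }
        return delegate.lookup(term);
    }
}
```

In a real experiment the delegate would presumably be the JDBC-backed dictionary, and the LicenseCheck would be whatever checkout mechanism the NLM agrees to; both are left abstract here because neither is specified in this thread. Because the wrapper holds the delegate rather than extending it, the unchecked lookup is never part of the wrapper's own type.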

Sean

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu] 
Sent: Monday, May 16, 2016 8:01 AM
To: dev@ctakes.apache.org
Subject: RE: Offline access

I haven't tried it and is a guess based on reading the code, but you might be able to change the dictionary implementation name in the xml file from UmlsJdbcRareWordDictionary to ConceptFactory, since Umls factory implements from ConceptFactory.


RE: Offline access

Posted by "Geise, Brandon D." <bd...@geisinger.edu>.
I haven't tried it, and this is a guess based on reading the code, but you might be able to change the dictionary implementation name in the XML file from UmlsJdbcRareWordDictionary to ConceptFactory, since the Umls factory implements ConceptFactory.

-----Original Message-----
From: Miller, Timothy [mailto:Timothy.Miller@childrens.harvard.edu] 
Sent: Saturday, May 14, 2016 2:28 PM
To: dev@ctakes.apache.org
Subject: Re: Offline access

Well, before we had online verification ctakes required downloading umls, extracting the right subset, and building a database for the dictionary tool. You can still do that - it is often necessary for use cases that our default dictionary doesn't have coverage, and while I'm not sure the state of documentation there have been several threads on the list about it. I think if you do it this way you can skip the UMLS verification step (though I don't remember exactly how that works) because you will have been verified at download time.

Can Sean or someone verify that this is true? If he builds his own dictionary (with the same subsets) can he skip the online verification?
Thanks
Tim





IMPORTANT WARNING: The information in this message (and the documents attached to it, if any) is confidential and may be legally privileged. It is intended solely for the addressee. Access to this message by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken, or omitted to be taken, in reliance on it is prohibited and may be unlawful. If you have received this message in error, please delete all electronic copies of this message (and the documents attached to it, if any), destroy any hard copies you may have created and notify me immediately by replying to this email. Thank you.

Geisinger Health System utilizes an encryption process to safeguard Protected Health Information and other confidential data contained in external e-mail messages. If email is encrypted, the recipient will receive an e-mail instructing them to sign on to the Geisinger Health System Secure E-mail Message Center to retrieve the encrypted e-mail.

Re: Offline access

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
Well, before we had online verification, cTAKES required downloading UMLS, extracting the right subset, and building a database for the dictionary tool. You can still do that; it is often necessary for use cases where our default dictionary doesn't have coverage, and while I'm not sure of the state of the documentation, there have been several threads on the list about it. I think if you do it this way you can skip the UMLS verification step (though I don't remember exactly how that works) because you will have been verified at download time.

Can Sean or someone verify that this is true? If he builds his own dictionary (with the same subsets) can he skip the online verification?
Thanks
Tim
