Posted to user@ctakes.apache.org by digital paula <cy...@hotmail.com> on 2014/01/06 03:38:55 UTC

RE: cTakes: question on updating cue words

Happy New Year cTAKES Community! Hopefully everyone's staying warm.
 
Okay, I did try Matt's suggestion from the developer list:
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e
 
Unfortunately it didn't work, so I stepped through the code to see how cue words are being used in cTAKES. I verified that the roughly 30 txt files for cue words in the medfacts snapshot jar are all being read. Before I tried Matt's recent suggestion, I had decompiled the medfacts snapshot jar, added the new cue terms, and added the jar back to cTAKES. I thought the change wasn't being recognized, but it is: I had updated the speculation text file with 'predictive of' as a new cue term, and while stepping through I saw that 'predictive of' was recognized as a cue term from the speculation file.
 
The problem is that it gets annotated as 'present', not as 'possible'. That's why I thought the updated cue term wasn't being recognized. I did a quick test using one of the terms already in the speculation text file (I used 'improbable'), and sure enough that term, from the same file that contains the new cue term 'predictive of', got annotated as 'possible'. That has me wondering about the cue model and how it gets generated.
 
So to echo what Tim stated, what does the cue model do?  What exactly is that file and how can the contents be viewed and regenerated or updated?   Something clearly has to be updated so 'predictive of' gets annotated as 'possible'.  
 
I'm so close to getting this resolved so I would appreciate any assistance.
 
Thanks.
 
Regards,
Paula
 
From: Timothy.Miller@childrens.harvard.edu
To: user@ctakes.apache.org
Subject: Re: cTakes: question on updating cue words
Date: Tue, 24 Dec 2013 14:19:46 +0000






Actually, I think Matt's suggestion is a bit out of date -- during development we removed the dependency on the lucene dictionary lookup, and the under-development version now reads those psv files directly.




But this still doesn't help Paula since she's trying to run the current release. I thought Matt or Pei might have some info about whether it's possible to modify negation cue words for that release? For example, I can see in the code it uses a "cue model" which can be found in ctakes-assertion-res, but it is a binary and I'm not sure what kind. Is there any way to modify that file?



Tim





On 12/24/2013 12:09 AM, digital paula wrote:



Thank you, Pei.  I believe I had signed up for the dev list right after Matt posted so I didn't see his email.  I will try it out. 

 

Merry Christmas to you and everyone on the list.  :-)

 

Regards,

Paula

 



Date: Mon, 23 Dec 2013 23:42:30 -0500

Subject: Re: cTakes: question on updating cue words

From: chenpei@apache.org

To: user@ctakes.apache.org



Paula,
Were you able to try Matt's suggestion on dev@?
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e










On Mon, Dec 23, 2013 at 11:57 AM, digital paula 
<cy...@hotmail.com> wrote:



Hello again cTAKES Community,

 

I think Tim's away for the holidays since I didn't see any  response.   Could someone else assist?  To reiterate, I'd like to manually update the cue words for the polarity and uncertainty features.     Please see below for details.

 

Thanks.

 

Regards,

Paula



 



From: cybersation@hotmail.com
To: user@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Thu, 19 Dec 2013 16:20:25 -0500





Hi Tim,  I just realized that my manual cue word updates didn't take.   :-(

 

I updated these two files from the med-facts.i2b21.2-SNAPSHOT.jar, then rebundled and added back to cTAKES:

1.  negation_cue_list.txt

2.  certainty.txt

 

Is there another file that you know of that needs to be updated? The cue folder in the jar contains only text files, so perhaps there's another text file I need to update. Would you know which files those would be, or what other updates need to be made?




 Thanks.

 

Regards,

Paula



Date: Mon, 16 Dec 2013 16:12:29 -0500

From: timothy.miller@childrens.harvard.edu
To: user@ctakes.apache.org
Subject: Re: cTakes: question on updating cue words



Paula, I think to use the released version of ctakes you will have to do what you proposed - modify the jar. The checked in files (*.psv) that you are finding are for the under-development version.

Tim





On 12/16/2013 03:27 PM, digital paula wrote:



Hi Pei,

 

I don't consider this a bug, so I'm not sure why a Jira ticket is needed. I just need to add 2 cue words and am wondering if I can do it manually.


 

Exploring a bit, I see there are .psv files in the Cue_Words folder of the Assertion component. I updated them, but that doesn't seem to work. I also added the cue words to the .txt file in the Semantic_Classes folder (in Assertion as well), and that didn't work either.

 

One thing that I haven't tried is updating the cue words in the org.mitre.medfacts.i2b2.cuefiles package of the jar file medfacts-i2b2-1.2.SNAPSHOT.jar. That would be a little more work, since I need to extract the jar, rebuild it, and add it back to the project.
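
For readers attempting the same extract-and-rebuild step, here is a minimal sketch in Python. It relies only on the fact that a jar is a zip archive. The entry path and file names in the usage comment are assumptions for illustration, not paths confirmed in this thread, and the helper `append_cue_term` is hypothetical.

```python
import zipfile


def append_cue_term(jar_path, entry_name, new_term, out_path):
    """Copy jar_path to out_path, appending new_term as a line to entry_name.

    All other entries are copied unchanged. The original jar is not modified.
    """
    found = False
    with zipfile.ZipFile(jar_path, "r") as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entry_name:
                found = True
                text = data.decode("utf-8")
                # Only add the term if it is not already listed.
                if new_term not in text.splitlines():
                    if text and not text.endswith("\n"):
                        text += "\n"
                    text += new_term + "\n"
                data = text.encode("utf-8")
            dst.writestr(info, data)
    if not found:
        raise KeyError(f"{entry_name} not found in {jar_path}")


# Hypothetical usage; adjust the names to match the actual jar contents:
# append_cue_term("medfacts.jar",
#                 "org/mitre/medfacts/i2b2/cuefiles/speculation.txt",
#                 "predictive of",
#                 "medfacts-patched.jar")
```

Writing to a new file keeps the original jar available for comparison. Note that this only changes the cue list; it does not touch any trained model shipped elsewhere.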


 

I'm kind of on a huge deadline and hoping I can make these changes today so hoping this doesn't require a lot of time to just add a couple cue words.

 

By the way, what is a .psv file?

 

Thanks.

 

Regards,

Paula

 



From: Pei.Chen@childrens.harvard.edu
To: dev@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Mon, 16 Dec 2013 19:31:45 +0000



[moved to dev@]

Hi Paula,

My suggestion would be to open a Jira item so that it could be tracked:

https://issues.apache.org/jira/browse/CTAKES (Feel free to create a new account).

Even cooler if you could attach the affected files with the patch(diffs) and any tests.

--Pei

 

 




From: digital paula [mailto:cybersation@hotmail.com]
Sent: Monday, December 16, 2013 1:30 PM
To: user@ctakes.apache.org
Subject: cTakes: question on updating cue words



 

Hello again cTAKES Community,

 

I would like to  add additional cue words to polarity (for negation) and uncertainty.    I would so appreciate if someone can let me know how I can add additional cue words.   


 

Thanks.

 

Regards,

Paula 


RE: cTakes: question on updating cue words

Posted by digital paula <cy...@hotmail.com>.
Hello again, I've been busy figuring out the issue with the sectionizer and had forgotten about this... I still haven't received a response.  :-(  I really hope Matt or anyone familiar with cue word updates can help.
 
Developers/users should be able to update cue words, because which category a cue word falls in can be subjective. For example, 'predictive of' could be negation or uncertainty, depending on whether the task considers the current or the future state. There are several cue terms in the cue term files whose assignment I'd like to change.
 
Who determines what gets classified as negation or speculation and so forth?  Is it a committee?  I really hope someone can explain more on the cue.model and what I stated in previous email.  
 
Thanks. 
 
Regards,
Paula
 
From: cybersation@hotmail.com
To: user@ctakes.apache.org; dev@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Tue, 7 Jan 2014 13:36:33 -0500







Hi Matt,
 
I realized that I should have posted this on the developer site.
 
First of all, thanks for your followup a few weeks back.  I hadn't been subscribed to the developer list prior to your post so I didn't see it until Pei mentioned it.
 
Not sure if you saw my response on user list but I wasn't able to get your suggestion to work so I defaulted back to what I was trying to do with updating the medfacts snapshot jar.  After stepping through the code I see that cTAKES was identifying the new cue term from the updated medfacts snapshot jar (I had decompiled the jar, added the new term 'predictive of', then added jar back to cTAKES).  Okay, so cTAKES did identify the new cue term but it's not getting allocated to the 'possible' assertion type.   Since last night I've looked at several of the java files in the medfacts jar and quickly realized that UIMA is not just the foundation of cTAKES, MITRE is too!   
 
I would love to understand how a cue word (e.g. 'predictive of') gets associated with an assertion type (e.g. 'possible'). I can't seem to figure that out by looking at the code in the medfacts jar. I'd like to understand how I can update it so my new cue word gets recognized as a 'possible' assertion type. The existing words in the speculation file are getting the correct 'possible' assertion type; just the new cue term I added defaults to 'present'. This leads me to wonder if it's the cue.model that's doing the assignment and has to be updated to recognize the new cue word. I'm hoping you can elucidate further on what the cue.model file is and how it works. It appears to be a binary of some type. What tools would be needed to update it? Hmmm, the .model extension leads me to believe that the cue.model file is the result of some hefty machine learning.
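
To make that hypothesis concrete, here is a toy sketch in Python. It is not the medfacts or cTAKES implementation, and every name and number in it is invented for illustration. It only shows how, if a trained model keys its weights on features seen during training, a term newly added to a cue list can be matched in the text and still fall through to the default label.

```python
import re

# Hypothetical weights "learned" at training time, keyed by feature string.
# A real model file would hold many more features; these are made up.
TRAINED_WEIGHTS = {
    "cue_word=improbable": {"possible": 2.0},
    "cue_word=possibly": {"possible": 1.8},
    "cue_word=no": {"absent": 2.5},
}
# Assumed prior that favors 'present' when no informative feature fires.
BIAS = {"present": 1.0, "possible": 0.0, "absent": 0.0}


def extract_features(sentence, cue_terms):
    """Return one feature per cue term that occurs as whole words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    padded = " " + " ".join(words) + " "
    return [f"cue_word={t}" for t in sorted(cue_terms) if f" {t} " in padded]


def classify(sentence, cue_terms):
    """Score each label with the trained weights; unseen features add nothing."""
    scores = dict(BIAS)
    for feature in extract_features(sentence, cue_terms):
        for label, weight in TRAINED_WEIGHTS.get(feature, {}).items():
            scores[label] += weight
    return max(scores, key=scores.get)


cues = {"improbable", "possibly", "no", "predictive of"}
print(extract_features("Findings are predictive of pneumonia.", cues))
print(classify("Findings are predictive of pneumonia.", cues))
print(classify("Pneumonia is improbable.", cues))
```

In this toy setup the new term is found by the lookup but has no trained weight, so the label stays at the default. If the real cue.model behaves similarly, the model would need to be retrained on data containing the new term; whether that is the case is exactly the open question in this thread.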
 
Thanks so much.
 
Regards,
Paula

RE: cTakes: question on updating cue words

Posted by digital paula <cy...@hotmail.com>.


Hi Matt,
 
I realized that I should have posted this on the developer site.
 
First of all, thanks for your followup a few weeks back.  I hadn't been subscribed to the developer list prior to your post so I didn't see it until Pei mentioned it.
 
Not sure if you saw my response on user list but I wasn't able to get your suggestion to work so I defaulted back to what I was trying to do with updating the medfacts snapshot jar.  After stepping through the code I see that cTAKES was identifying the new cue term from the updated medfacts snapshot jar (I had decompiled the jar, added the new term 'predictive of', then added jar back to cTAKES).  Okay, so cTAKES did identify the new cue term but it's not getting allocated to the 'possible' assertion type.   Since last night I've looked at several of the java files in the medfacts jar and quickly realized that UIMA is not just the foundation of cTAKES, MITRE is too!   
 
I would love to understand how does the cue word type(i.e. 'predictive of') get associated to the assertion type(i.e. 'possible')?     I can't seem to figure that out by looking at the code in the medfacts jar.    I'd like to understand how I can update it so my new cue word gets recognized as a 'possible' assertion type.    The existing words in the speculation are getting the correct 'possible' associate type.  Just the new cue term I added defaults to 'present'.    This leads me to wonder if it's the cue.model that's doing the assignment and has to be updated to recognize new cue word.   I'm hoping you can elucidate further on what the cue.model file is and how it works.  It appears to be a binary of some type.   What tools would be needed to update it?   Hmmm,  the fact that it's a .model extension leads me to believe that it's the result of some hefty machine learning that's contained in that cue.model file.  
 
Thanks so much.
 
Regards,
PaulaFrom: cybersation@hotmail.com
To: user@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Sun, 5 Jan 2014 21:38:55 -0500




Happy New Year cTAKES Community! Hopefully everyone's staying warm.
 
Okay, I did try Matt's suggestion from developer site
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e
 
 but unfortunately it didn't work so I just stepped through the code to see what's going on with how cue words are being used in cTAKES.   I verified that the medfacts snapshot jar that contains  around 30 txt files for cue words are all being called.   So before I tried Matt's recent suggestion I did decompile the medfacts snapshot jar and add the new cue terms to the jar then added it back to cTAKES.... thought it wasn't being recognized but it is.  I had updated the speculation text file with 'predictive of' as a new cue term and while stepping through I see that 'predictive of' was recognized as a cue term from the speculation file.  
 
The problem is that it gets annotated as 'present' not as 'possible'.   That's why I thought the updated cue term wasn't being recognized.  I did a quick test using one of the already stated terms(I used 'improbable') from the speculation text file and sure enough the same file that contains the new cue term of 'predictive of' got annotated as 'possible' which has me wondering about the cue model and how it gets generated.  
 
So to echo what Tim stated, what does the cue model do?  What exactly is that file and how can the contents be viewed and regenerated or updated?   Something clearly has to be updated so 'predictive of' gets annotated as 'possible'.  
 
I'm so close to getting this resolved so I would appreciate any assistance.
 
Thanks.
 
Regards,
Paula
 
From: Timothy.Miller@childrens.harvard.edu
To: user@ctakes.apache.org
Subject: Re: cTakes: question on updating cue words
Date: Tue, 24 Dec 2013 14:19:46 +0000






Actually, I think Matt's suggestion is a bit out of date -- during development we removed the dependency on the lucene dictionary lookup and now the under development version does read those psv files directly.




But this still doesn't help Paula since she's trying to run the current release. I thought Matt or Pei might have some info about whether its possible to modify negation cue words for that release? For example, I can see in the code it uses a "cue model" which
 can be found in ctakes-assertion-res but it is a binary and I'm not sure what kind. Is there any way to modify that file?



Tim





On 12/24/2013 12:09 AM, digital paula wrote:



Thank you, Pei.  I believe I had signed up for the dev list right after Matt posted so I didn't see his email.  I will try it out. 

 

Merry Christmas to you and everyone on the list.  :-)

 

Regards,

Paula

 



Date: Mon, 23 Dec 2013 23:42:30 -0500

Subject: Re: cTakes: question on updating cue words

From: chenpei@apache.org

To: user@ctakes.apache.org



Paula,
Were you able to try Matt's suggestion on dev@?
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e










On Mon, Dec 23, 2013 at 11:57 AM, digital paula 
<cy...@hotmail.com> wrote:



Hello again cTAKES Community,

 

I think Tim's away for the holidays since I didn't see any  response.   Could someone else assist?  To reiterate, I'd like to manually update the cue words for the polarity and uncertainty features.     Please see below for details.

 

Thanks.

 

Regards,

Paula



 



From: 
cybersation@hotmail.com

To: 
user@ctakes.apache.org


Subject: RE: cTakes: question on updating cue words


Date: Thu, 19 Dec 2013 16:20:25 -0500





Hi Tim,  I just realized that my manual cue word updates didn't take.   :-(

 

I updated these two files from the med-facts.i2b21.2-SNAPSHOT.jar, then rebundled and added back to cTAKES:

1.  negation_cue_list.txt

2.  certainty.txt

 

Is there another file that you know of that needs to be updated?    In the cue folder under the jar contains only text files perhaps there's another text file I need to update, would you know what the files would be or what other updates need to be made? 




 Thanks.

 

Regards,

Paula



Date: Mon, 16 Dec 2013 16:12:29 -0500

From: 
timothy.miller@childrens.harvard.edu

To: 
user@ctakes.apache.org

Subject: Re: cTakes: question on updating cue words



Paula, I think to use the released version of ctakes you will have to do what you proposed - modify the jar. The checked in files (*.psv) that you are finding are for the under-development version.

Tim





On 12/16/2013 03:27 PM, digital paula wrote:



Hi Pei,

 

I don't consider this a bug, so I'm not sure why a Jira ticket is needed. I just need to add 2 cue words and am wondering if I can do it manually.


 

Exploring a bit, I see there are .psv files in the Cue_Words folder of the Assertion component. I updated them, but it doesn't seem to work. I also added the cue words to the .txt file in the Semantic_Classes folder (in Assertion as well), and that didn't work either.

 

One thing that I haven't tried is updating the cue words in the org.mitre.medfacts.i2b2.cuefiles package of the jar file medfacts-i2b2-1.2.SNAPSHOT.jar. That would be a little more work since I'd need to extract the jar, rebuild it, and add it back to the project.
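
[Editor's sketch] The extract-and-rebuild step described above can be scripted, since a jar is an ordinary zip archive. The following is a minimal sketch, not the cTAKES or medfacts tooling: the function name add_cue_term and the entry path in the usage note are hypothetical, and it assumes the cue files are UTF-8 text with one term per line.

```python
import zipfile


def add_cue_term(src_jar, dst_jar, entry_name, new_term):
    """Copy src_jar to dst_jar, appending new_term as a line of entry_name.

    Returns True if entry_name was found in the archive, False otherwise.
    Assumption: the entry is a UTF-8 text file with one cue term per line.
    """
    found = False
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entry_name:
                found = True
                text = data.decode("utf-8")
                # Only append when the term is not already listed.
                if new_term not in text.splitlines():
                    if text and not text.endswith("\n"):
                        text += "\n"
                    text += new_term + "\n"
                data = text.encode("utf-8")
            # Every other entry is copied through unchanged.
            dst.writestr(info, data)
    return found
```

A call might look like add_cue_term("medfacts.jar", "medfacts-patched.jar", "org/mitre/medfacts/i2b2/cuefiles/speculation.txt", "predictive of"), where both jar names and the file name inside the package are placeholders. Writing to a new file rather than modifying the jar in place keeps the original available for comparison.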


 

I'm on a tight deadline and hoping to make these changes today, so I hope adding a couple of cue words doesn't require a lot of time.

 

By the way, what is a .psv file?

 

Thanks.

 

Regards,

Paula

 



From: 
Pei.Chen@childrens.harvard.edu

To: 
dev@ctakes.apache.org

Subject: RE: cTakes: question on updating cue words

Date: Mon, 16 Dec 2013 19:31:45 +0000



[moved to dev@]

Hi Paula,

My suggestion would be to open a Jira item so that it could be tracked:

https://issues.apache.org/jira/browse/CTAKES (Feel free to create a new account).

Even cooler if you could attach the affected files with the patch (diffs) and any tests.

--Pei

 

 




From:
 digital paula [mailto:cybersation@hotmail.com]


Sent: Monday, December 16, 2013 1:30 PM

To: 
user@ctakes.apache.org

Subject: cTakes: question on updating cue words



 

Hello again cTAKES Community,

 

I would like to add additional cue words to polarity (for negation) and uncertainty. I would really appreciate it if someone could let me know how to do that.


 

Thanks.

 

Regards,

Paula

RE: cTakes: question on updating cue words

Posted by digital paula <cy...@hotmail.com>.


Hi Matt,
 
I realized that I should have posted this on the developer site.
 
First of all, thanks for your followup a few weeks back.  I hadn't been subscribed to the developer list prior to your post so I didn't see it until Pei mentioned it.
 
Not sure if you saw my response on the user list, but I wasn't able to get your suggestion to work, so I went back to what I was trying before: updating the medfacts snapshot jar. After stepping through the code, I see that cTAKES does identify the new cue term from the updated jar (I had unpacked the jar, added the new term 'predictive of', then added the jar back to cTAKES). However, the term is not being assigned the 'possible' assertion type. Since last night I've looked at several of the Java files in the medfacts jar and quickly realized that UIMA is not the only foundation of cTAKES; MITRE's code is too!
 
I would love to understand how a cue word (e.g. 'predictive of') gets associated with an assertion type (e.g. 'possible'). I can't figure that out by looking at the code in the medfacts jar. I'd like to understand what I need to update so that my new cue word is recognized as a 'possible' assertion type. The existing words in the speculation file get the correct 'possible' assertion type; only the new cue term I added defaults to 'present'. This leads me to wonder whether the cue.model file is doing the assignment and has to be updated to recognize the new cue word. I'm hoping you can explain what the cue.model file is and how it works. It appears to be a binary of some type. What tools would be needed to update it? The .model extension leads me to believe that the file holds the result of some hefty machine learning.
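
[Editor's sketch] One low-effort way to narrow down what kind of binary a file such as cue.model is would be to look at its leading bytes. The sketch below only checks three well-known signatures (gzip, zip, and a Java object serialization stream) plus a plain-text fallback. It makes no claim about what cue.model actually contains, and the function name sniff_format is hypothetical.

```python
# Well-known file signatures ("magic numbers") checked in order.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"PK\x03\x04", "zip"),                    # also covers jar files
    (b"\xac\xed\x00\x05", "java-serialized"),  # Java ObjectOutputStream header
]


def sniff_format(path, probe_size=512):
    """Guess a file's container format from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(probe_size)
    for magic, label in MAGIC:
        if head.startswith(magic):
            return label
    # Fallback: printable ASCII plus tab/newline/carriage return looks like text.
    if head and all(32 <= b < 127 or b in (9, 10, 13) for b in head):
        return "text"
    return "unknown"
```

An "unknown" result only means the file matches none of these signatures; a model written by a machine learning toolkit may well use its own format. If the result is "gzip" or "zip", decompressing a copy and sniffing again is a reasonable next step.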
 
Thanks so much.
 
Regards,
Paula

From: cybersation@hotmail.com
To: user@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Sun, 5 Jan 2014 21:38:55 -0500




Happy New Year cTAKES Community! Hopefully everyone's staying warm.
 
Okay, I did try Matt's suggestion from the developer list:
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e
 
Unfortunately it didn't work, so I stepped through the code to see how cue words are used in cTAKES. I verified that the roughly 30 txt cue word files in the medfacts snapshot jar are all being read. Before I tried Matt's recent suggestion, I had unpacked the medfacts snapshot jar, added the new cue terms, and added the jar back to cTAKES. I thought the new terms weren't being recognized, but they are: I had updated the speculation text file with 'predictive of' as a new cue term, and while stepping through I saw that 'predictive of' was recognized as a cue term from the speculation file.
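
[Editor's sketch] Checking that a rebuilt jar really contains the new term does not require a debugger. Assuming the cue files are UTF-8 .txt entries with one term per line (an assumption, as is the function name find_term_in_jar), a short script can list every entry that contains the term:

```python
import zipfile


def find_term_in_jar(jar_path, term, suffix=".txt"):
    """Return the sorted names of text entries in the jar that list term.

    Matching is per line, ignoring case and surrounding whitespace.
    """
    wanted = term.strip().lower()
    hits = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(suffix):
                continue
            text = jar.read(name).decode("utf-8", errors="replace")
            if any(line.strip().lower() == wanted for line in text.splitlines()):
                hits.append(name)
    return sorted(hits)
```

This only confirms that the term is present in the archive on disk. It says nothing about which jar is on the classpath at run time, or about how the term is mapped to an assertion type.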
 
The problem is that it gets annotated as 'present', not as 'possible'. That's why I thought the updated cue term wasn't being recognized. I did a quick test using one of the existing terms from the speculation text file (I used 'improbable'), and sure enough it got annotated as 'possible', even though it comes from the same file that contains the new cue term 'predictive of'. That has me wondering about the cue model and how it gets generated.
 
So to echo what Tim asked, what does the cue model do? What exactly is that file, and how can its contents be viewed and regenerated or updated? Something clearly has to be updated so that 'predictive of' gets annotated as 'possible'.
 
I'm so close to getting this resolved, so I would appreciate any assistance.
 
Thanks.
 
Regards,
Paula
 
From: Timothy.Miller@childrens.harvard.edu
To: user@ctakes.apache.org
Subject: Re: cTakes: question on updating cue words
Date: Tue, 24 Dec 2013 14:19:46 +0000






Actually, I think Matt's suggestion is a bit out of date -- during development we removed the dependency on the lucene dictionary lookup and now the under development version does read those psv files directly.




But this still doesn't help Paula since she's trying to run the current release. I thought Matt or Pei might have some info about whether it's possible to modify negation cue words for that release? For example, I can see in the code it uses a "cue model" which can be found in ctakes-assertion-res, but it is a binary and I'm not sure what kind. Is there any way to modify that file?



Tim




