Posted to user@ctakes.apache.org by "Hari, Sekhar" <se...@cgi.com> on 2019/06/04 13:37:06 UTC

cTAKES output

Hi All -

I see something that is not correct in the cTAKES output for the text below. I sincerely hope somebody can guide me here with my questions at the end. Not sure if I'm doing anything wrong with the cTAKES configuration.

Content:
          "Since the last approved labeling, there has been no submission to LEVAQUIN(r)
          NDAs: NDA 20-634 LEVAQUIN(r) (levofloxacin) Tablets, NDA 20-635
          LEVAQUIN(r) (levofloxacin) Injection, NDA 21-721 LEVAQUIN(r)
          (levofloxacin) Oral Solution."

There are several lines after this, but the only brand name mentioned in the whole document is 'LEVAQUIN' and the only generic name is 'levofloxacin'. These names appear in a couple of places in the document, along with some disease names.

Objective:
Retrieve the generic name and brand name from the text using the RXNORM codes returned by cTAKES.

We do a POST of the full text to the API - http://XX.XX.XX.XX/ctakes-web-rest/service/analyze.

...following is the output from API:
{'ANATOMICALSITEMENTION': {'ORAL': ['START: 793', 'END: 797', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI: T030]']}, 'MEDICATIONMENTION': {'TABLETS': ['START: 474', 'END: 481', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 46992007, CUI: C0039225, TUI: T122]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 385055001, CUI: C0039225, TUI: T122]', '[CODINGSCHEME: RXNORM, CODE: 10311, CUI: C0039225, TUI: T122]'], 'INJECTION': ['START: 817', 'END: 826', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 129326001, CUI: C1533685, TUI: T061]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 28289002, CUI: C1533685, TUI: T061]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 59108006, CUI: C1533685, TUI: T061]'], 'DOC': ['START: 1152', 'END: 1155', 'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 56156001, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 1336006, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 1336006, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 75029008, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 56156001, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 56156001, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 1336006, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 75029008, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 75029008, CUI: C0011710, TUI: T121]'], 'SOLUTION': ['START: 576', 'END: 584', 'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 450530, CUI: C1382100, TUI: T122]'], 'ORAL': ['START: 793', 'END: 797', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI: T030]'], 'LEVAQUIN': ['START: 452', 'END: 460', 'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T121]', '[CODINGSCHEME: 
RXNORM, CODE: 217992, CUI: C0721336, TUI: T109]'], 'ADR': ['START: 2454', 'END: 2457', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 68444001, CUI: C0013089, TUI: T195]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 68444001, CUI: C0013089, TUI: T109]', '[CODINGSCHEME: RXNORM, CODE: 3639, CUI: C0013089, TUI: T195]', '[CODINGSCHEME: RXNORM, CODE: 3639, CUI: C0013089, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 372817009, CUI: C0013089, TUI: T195]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 372817009, CUI: C0013089, TUI: T109]']}, 'DRUGCHANGESTATUSANNOTATION': {}, 'STRENGTHANNOTATION': {}, 'FRACTIONSTRENGTHANNOTATION': {}, 'FREQUENCYUNITANNOTATION': {}, 'DISEASEDISORDERMENTION': {'RUPTURE': ['START: 1579', 'END: 1586', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 125671007, CUI: C3203359, TUI: T037]']}, 'SIGNSYMPTOMMENTION': {'RED': ['START: 1081', 'END: 1084', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 386713009, CUI: C0332575, TUI: T033]'], 'CONTENT': ['START: 2992', 'END: 2999', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 271599002, CUI: C0423896, TUI: T041]'], 'HISTORY': ['START: 34', 'END: 41', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 392521001, CUI: C0262926, TUI: T033]']}, 'ROUTEANNOTATION': {}, 'DATEANNOTATION': {}, 'MEASUREMENTANNOTATION': {}, 'PROCEDUREMENTION': {'INJECTION': ['START: 817', 'END: 826', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 129326001, CUI: C1533685, TUI: T061]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 28289002, CUI: C1533685, TUI: T061]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 59108006, CUI: C1533685, TUI: T061]']}, 'TIMEMENTION': {}, 'STRENGTHUNITANNOTATION': {}}

Questions:

1.       How do we restrict the output to show only the RXNORM coding scheme? Please describe with a config change example, if possible.

2.       These are the unique RXNORM codes from the above output: '10311', '3256', '450530', '217992', '3639'. These codes map to the drug names: 'DESOXYCORTICOSTERONE', 'LEVAQUIN', 'DOXORUBICIN'

a.       The text does not mention 'DESOXYCORTICOSTERONE' or 'DOXORUBICIN'. Why is cTAKES reporting them?

b.       The text contains 'levofloxacin', but no RXNORM code is returned for this name. Any idea why?

3.       How do we configure cTAKES so that it returns only codes that are available in the RxTerms dictionary? None of the RXNORM codes reported above are available in RxTerms.

Thanks
Sekhar H.
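
For question 1, one option that needs no cTAKES configuration change is to filter the response on the client side. The following is a minimal Python sketch, assuming the REST response has already been loaded into a dict shaped like the output above (mention type, then covered text, then a list of attribute strings). The function name codes_for_scheme and the sample data are illustrative only and are not part of cTAKES.

```python
import re
from collections import defaultdict

# Matches attribute strings such as
# '[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T121]'.
# \s* tolerates the stray line break that mail clients insert after a colon.
CODE_ENTRY = re.compile(
    r"\[CODINGSCHEME:\s*(?P<scheme>[^,]+),\s*CODE:\s*(?P<code>[^,]+),"
    r"\s*CUI:\s*(?P<cui>[^,]+),\s*TUI:\s*(?P<tui>[^\]]+)\]"
)


def codes_for_scheme(response, scheme="RXNORM"):
    """Return {covered_text: sorted unique codes} for a single coding scheme."""
    found = defaultdict(set)
    for mentions in response.values():            # e.g. 'MEDICATIONMENTION'
        for covered_text, attributes in mentions.items():
            for attribute in attributes:          # 'START: ..', '[CODINGSCHEME: ..]'
                match = CODE_ENTRY.fullmatch(attribute.strip())
                if match and match.group("scheme") == scheme:
                    found[covered_text].add(match.group("code"))
    return {text: sorted(codes) for text, codes in found.items()}


# Abbreviated sample in the same shape as the output quoted above.
sample = {
    "MEDICATIONMENTION": {
        "TABLETS": [
            "START: 474", "END: 481", "POLARITY: 1",
            "[CODINGSCHEME: SNOMEDCT_US, CODE: 46992007, CUI: C0039225, TUI: T122]",
            "[CODINGSCHEME: RXNORM, CODE: 10311, CUI: C0039225, TUI: T122]",
        ],
        "LEVAQUIN": [
            "START: 452", "END: 460", "POLARITY: 1",
            "[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T121]",
            "[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T109]",
        ],
    },
    "ROUTEANNOTATION": {},
}

rxnorm_only = codes_for_scheme(sample)
print(rxnorm_only)
```

Because the same code is repeated once per TUI, the sketch collects codes in a set so each code is reported once per covered text.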


RE: cTAKES output

Posted by "Hari, Sekhar" <se...@cgi.com>.
Also, why did cTAKES not identify ‘levofloxacin’, which is clearly mentioned in the text? It appears within parentheses. Would the parentheses cause a problem?

Thanks
Sekhar Hari | AI Program Lead | Health Sciences R&D | Asia Pacific Solutions Delivery Center
+91 814 7027 779 (C)


RE: cTAKES output

Posted by "Hari, Sekhar" <se...@cgi.com>.
Hi Jessica –

Many thanks for the insight. I see where this is going wrong. Yes, DOC and ADR are present in the text. However, DOC appears as “.doc”, which is a file extension and not a drug in this text. Also, ADR appears in the document as an abbreviation of “Adverse Drug Reaction”.

I think the only way to exclude such words would be to pre-process the text before passing it to cTAKES. Is there any other way within cTAKES to achieve this (for example, passing a file of stop words that includes some of these abbreviations)?

Thanks
Sekhar H.
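
As an alternative to pre-processing, the false positives described here could also be dropped after cTAKES has run, using the START/END offsets it returns. The sketch below is client-side Python, not a cTAKES feature. The ignore list, the file-extension heuristic, and the name keep_mention are all assumptions made for illustration.

```python
# Hypothetical, user-maintained set of covered texts to ignore in this corpus.
IGNORED_TERMS = {"ADR"}


def keep_mention(text, start, end, ignored_terms=IGNORED_TERMS):
    """Return False for mentions that look like known false positives."""
    covered = text[start:end]

    # Rule 1: abbreviations that mean something else in these documents.
    if covered.upper() in ignored_terms:
        return False

    # Rule 2: the span looks like a file extension, e.g. the "doc" in "report.doc":
    # it is directly preceded by a dot that itself follows a letter or digit.
    if start >= 2 and text[start - 1] == "." and text[start - 2].isalnum():
        return False

    return True


# Hypothetical text; the offsets are computed here, not taken from the thread.
document = "See report.doc for each ADR. DOC was given."
extension_start = document.index("doc")
adr_start = document.index("ADR")
drug_start = document.index("DOC")

print(keep_mention(document, extension_start, extension_start + 3))
print(keep_mention(document, adr_start, adr_start + 3))
print(keep_mention(document, drug_start, drug_start + 3))
```

An ignore list like this also discards genuine uses of the abbreviation, so it only suits corpora where "ADR" never refers to the drug.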


Re: cTAKES output

Posted by Jessica Glover <gl...@gmail.com>.
Hi Sekhar,

Do you use the CAS Visual Debugger (CVD), or even a text editor that will
show you the character positions of the document text?
I can see from your output that the evidence spans for each RxNorm code are
annotated.

Code       Evidence span offsets
10311      474-481
3256       1152-1155
450530     576-584
217992     452-460
3639       2454-2457

Look in those places in your document to find out what language is
triggering these codes.
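
If no tool that shows character positions is at hand, the spans can also be checked with a few lines of Python. This is a minimal sketch, assuming START and END are 0-based character offsets into the exact string that was posted, with END exclusive. The helper name show_span and the sample text are made up for illustration.

```python
def show_span(text, start, end, context=30):
    """Return the annotated span, marked with >> <<, plus surrounding context."""
    left = text[max(0, start - context):start]
    right = text[end:end + context]
    return f"{left}>>{text[start:end]}<<{right}"


# Hypothetical document text; the offsets are computed here for the demo
# rather than taken from the output in this thread.
document = "Please see the attached report.doc for the full ADR listing."
start = document.index("doc")

print(show_span(document, start, start + 3, context=10))
```

Running show_span(full_text, 1152, 1155) and show_span(full_text, 2454, 2457) on the original document, where full_text stands for the submitted string, would show the text around the DOC and ADR hits.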

Other things to note:
A common abbreviation of Deoxycorticosterone is "DOC". I bolded where I see
DOC and 3256 in your output. Similarly, "ADR" is another way to express
Doxorubicin, and I've bolded that in your output as well. See below.

{'ANATOMICALSITEMENTION': {'ORAL': ['START: 793', 'END: 797', 'POLARITY:
1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI:
T030]']}, 'MEDICATIONMENTION': {'TABLETS': ['START: 474', 'END: 481',
'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 46992007, CUI: C0039225,
TUI: T122]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 385055001, CUI: C0039225,
TUI: T122]', '[CODINGSCHEME: RXNORM, CODE: 10311, CUI: C0039225, TUI:
T122]'], 'INJECTION': ['START: 817', 'END: 826', 'POLARITY: 1',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 129326001, CUI: C1533685, TUI: T061]',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 28289002, CUI: C1533685, TUI: T061]',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 59108006, CUI: C1533685, TUI: T061]'], '
*DOC*': ['START: 1152', 'END: 1155', 'POLARITY: 1', '[CODINGSCHEME: RXNORM,
CODE: *3256*, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: RXNORM, CODE:
3256, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE:
56156001, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE:
1336006, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE:
1336006, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE:
75029008, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: RXNORM, CODE: 3256,
CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 56156001,
CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 56156001,
CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 1336006,
CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 75029008,
CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 75029008,
CUI: C0011710, TUI: T121]'], 'SOLUTION': ['START: 576', 'END: 584',
'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 450530, CUI: C1382100, TUI:
T122]'], 'ORAL': ['START: 793', 'END: 797', 'POLARITY: 1', '[CODINGSCHEME:
SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI: T030]'], 'LEVAQUIN':
['START: 452', 'END: 460', 'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE:
217992, CUI: C0721336, TUI: T121]', '[CODINGSCHEME: RXNORM, CODE: 217992,
CUI: C0721336, TUI: T109]'], '*ADR*': ['START: 2454', 'END: 2457',
'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 68444001, CUI: C0013089,
TUI: T195]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 68444001, CUI: C0013089,
TUI: T109]', '[CODINGSCHEME: RXNORM, CODE: *3639*, CUI: C0013089, TUI:
T195]', '[CODINGSCHEME: RXNORM, CODE: 3639, CUI: C0013089, TUI: T109]',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 372817009, CUI: C0013089, TUI: T195]',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 372817009, CUI: C0013089, TUI: T109]']},
'DRUGCHANGESTATUSANNOTATION': {}, 'STRENGTHANNOTATION': {},
'FRACTIONSTRENGTHANNOTATION': {}, 'FREQUENCYUNITANNOTATION': {},
'DISEASEDISORDERMENTION': {'RUPTURE': ['START: 1579', 'END: 1586',
'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 125671007, CUI: C3203359,
TUI: T037]']}, 'SIGNSYMPTOMMENTION': {'RED': ['START: 1081', 'END: 1084',
'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 386713009, CUI: C0332575,
TUI: T033]'], 'CONTENT': ['START: 2992', 'END: 2999', 'POLARITY: 1',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 271599002, CUI: C0423896, TUI: T041]'],
'HISTORY': ['START: 34', 'END: 41', 'POLARITY: 1', '[CODINGSCHEME:
SNOMEDCT_US, CODE: 392521001, CUI: C0262926, TUI: T033]']},
'ROUTEANNOTATION': {}, 'DATEANNOTATION': {}, 'MEASUREMENTANNOTATION': {},
'PROCEDUREMENTION': {'INJECTION': ['START: 817', 'END: 826', 'POLARITY: 1',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 129326001, CUI: C1533685, TUI: T061]',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 28289002, CUI: C1533685, TUI: T061]',
'[CODINGSCHEME: SNOMEDCT_US, CODE: 59108006, CUI: C1533685, TUI: T061]']},
'TIMEMENTION': {}, 'STRENGTHUNITANNOTATION': {}}
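
The highlighted entries in the output above pair a covered term with the RXNORM code it produced (DOC with 3256, ADR with 3639). A minimal Python sketch that builds the same pairing for every RXNORM code follows. It assumes the MEDICATIONMENTION block has exactly the shape printed in this thread (term, then a list of strings); the function name rxnorm_sources is only illustrative and is not part of cTAKES.

```python
import re
from collections import defaultdict

# Pulls the code out of an entry such as
# '[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T109]'
RXNORM_CODE = re.compile(r"CODINGSCHEME: RXNORM, CODE: (\d+),")


def rxnorm_sources(medication_mentions):
    """Map each RXNORM code to the covered terms that produced it."""
    sources = defaultdict(set)
    for term, attributes in medication_mentions.items():
        for attr in attributes:
            match = RXNORM_CODE.search(attr)
            if match:
                sources[match.group(1)].add(term)
    return dict(sources)


# Excerpt of the MEDICATIONMENTION block shown above (asterisks removed).
mentions = {
    "DOC": ["START: 1152", "END: 1155", "POLARITY: 1",
            "[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T109]",
            "[CODINGSCHEME: SNOMEDCT_US, CODE: 56156001, CUI: C0011710, TUI: T125]"],
    "LEVAQUIN": ["START: 452", "END: 460", "POLARITY: 1",
                 "[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T121]"],
    "ADR": ["START: 2454", "END: 2457", "POLARITY: 1",
            "[CODINGSCHEME: RXNORM, CODE: 3639, CUI: C0013089, TUI: T195]"],
}

sources = rxnorm_sources(mentions)
for code, terms in sorted(sources.items()):
    print(code, sorted(terms))
```

Listing the covered term next to each code makes it easy to see which span in the document triggered a given RXNORM hit.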


Hope this helps,
Jessica

On Tue, Jun 4, 2019 at 1:40 PM gandhi rajan <ga...@gmail.com> wrote:

> Hi Sekhar,
>
> To answer your first question: as far as I know, there is no config
> change that filters the output. You have to pick out the output you need
> by parsing the output XML.

Re: cTAKES output

Posted by gandhi rajan <ga...@gmail.com>.
Hi Sekhar,

To answer your first question: as far as I know, there is no config change
that filters the output. You have to pick out the output you need by parsing
the output XML.
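
As a rough illustration of that kind of post-processing, here is a minimal Python sketch that keeps only the RXNORM codings. It assumes the response has already been loaded into a dict with the shape printed in the original message (annotation type, then covered term, then a list of strings). The helper name keep_scheme is hypothetical and not a cTAKES API.

```python
import re

# One coding entry as printed in this thread, e.g.
# '[CODINGSCHEME: RXNORM, CODE: 10311, CUI: C0039225, TUI: T122]'
ENTRY = re.compile(
    r"\[CODINGSCHEME: ([^,]+), CODE: ([^,]+), CUI: ([^,]+), TUI: ([^\]]+)\]"
)


def keep_scheme(response, scheme="RXNORM"):
    """Return a filtered copy of `response` with only `scheme` codings."""
    filtered = {}
    for annotation_type, terms in response.items():
        for term, attributes in terms.items():
            kept, has_code = [], False
            for attr in attributes:
                match = ENTRY.fullmatch(attr)
                if match is None:
                    kept.append(attr)  # START / END / POLARITY entries
                elif match.group(1) == scheme:
                    kept.append(attr)
                    has_code = True
            if has_code:  # drop terms with no coding in the wanted scheme
                filtered.setdefault(annotation_type, {})[term] = kept
    return filtered


# Excerpt of the output posted in the original message.
sample = {
    "ANATOMICALSITEMENTION": {
        "ORAL": ["START: 793", "END: 797", "POLARITY: 1",
                 "[CODINGSCHEME: SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI: T030]"],
    },
    "MEDICATIONMENTION": {
        "TABLETS": ["START: 474", "END: 481", "POLARITY: 1",
                    "[CODINGSCHEME: SNOMEDCT_US, CODE: 46992007, CUI: C0039225, TUI: T122]",
                    "[CODINGSCHEME: SNOMEDCT_US, CODE: 385055001, CUI: C0039225, TUI: T122]",
                    "[CODINGSCHEME: RXNORM, CODE: 10311, CUI: C0039225, TUI: T122]"],
        "LEVAQUIN": ["START: 452", "END: 460", "POLARITY: 1",
                     "[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T121]",
                     "[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, TUI: T109]"],
    },
    "TIMEMENTION": {},
}

rxnorm_only = keep_scheme(sample)
print(rxnorm_only)
```

Terms that carry no RXNORM coding (such as ORAL here) and empty annotation types are dropped, while the START, END and POLARITY entries of the remaining terms are kept.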


-- 
Regards,
Gandhi

"The best way to find urself is to lose urself in the service of others !!!"