Posted to dev@ctakes.apache.org by "Miller, Timothy" <Ti...@childrens.harvard.edu> on 2018/11/07 16:53:00 UTC

Re: Recognising Concept and its Value for text without space [EXTERNAL]

Hi Zakir,
I think the problem here is that the default tokenizer will never split a string like POD10 into ['POD', '10'], since there is no whitespace between the two parts. The dictionary lookup uses tokens as its unit of analysis, so unless POD10 itself is in the dictionary database you will not get a hit for POD (which I assume is what you wanted). The only solution I can think of is to write your own tokenizer class, swap it in for the default tokenizer, and re-run your pipeline.
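
To make the idea concrete, here is a minimal sketch of just the splitting step such a tokenizer would need. It is plain Java and does not use any cTAKES classes; the class name AlnumSplitter and its split method are made up for illustration, and wiring the result into a real tokenizer annotator is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of cTAKES: breaks text into pieces at
// whitespace and also at every letter/digit boundary.
class AlnumSplitter {
    // One piece is a run of letters, a run of digits, or any other
    // single non-whitespace character (punctuation, symbols).
    private static final Pattern PIECE =
            Pattern.compile("\\p{L}+|\\p{Nd}+|[^\\p{L}\\p{Nd}\\s]");

    static List<String> split(String text) {
        List<String> pieces = new ArrayList<>();
        Matcher m = PIECE.matcher(text);
        while (m.find()) {
            // m.start() and m.end() give the character offsets of the piece,
            // which a real tokenizer would need to create token annotations.
            pieces.add(m.group());
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(split("POD10 Pulse120")); // prints [POD, 10, Pulse, 120]
    }
}
```

One caveat with splitting this aggressively: terms that legitimately mix letters and digits (B12, for example) would be split as well, so a dictionary entry for the unsplit form would no longer match as a single token.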
Tim

-----Original Message-----
From: Zakir Saifi <zakir.saifi@raxa.com>
Reply-to: <de...@ctakes.apache.org>
To: dev@ctakes.apache.org
Subject: Recognising Concept and its Value for text without space [EXTERNAL]
Date: Thu, 1 Nov 2018 16:38:41 +0530

Hi, everyone. I want cTAKES to recognise a concept and its value in text
where there is no space between the concept and its value, e.g. POD10
(Post Operative Day 10) or Pulse120. How can I achieve this in cTAKES?