Posted to dev@ctakes.apache.org by Sander Puts <sa...@maastro.nl> on 2018/04/20 12:31:22 UTC

Non English custom dictionaries?

Dear,

Is it possible to use the cTAKES Dictionary Creator to build a dictionary for languages other than English?

So far, my custom dictionaries either contain only English terms or are empty.

For example, when I install only https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/LNC-NL-NL/stats.html and build my custom dictionary selecting source/target LNC-NL-NL and TUI T201, the resulting dictionary is empty.

Thanks in advance,

Sander Puts
Software Engineer / Researcher
MAASTRO clinic

Re: Non English custom dictionaries? [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Sander Puts,

The trunk version of the dictionary creator supports languages other than English. You will know you have the correct version if it shows a list of checkboxes with ISO language codes.

When you extract your UMLS .rrf files using the NIH tool, make sure that you select every language you want. Please note that UMLS coverage of non-English languages is a little sparse.
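
One way to check whether the extracted files actually contain the language you expect is to count the rows in MRCONSO.RRF per language and source. The following is only a rough sketch, not part of cTAKES: it assumes the standard pipe-delimited MRCONSO.RRF layout, in which the second field is the language (LAT) and the twelfth is the source abbreviation (SAB). The function names are hypothetical.

```python
import collections


def count_languages(lines):
    """Count MRCONSO.RRF rows per (LAT, SAB) pair.

    Assumes the standard pipe-delimited layout:
    field 1 is LAT (language), field 11 is SAB (source), counting from 0.
    """
    counts = collections.Counter()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 15:
            continue  # skip blank or malformed rows
        counts[(fields[1], fields[11])] += 1
    return counts


def count_languages_in_file(path):
    """Hypothetical helper: run the count over a MRCONSO.RRF file on disk."""
    with open(path, encoding="utf-8") as handle:
        return count_languages(handle)
```

Calling count_languages_in_file on your extracted MRCONSO.RRF and printing the result shows which languages made it into the extract. If the language you selected in the dictionary creator has no rows for your source, an empty dictionary is the expected outcome.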

Sean 

