Posted to commits@ctakes.apache.org by bu...@apache.org on 2012/11/13 22:06:57 UTC

svn commit: r838273 - in /websites/staging/ctakes/trunk/content: ./ ctakes/3.0.0/user-guide-3.0.html

Author: buildbot
Date: Tue Nov 13 21:06:56 2012
New Revision: 838273

Log:
Staging update by buildbot for ctakes

Modified:
    websites/staging/ctakes/trunk/content/   (props changed)
    websites/staging/ctakes/trunk/content/ctakes/3.0.0/user-guide-3.0.html

Propchange: websites/staging/ctakes/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Nov 13 21:06:56 2012
@@ -1 +1 @@
-1408929
+1408960

Modified: websites/staging/ctakes/trunk/content/ctakes/3.0.0/user-guide-3.0.html
==============================================================================
--- websites/staging/ctakes/trunk/content/ctakes/3.0.0/user-guide-3.0.html (original)
+++ websites/staging/ctakes/trunk/content/ctakes/3.0.0/user-guide-3.0.html Tue Nov 13 21:06:56 2012
@@ -84,360 +84,7 @@
   </div>
   <div id="contenta">
     <h1 id="ctakes-30-user-guide">cTAKES 3.0 User Guide</h1>
-<p>These instructions are for end users. With these instructions you can install
-cTAKES, configure it, and use it to process text (typically text associated
-with a medical record). If you were planning to expand, change, or modify the
-code within cTAKES, refer to the <a href="/display/VKC/cTAKES+2.5+Developer+Install+Instructions">cTAKES 2.5 Developer Install
-Instructions</a>.</p>
-<p>These instructions will cover installation and a test of the main product
-including trained models for sentence detection and tagging parts of speech,
-dictionaries from a subset of the UMLS, a very small subset of the full LVG
-resource, etc. Optional components will also be described.</p>
-<p>Once you have finished installation of cTAKES, you will be able to see what
-cTAKES is capable of. Further exploitation of the software's ability may
-require following a few additional steps involving what dictionaries are being
-used. These are the last steps in these instructions.</p>
-<h2 id="prerequisites">Prerequisites</h2>
-<p>Step</p>
-<p>Example</p>
-<ol>
-<li>Make sure you have Java 1.6 or higher. Most systems come with Java already
-installed.</li>
-</ol>
-<p>Run this command to check your version.</p>
-<p>If you do not you can install Java from
-<a href="http://www.java.com/en/download/faq/develop.xml">java.com</a>.</p>
-<h2 id="install-ctakes">Install cTAKES</h2>
-<p>Step</p>
-<p>Example</p>
-<ol>
-<li>
-<p>Navigate to the <a href="http://sourceforge.net/projects/ohnlp/files/cTAKES">source downloads for a released
-version</a> on SourceForge</p>
-</li>
-<li>
-<p>Download the <strong>cTAKES-2.5.zip</strong> file.</p>
-</li>
-</ol>
-<p>Save the file to a temporary location on your machine.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/cTAKES-2.5" title="-download-progress.jpg?version=1&amp;modificationDate=1336161194000" /></p>
-<ol>
-<li>Unzip (extract the contents of) the compressed file you downloaded into a
-directory that you want to be the cTAKES install location.</li>
-</ol>
-<p>For example, <strong>Windows</strong>:</p>
-<p><strong>Linux</strong>: <br />
-</p>
-<p>This folder we will call <strong><cTAKES_HOME></strong>. You will need to refer to the
-directory later.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/cTAKES-2.5-Extr" title="acting.jpg?version=1&amp;modificationDate=1336163791000" /></p>
-<p>|</p>
-<h2 id="process-documents-using-ctakes">Process documents using cTAKES</h2>
-<p>This version allows you to test most components bundled in cTAKES in two
-different ways:</p>
-<ol>
-<li>Using cTAKES CAS Visual Debugger (CVD) to view the results stored as XCAS files or run the annotators or</li>
-<li>Using cTAKES collection processing engine (CPE) to process documents in cTAKES_HOME/testdata directory</li>
-</ol>
-<h3 id="cas-visual-debugger-cvd">CAS Visual Debugger (CVD)</h3>
-<p>Step</p>
-<p>Example</p>
-<ol>
-<li>Open a command prompt and change to the cTAKES_HOME directory.</li>
-</ol>
-<p><strong>Windows</strong>: <br />
-</p>
-<p><strong>Linux</strong>: <br />
-</p>
-<p><img alt="" src="/images/icons/emoticons/warning.png" /></p>
-<p><strong>Note</strong><br />
-</p>
-<p>cTAKES_HOME must be your current directory unless you are skilled at setting
-paths on your machine.</p>
-<ol>
-<li>Start the CAS Visual Debugger by running this command:</li>
-</ol>
-<p><strong>Windows</strong>: <br />
-</p>
-<p><strong>Linux</strong>: <br />
-</p>
-<p>The application may take a minute to start on slower hardware.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav6ae43931" title="208c6ae30f6e836859a2bb19.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>An analysis engine (AE) needs to be loaded in order to process text.</li>
-</ol>
-<p>Use the <strong>Run</strong> -&gt; <strong>Load AE</strong> menu bar command. Navigate to the file</p>
-<p>Click <strong>Open</strong>.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav74c75ed2" title="8c7c02f2be44fe41d0b65a16.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>Copy the text in the example at the right (next cell) and paste the
-contents into the Text section of CVD, replacing the text that is already
-there.</li>
-</ol>
-<p>This example file can also be found in test data:</p>
-<p>Dr. Nutritious</p>
-<p>Medical Nutrition Therapy for Hyperlipidemia</p>
-<p>Referral from: Julie Tester, RD, LD, CNSD</p>
-<p>Phone contact: (555) 555-1212</p>
-<p>Height: 144 cm Current Weight: 45 kg Date of current weight: 02-29-2001</p>
-<p>Admit Weight: 53 kg BMI: 18 kg/m2</p>
-<p>Diet: General</p>
-<p>Daily Calorie needs (kcals): 1500 calories, assessed as HB + 20% for activity.</p>
-<p>Daily Protein needs: 40 grams, assessed as 1.0 g/kg.</p>
-<p>Pt has been on a 3-day calorie count and has had an average intake of 1100
-calories.</p>
-<p>She was instructed to drink 2-3 cans of liquid supplement to help promote
-weight gain.</p>
-<p>She agrees with the plan and has my number for further assessment. May want a
-Resting</p>
-<p>Metabolic Rate as well. She takes an aspirin a day for knee pain.</p>
-<ol>
-<li>From the menu bar, click <strong>Run</strong> -&gt; <strong>Run AggregatePlaintextProcessor</strong>.</li>
-</ol>
-<p>You'll get a list of all the annotations in the Analysis Results frame.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav2c25a00d" title="f26e72f2c00fae8d12e5c3a5.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>Named entities are now recognized in this clinical document. Annotations of
-MedicationEventMention and EntityMention are created. To find one, in the
-<strong>Analysis Results frame</strong>, click on the key in front of:</li>
-</ol>
-<p>AnnotationIndex</p>
-<p>uima.tcas.Annotation</p>
-<p>edu.mayo.bmi.uima.core.type.textsem.IdentifiedAnnotation</p>
-<p>edu.mayo.bmi.uima.core.type.textsem.EntityMention</p>
-<p>and</p>
-<p>edu.mayo.bmi.uima.core.type.textsem.EventMention</p>
-<p>edu.mayo.bmi.uima.core.type.textsem.EventMention.MedicationEventMention</p>
-<p>Then select <strong>edu.mayo.bmi.uima.core.type.<strong><em>*textsem.</em></strong>*EntityMention</strong> or <em>
-</em>edu.mayo.bmi.uima.core.type.<strong><em>*textsem.</em></strong><em>EventMention.<strong><em>*Medication</em></strong></em>Ev
-entMention**.This will show an Annotation Index in the lower frame. Select any
-annotation in that lower frame and you will see the text discovered in the
-Text frame on the right. You may close CVD if you wish.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddavcfd76769" title="7d1e58970d516f1312a8a6e4.png?version=1&amp;modificationDate=1334686366000" /></p>
-<h3 id="collection-processing-engine-cpe">Collection processing engine (CPE)</h3>
-<p>Step</p>
-<p>Example</p>
-<ol>
-<li>Open a command prompt and change to the cTAKES_HOME directory:</li>
-</ol>
-<p><strong>Windows</strong>: <br />
-</p>
-<p><strong>Linux</strong>: <br />
-</p>
-<p><img alt="" src="/images/icons/emoticons/warning.png" /></p>
-<p><strong>Note</strong><br />
-</p>
-<p>Note that cTAKES_HOME must be your current directory unless you are skilled at
-setting paths on your machine.</p>
-<ol>
-<li>Start the collection processing engine by running this command:</li>
-</ol>
-<p><strong>Windows</strong>: <br />
-</p>
-<p><strong>Linux</strong>: <br />
-</p>
-<p>The application may take a minute to start on slower hardware.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav6ae43931" title="208c6ae30f6e836859a2bb19.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>This will bring up the Collection Processing Engine Configurator. In the
-Menu bar click <strong>File</strong> &gt; <strong>Open CPE Descriptor</strong></li>
-</ol>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddavdc5a7f71" title="827d9e3ef9c3f0c241804365.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>Navigate to the file</li>
-</ol>
-<p>Click <strong>Open</strong>.</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav41b2878f" title="e8e61ee02d2c07e89c106cdd.png?version=1&amp;modificationDate=1334686365000" /></p>
-<ol>
-<li>Click the Play button (green/blue <strong>play arrow</strong> near the bottom).</li>
-</ol>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav9acbe68c" title="6f920e134a8eb49db96025e7.png?version=1&amp;modificationDate=1334686365000" /></p>
-<ol>
-<li>You should see that one document was processed. You did process a
-collection of documents. In this case the collection only contained one just
-to show how to do it. Close the results window.</li>
-</ol>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav42012486" title="af09ef9457b4022474979a83.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>Close the CPE application. You may be prompted to save changes. Since this
-was just a test you may click the <strong>No</strong> button.</li>
-</ol>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddav1c58ee84" title="960f01830ce34429b3d96c3c.png?version=1&amp;modificationDate=1334686366000" /></p>
-<ol>
-<li>Open a new command prompt and change to the <cTAKES_HOME></li>
-</ol>
-<p>No example.</p>
-<ol>
-<li>To test the results there is a comparison tool that will help show that the
-results match expectations with the following syntax:</li>
-</ol>
-<p>Where: <strong><em><First File></em></strong> is the first file to compare; <strong><em><Second File></em></strong> is
-the second file to compare; <strong><em><diff-html></em></strong> is where the results are written
-to</p>
-<p>Copy and paste the example at the right (next cell) which has had our example
-files already substituted into a command prompt to run. In this case we have
-shipped an example of what the output should be for you to compare against.</p>
-<p><strong>Windows</strong>:</p>
-<p><strong>Linux</strong>:</p>
-<ol>
-<li>The resulting file will open for you. Look at the comparison to see the
-annotations resulting from this pipeline.</li>
-</ol>
-<p><strong>Windows:</strong></p>
-<p><strong>Linux</strong>:</p>
-<p><img alt="screenshot illustrating step" src="/download/attachments/75014322/worddavefef1435" title="06073a5a3d6c91cbbab2c686.png?version=1&amp;modificationDate=1334686366000" /></p>
-<p>Using the same CVD and CPE programs in the manner described above, you can
-test all the other components. The analysis engines and collection processing
-engines shipped with cTAKES for some of the annotators are described in the
-following table.</p>
-<p>Annotator</p>
-<p>Description</p>
-<p>Abbreviated</p>
-<p>Example Analysis Engine (AE)</p>
-<p>Example Collection processing Engine (CPE)</p>
-<p>Example test data</p>
-<p>Clinical Document Pipeline</p>
-<p>the complete cTAKES pipeline to obtain majority of cTAKES annotations</p>
-<p>cdp</p>
-<p>cTAKES_HOME/cTAKESdesc/cdpdesc/analysis_engine/AggregatePlaintextProcessor.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/cdpdesc/collection_processing_engine/test_plaintext.xml</p>
-<p>cTAKES_HOME/testdata/cdptest</p>
-<p>Chunker</p>
-<p>obtain cTAKES chunking annotations</p>
-<p>chunker</p>
-<p>cTAKES_HOME/cTAKESdesc/chunkerdesc/analysis_engine/ChunkerAggregate.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/chunkerdesc/collection_processing_engine/ChunkerCPE.xml</p>
-<p>cTAKES_HOME/testdata/chunkertest</p>
-<p>Dependency Parser</p>
-<p>obtain dependency parsing tree</p>
-<p>dp</p>
-<p>cTAKES_HOME/cTAKESdesc/dpdesc/analysis_engine/ClearParserTokenizedInfPosAggreg
-ate.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/dpdesc/collection_processing_engine/ClearParserCPE.xml</p>
-<p>cTAKES_HOME/testdata/dptest</p>
-<p>Drug NER</p>
-<p>the annotator to obtain drug annotations</p>
-<p>drugner</p>
-<p>cTAKES_HOME/cTAKESdesc/drugnerdesc/analysis_engine/DrugAggregatePlaintextProce
-sor.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/drugnerdesc/collection_processing_engine/DrugNER_PlainT
-ext_CPE.xml</p>
-<p>cTAKES_HOME/testdata/drugnertest</p>
-<p>Dictionary Lookup</p>
-<p>mapping cTAKES annotations to dictionaries (e.g., SNOMED_CT or RxNorm</p>
-<p>lookup</p>
-<p>cTAKES_HOME/cTAKESdesc/lookupdesc/analysis_engine/TestAggregateTAE.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/lookupdesc/collection_processing_engine/LookupCPE.xml</p>
-<p>cTAKES_HOME/testdata/lookuptest</p>
-<p>PAD Term Spotter</p>
-<p>identifying terms related to PAD</p>
-<p>pad</p>
-<p>cTAKES_HOME/cTAKESdesc/paddesc/analysis_engine/Radiology_TermSpotterAnnotatorT
-AE.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/paddesc/collection_processing_engine/Radiology_Sample.x
-ml</p>
-<p>cTAKES_HOME/testdata/padtest</p>
-<p>Smoking Status</p>
-<p>the annotator to obtain document or patient-level smoking status</p>
-<p>smoking</p>
-<p>cTAKES_HOME/cTAKESdesc/smokingdesc/analysis_engine/SimulatedProdSmokingTAE.xml</p>
-<p>cTAKES_HOME/cTAKESdesc/smokingdesc/collection_processing_engine/Sample_Smoking
-Status_output_flatfile.xml</p>
-<p>cTAKES_HOME/testdata/smokingtest</p>
-<p>Side Effect</p>
-<p>the annotator to find side effect mentions and sentences from clinical
-documents</p>
-<p>sideeffect</p>
-<p>cTAKES_HOME/cTAKESdesc/sideeffectdesc/analysis_engine/SideEffectAggregateTAE.x
-ml</p>
-<p>cTAKES_HOME/cTAKESdesc/sideeffectdesc/collection_processing_engine/SideEffectC
-PE.xml</p>
-<p>cTAKES_HOME/testdata/sideeffecttest</p>
-<h2 id="next-steps">Next Steps</h2>
-<p>The <a href="/display/VKC/cTAKES+2.5+Component+Use+Guide">cTAKES 2.5 Component Use
-Guide</a> will help you to
-understand in great detail each of the cTAKES components that have been
-installed. In some cases you can learn how to improve the components. However,
-before you go on to process text in production you will need to consider
-dictionaries and models.</p>
-<h3 id="dictionaries">Dictionaries</h3>
-<h4 id="bundled-umls-dictionaries">Bundled UMLS Dictionaries</h4>
-<p>cTAKES includes the complete UMLS (SNOMED-CT and RxNorm) dictionaries.</p>
-<ul>
-<li>An rxnorm_index database (a Lucene index) containing drug names from RxNorm</li>
-<li>A UMLS database (using two hsqldb tables) containing anatomical sites, procedures, signs/symptoms, and disorders/diseases from SNOMED-CT (umls_ms_2011ab)</li>
-</ul>
-<p>To use them, you must have a UMLS username and password, and an Internet
-connection.</p>
-<p><img alt="" src="/images/icons/emoticons/warning.png" /></p>
-<p><strong>Note</strong><br />
-If you do not have a UMLS username and password, you may request one at <a href="https://uts.nlm.nih.gov/license.html">UMLS
-Terminology Services</a></p>
-<p>In order to use the UMLS dictionaries shipped with cTAKES you will need to do
-two things:</p>
-<p>(1) Change the UMLSUser and UMLSPW <nameValuePair> strings in these descriptor
-files with your UMLS username and password.</p>
-<ul>
-<li>Dictionary Lookup: <cTAKES_HOME>/cTAKESdesc/lookupdesc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml</li>
-<li>(optional) Drug NER: <cTAKES_HOME>/cTAKESdesc/drugnerdesc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml</li>
-</ul>
-<p>The following shows where in the files you would make the changes. (Do not
-change the <configurationParameters> by the same name.)</p>
-<p>(2) Include the DictionaryLookupAnnotatorUMLS.xml Analysis Engine within your
-aggregate Analysis Engine or switch to the ones provided by cTAKES. cTAKES has
-provided duplicates of shipped Analysis Engine descriptors, put UMLS in the
-name, and placed DictionaryLookupAnnotatorUMLS.xml within them for these
-components:</p>
-<ul>
-<li>Dictionary Lookup</li>
-<li>Clinical Documents pipeline</li>
-<li>Drug NER</li>
-<li>Side Effect</li>
-</ul>
-<p>So you simply need to switch to using those descriptors. For example, if you
-were using AggregateCdaProcessor.xml in the Clinical Documents pipeline you
-would switch to using AggregateCdaUMLSProcessor.xml instead and you will now
-hook into the complete dictionaries.</p>
-<p>You can, of course, modify your own aggregate Analysis Engine files and place
-the DictionaryLookupAnnotatorUMLS.xml Analysis Engine within them.</p>
-<p>Since this is an in-memory database implementation, please be patient during
-the initial load as it could take approximately 20-30 seconds for the database
-to initialize.</p>
-<p>If you would like to go back to using the small sample dictionaries that do
-not require a UMLS username, use the DictionaryLookupAnnotator.xml (UMLS is
-not in the file name) Analyis Engine descriptor in your aggregate. Just
-removing your password from the DictionaryLookupAnnotatorUMLS.xml files will
-not switch you back to the small sample dictionaries.</p>
-<h4 id="lvg">LVG</h4>
-<p>We have successfully tested the 2008 release of the full <a href="http://lexsrv2.
-nlm.nih.gov/LexSysGroup/Projects/lvg/current/docs/userDoc/tools/lvg.html">LVG</a>
-data. In order to use this release of the full LVG data you should:</p>
-<ol>
-<li>Download either the full version or the lite version from <a href="http://lexsrv2.nlm.nih.gov/LexSysGroup/Projects/lvg/2008/web/download.html">NIH Lexical Tools</a></li>
-<li>Extract the TGZ file that you downloaded with a tool like 7-zip (available online) to a temporary directory. On some operating systems, like Windows, this may need to be done in two steps, 1) to uncompress and 2) to unzip.</li>
-<li>Replace the directory <cTAKES_HOME>/resources/lvgresources/lvg/data/HSqlDb with data/HSqlDb from your extracted download. Replacing the entire directory is appropriate.</li>
-<li>In the future, you can upgrade to later versions of LVG by editing the <cTAKES_HOME>/resources/lvgresources/lvg/data/config/lvg.properties file, replacing "lvg2008" with the name of the new release.</li>
-</ol>
-<h4 id="building-your-own-dictionaries">Building Your Own Dictionaries</h4>
-<p>To install customized dictionaries for RxNorm, SNOMED-CT, or other
-vocabularies that are available through the UMLS, see the following posts on
-the cTAKES forums:</p>
-<ul>
-<li><a href="https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&amp;t=423">https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&amp;t=423</a></li>
-<li><a href="https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&amp;t=80&amp;start=20#p1459">https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&amp;t=80&amp;start=20#p1459</a></li>
-</ul>
-<h3 id="models">Models</h3>
-<p>Some models included in cTAKES may not represent your data distribution well.
-If you want to build or train your own models, please read the <a href="/display/VKC/cTAKES+2.5+Component+Use+Guide">cTAKES 2.5
-Component Use Guide</a>,
-particularly:</p>
-<ul>
-<li><a href="https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+-+Core#cTAKES2.5-Core-ToolsTrainingasentencedetectormodel">Training a sentence detector model</a></li>
-<li>Training a Part of Speech (POS) tagger model (Building a model Obtaining training data)</li>
-<li>Creating a Part of Speech (POS) tag dictionary (Building a tag dictionary)</li>
-<li>Training a chunker model (Building a model - Prepare GENIA training data)</li>
-<li>Training a dependency parser (Dependency Parser)</li>
-</ul>
+<p>Coming soon ...</p>
   </div>
  
  <div id="footera">