Posted to commits@ctakes.apache.org by bu...@apache.org on 2012/11/15 23:40:53 UTC

svn commit: r838537 - in /websites/staging/ctakes/trunk/content: ./ ctakes/2.6.0/ctakes-2.6-Constituency-Parser.html

Author: buildbot
Date: Thu Nov 15 22:40:52 2012
New Revision: 838537

Log:
Staging update by buildbot for ctakes

Added:
    websites/staging/ctakes/trunk/content/ctakes/2.6.0/ctakes-2.6-Constituency-Parser.html
Modified:
    websites/staging/ctakes/trunk/content/   (props changed)

Propchange: websites/staging/ctakes/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Nov 15 22:40:52 2012
@@ -1 +1 @@
-1410075
+1410076

Added: websites/staging/ctakes/trunk/content/ctakes/2.6.0/ctakes-2.6-Constituency-Parser.html
==============================================================================
--- websites/staging/ctakes/trunk/content/ctakes/2.6.0/ctakes-2.6-Constituency-Parser.html (added)
+++ websites/staging/ctakes/trunk/content/ctakes/2.6.0/ctakes-2.6-Constituency-Parser.html Thu Nov 15 22:40:52 2012
@@ -0,0 +1,140 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+ 
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+ 
+       http://www.apache.org/licenses/LICENSE-2.0
+ 
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+<link href="/ctakes/css/ctakes.css" rel="stylesheet" type="text/css">
+
+<title>cTAKES 2.6 Constituency Parser</title>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+
+</head>
+ 
+<body>
+ <div class="banner">
+      <div id="bannerleft">
+		<a href="http://www.apache.org/"><img src="http://www.apache.org/images/asf_logo_wide.gif" alt="The Apache Software Foundation" border="0"/></a>
+	<br/>
+			<img alt="cTAKES logo" src="/ctakes/images/ctakes_logo.jpg" border="0"/>
+      </div>  
+    <div id="bannerright">	
+	      <img id="asf-logo" alt="Apache Incubator" src="http://incubator.apache.org/images/egg-logo.png" border="0"/>
+	  </div>
+ </div>  
+  <div id="clear"></div>
+
+
+  <div id="sidenav">
+    <h1 id="general">General</h1>
+<ul>
+<li><a href="/ctakes/index.html">About</a></li>
+<li><a href="/ctakes/gettingstarted.html">Getting Started</a></li>
+<li><a href="/ctakes/downloads.html">Downloads</a></li>
+<li><a href="/ctakes/glossary.html">Glossary</a></li>
+</ul>
+<h1 id="community">Community</h1>
+<ul>
+<li><a href="/ctakes/get-involved.html">Get Involved</a></li>
+<li><a href="https://issues.apache.org/jira/browse/ctakes">Bug Tracker</a></li>
+<li><a href="/ctakes/mailing-lists.html">Mailing Lists</a></li>
+<li><a href="/ctakes/people.html">People</a></li>
+<li><a href="http://incubator.apache.org/projects/ctakes.html">Incubator page</a></li>
+<li><a href="/ctakes/license.html">License</a></li>
+<li><a href="/ctakes/history.html">History</a></li>
+<li><a href="/ctakes/community-faqs.html">Community FAQs</a></li>
+</ul>
+<h1 id="users">Users</h1>
+<ul>
+<li><a href="/ctakes/userguide.html">User Guide</a></li>
+<li><a href="/ctakes/user-faqs.html">User FAQs</a></li>
+</ul>
+<h1 id="developers">Developers</h1>
+<ul>
+<li><a href="/ctakes/developerguide.html">Developer Guide</a></li>
+<li><a href="/ctakes/developer-faqs.html">Developer FAQs</a></li>
+</ul>
+<h1 id="ppmc">PPMC</h1>
+<ul>
+<li><a href="/ctakes/ppmc-faqs.html">PPMC FAQs</a></li>
+<li><a href="/ctakes/ctakes-release-guide.html">Release Guide</a> <br />
+</li>
+</ul>
+<h1 id="asf">ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li>
+</ul>
+  </div>
+  <div id="contenta">
+    <h1 id="ctakes-26-constituency-parser-optional">cTAKES 2.6 - Constituency Parser (optional)</h1>
+<h2 id="overview-of-constituency-parser">Overview of Constituency Parser</h2>
+<p>This parser is a wrapper around the OpenNLP parser.</p>
+<p>Because this component relies on the output of other components (mainly
+sentence detection and tokenization), it contains configuration files that
+point at those components. These files use relative path names for portability,
+which requires that the project be extracted at the same level as the other
+cTAKES components. For example, if your directory structure is:</p>
+<p>ctakes/core</p>
+<p>ctakes/clinical documents pipeline</p>
+<p>ctakes/...</p>
+<p>then after extracting this component it should also contain:</p>
+<p>ctakes/Constituency Parser</p>
+<p>Once placed there, the component can be imported into Eclipse using File -&gt;
+Import -&gt; Existing Projects into Workspace...</p>
+<p>The constituency parser component includes a few different UIMA analysis
+engines (AEs) for different use cases:</p>
+<ul>
+<li>AggregateParsingProcessor is mostly for testing and validation. You can run it in the UIMA CAS Visual Debugger (CVD) using the CVD launch configuration in resources/launch/UIMA_CVD-Constituency Parser.launch: right-click the file in the Package Explorer -&gt; Run As... -&gt; UIMA_CVD-Constituency Parser.launch.</li>
+</ul>
+<p>Once the CVD window opens, load the AE with Run -&gt; Load AE... and navigate to:</p>
+<p>Constituency Parser/desc/analysis_engines/AggregateParsingProcessor.xml</p>
+<p>Load some text, either by entering it manually or with File -&gt; Open text
+file..., then choose Run -&gt; Run AE.</p>
+<ul>
+<li>ConstituencyParserAnnotator.xml is a standalone annotator that is meant to be incorporated into cTAKES pipelines (for example, upstream of the coreference component).</li>
+</ul>
+<p>Both of the above AEs assume some pre-processing as input, namely sentence and
+token segmentation, and the quality of their output depends on the quality of
+those components. On some notes the sentence segmenter does not work reliably,
+and the parser then performs poorly (UPMC notes are known to cause
+trouble).</p>
+<ul>
+<li>ParserEvaluatorAnnotator.xml is used mainly internally for evaluating the parser but may be useful for anyone else interested in parser research. It is used as part of the collection processing engine ParsingCPE.xml, which reads a line at a time from a file, where each line contains whitespace-separated tokens of a single sentence.</li>
+</ul>
+<p>Parser models: This release contains two different models. The default is
+located in resources/parsermodel and is used if no configuration settings are
+changed. It is trained on a combination of domain-specific and general-domain
+text. The domain-specific text includes clinical notes, Medpedia articles,
+cohort queries, and clinical questions. The general-domain text is the Wall
+Street Journal section of the Penn Treebank.</p>
+<p>The second model is in resources/fastmodel. This model is trained only on the
+in-domain data. As a result, in our preliminary (unpublished) experiments it
+was a little less accurate and a little faster than the default model.</p>
+  </div>
+ 
+ <div id="footera">
+    <div id="copyrighta">
+      <p>Copyright &#169; 2011 The Apache Software Foundation, Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.<br/>Apache and the Apache feather logo are trademarks of The Apache Software Foundation.</p>
+    </div>
+ </div>
+ 
+</body>
+</html>
+
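
The page added in this commit describes the input expected by ParsingCPE.xml: a file read one line at a time, where each line holds the whitespace-separated tokens of a single sentence. As a rough illustration of that format only (not of how the CPE itself reads it), a minimal Python sketch might look like the following. The function name and the sample sentences are hypothetical, and skipping blank lines is an assumption; the page does not say how blank lines are treated.

```python
from typing import Iterable, List


def read_tokenized_sentences(lines: Iterable[str]) -> List[List[str]]:
    """Turn pre-tokenized lines into a list of token lists.

    Each input line is taken to be one sentence whose tokens are
    separated by whitespace, as the page describes for ParsingCPE.xml.
    """
    sentences = []
    for line in lines:
        tokens = line.split()  # splits on any run of whitespace
        if tokens:  # assumption: blank lines carry no sentence and are skipped
            sentences.append(tokens)
    return sentences


# Hypothetical sample input; a real file would be opened and iterated instead.
sample = [
    "The patient denies chest pain .\n",
    "\n",
    "No  evidence of pneumonia .\n",
]
sentences = read_tokenized_sentences(sample)
```

Passing an open file object in place of `sample` should work the same way, since iterating a text file yields one line at a time.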