Posted to commits@uima.apache.org by pk...@apache.org on 2013/06/09 18:23:25 UTC

svn commit: r1491243 - in /uima/sandbox/ruta/trunk/ruta-docbook/src/docbook: tools.ruta.language.anchoring.xml tools.ruta.language.xml

Author: pkluegl
Date: Sun Jun  9 16:23:25 2013
New Revision: 1491243

URL: http://svn.apache.org/r1491243
Log:
UIMA-2975
- added section about matching order

Added:
    uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml
Modified:
    uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml

Added: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml?rev=1491243&view=auto
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml (added)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml Sun Jun  9 16:23:25 2013
@@ -0,0 +1,53 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/ruta/language/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor 
+  license agreements. See the NOTICE file distributed with this work for additional 
+  information regarding copyright ownership. The ASF licenses this file to 
+  you under the Apache License, Version 2.0 (the "License"); you may not use 
+  this file except in compliance with the License. You may obtain a copy of 
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required 
+  by applicable law or agreed to in writing, software distributed under the 
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS 
+  OF ANY KIND, either express or implied. See the License for the specific 
+  language governing permissions and limitations under the License. -->
+
+<section id="ugr.tools.ruta.language.anchoring">
+  <title>Matching order</title>
+  <para>
+    Unless specified otherwise, UIMA Ruta rules start the matching
+    process with their first rule element. The first rule element searches for possible positions for its matching
+    condition and then advises the next rule element to continue the matching process.
+    For that reason, writing rules whose first rule element has an optional quantifier is discouraged:
+    the optional attribute of the quantifier is ignored for that element.
+  </para>
+  <para>
+    The starting rule element can also be manually specified by adding <quote>@</quote> directly in front of the matching condition.
+    In the following example, the rule first searches for capitalized words (CW) and then checks whether 
+    there is a period in front of the matched word.
+    <programlisting><![CDATA[PERIOD @CW;]]></programlisting>
+    This functionality can also be used for rules that start with an optional rule element by manually specifying a later
+    rule element to start the matching process.
+  </para>
+  <para>
+    The choice of the starting rule element can greatly influence the speed of rule execution.
+    The following example with two rules illustrates this, assuming that an annotation
+    of the type <quote>LastToken</quote> has already been added to the last token of the document:
+    <programlisting><![CDATA[ANY LastToken;
+ANY @LastToken;]]></programlisting>
+    The first rule matches on each token of the document and checks whether the next annotation is the last token of the document.
+    This will result in many index operations because all tokens of the document are considered. 
+    The second rule, however, matches on the last token and then checks if there is any token in front of it. This
+    rule, therefore, considers only one token. 
+  </para>
+  <para>
+    The UIMA Ruta language also provides a concept for automatically selecting the starting rule element, called dynamic anchoring.
+    Here, a simple heuristic concerning the position of the rule element and the involved types is applied to identify
+    the favorable rule element. This functionality can be activated in the <link linkend="ugr.tools.ruta.ae.basic.parameter">configuration parameters</link> of the analysis engine or 
+    directly in the script file with the <link linkend="ugr.tools.ruta.language.actions.dynamicanchoring">DYNAMICANCHORING</link> action. 
+  </para>
+</section>
\ No newline at end of file

Modified: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml?rev=1491243&r1=1491242&r2=1491243&view=diff
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml (original)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml Sun Jun  9 16:23:25 2013
@@ -33,7 +33,8 @@ under the License.
 
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
     href="tools.ruta.language.syntax.xml" />
-
+  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+    href="tools.ruta.language.anchoring.xml" />
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
     href="tools.ruta.language.basic_annotations.xml" />
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"