Posted to commits@uima.apache.org by pk...@apache.org on 2013/06/09 18:23:25 UTC
svn commit: r1491243 - in /uima/sandbox/ruta/trunk/ruta-docbook/src/docbook:
tools.ruta.language.anchoring.xml tools.ruta.language.xml
Author: pkluegl
Date: Sun Jun 9 16:23:25 2013
New Revision: 1491243
URL: http://svn.apache.org/r1491243
Log:
UIMA-2975
- added section about matching order
Added:
uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml
Modified:
uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml
Added: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml?rev=1491243&view=auto
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml (added)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.anchoring.xml Sun Jun 9 16:23:25 2013
@@ -0,0 +1,53 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/ruta/language/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
+%uimaents;
+]>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
+ license agreements. See the NOTICE file distributed with this work for additional
+ information regarding copyright ownership. The ASF licenses this file to
+ you under the Apache License, Version 2.0 (the "License"); you may not use
+ this file except in compliance with the License. You may obtain a copy of
+ the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+ by applicable law or agreed to in writing, software distributed under the
+ License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
+ OF ANY KIND, either express or implied. See the License for the specific
+ language governing permissions and limitations under the License. -->
+
+<section id="ugr.tools.ruta.language.anchoring">
+ <title>Matching order</title>
+ <para>
+ Unless specified otherwise, UIMA Ruta rules start the matching
+ process with their first rule element. The first rule element searches for possible positions of its matching
+ condition and then instructs the next rule element to continue the matching process.
+ For that reason, writing rules whose first rule element has an optional quantifier is discouraged:
+ the optional part of the quantifier is ignored in that case.
+ </para>
+ <para>
+ The starting rule element can also be manually specified by adding <quote>@</quote> directly in front of the matching condition.
+ In the following example, the rule first searches for capitalized words (CW) and then checks whether
+ there is a period in front of the matched word.
+ <programlisting><![CDATA[PERIOD @CW;]]></programlisting>
+ This functionality can also be used for rules that start with an optional rule element by manually specifying a later
+ rule element to start the matching process.
+ </para>
+ <para>
+ The choice of the starting rule element can greatly influence the performance of the rule execution.
+ This is illustrated by the following example with two rules, which assumes that an annotation
+ of the type <quote>LastToken</quote> has already been added to the last token of the document:
+ <programlisting><![CDATA[ANY LastToken;
+ANY @LastToken;]]></programlisting>
+ The first rule matches on each token of the document and checks whether the next annotation is the last token of the document.
+ This will result in many index operations because all tokens of the document are considered.
+ The second rule, however, matches on the last token and then checks if there is any token in front of it. This
+ rule, therefore, considers only one token.
+ </para>
+ <para>
+ The UIMA Ruta language also provides a mechanism for automatically selecting the starting rule element, called dynamic anchoring.
+ Here, a simple heuristic based on the position of the rule element and the involved types is applied in order to identify
+ the most favorable rule element. This functionality can be activated in the <link linkend="ugr.tools.ruta.ae.basic.parameter">configuration parameters</link> of the analysis engine or
+ directly in the script file with the <link linkend="ugr.tools.ruta.language.actions.dynamicanchoring">DYNAMICANCHORING</link> action.
+ </para>
+</section>
\ No newline at end of file
Modified: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml?rev=1491243&r1=1491242&r2=1491243&view=diff
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml (original)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml Sun Jun 9 16:23:25 2013
@@ -33,7 +33,8 @@ under the License.
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
href="tools.ruta.language.syntax.xml" />
-
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+ href="tools.ruta.language.anchoring.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
href="tools.ruta.language.basic_annotations.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"