Posted to dev@uima.apache.org by "Michael Stenger (Jira)" <de...@uima.apache.org> on 2021/12/29 09:13:00 UTC
[jira] [Created] (UIMA-6404) Ruta: @ with quantifier ignores matches
Michael Stenger created UIMA-6404:
-------------------------------------
Summary: Ruta: @ with quantifier ignores matches
Key: UIMA-6404
URL: https://issues.apache.org/jira/browse/UIMA-6404
Project: UIMA
Issue Type: Bug
Components: Ruta
Affects Versions: 3.1.0ruta, 2.8.1ruta
Reporter: Michael Stenger
Fix For: 2.9.0ruta, 3.1.1ruta
Hi.
It seems that combining the start anchor (@) with a min/max quantifier causes the interpreter to miss what I would consider matches when @ is placed on an inner rule element, like so:
{code:java}
(W @W W)[2,2];
// or
(W @W W W)[3,4];
// or
(W W @W W)[2,3];
{code}
On the other hand,
{code:java}
(W W W W)[2,2];
{code}
matches passages as expected. I suspect the cause is the changed matching order within the composed rule element when it is applied multiple times.
Minimal Example:
Script:
{noformat}
// annotation types created by the two rules below
DECLARE T1, T2;
(W @W W W)[2,2]{-> T1};
(W W @W W)[2,2]{-> T2};
{noformat}
Text:
{noformat}
omega alpha beta gamma omega alpha beta gamma omega alpha
{noformat}
Expected matches:
* T1, T2: omega alpha beta gamma omega alpha beta gamma
* T1, T2: alpha beta gamma omega alpha beta gamma omega
* T1, T2: beta gamma omega alpha beta gamma omega alpha
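The three expected spans above are simply every window of 2 x 4 = 8 consecutive words in the sample text. The following is a minimal Python sketch of that enumeration, not a reproduction of Ruta's matching algorithm: it assumes whitespace-separated tokens stand in for W annotations and that the anchor only changes where matching starts, not which spans qualify. The function name expected_windows is hypothetical.

```python
def expected_windows(text, group_size=4, repetitions=2):
    """Return every run of group_size * repetitions consecutive words.

    Assumption: splitting on whitespace approximates Ruta's W annotations
    for this sample text, which contains only plain words.
    """
    words = text.split()
    span = group_size * repetitions  # 4 rule elements applied [2,2] times
    return [" ".join(words[i:i + span]) for i in range(len(words) - span + 1)]


text = "omega alpha beta gamma omega alpha beta gamma omega alpha"
windows = expected_windows(text)
for window in windows:
    print(window)
```

With ten words and an eight-word span there are three starting positions, which is where the three expected matches per rule come from.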
Actual matches:
* T2: beta gamma omega alpha beta gamma omega alpha
Alternatively, since I could not find anything in the Guide on the intended behaviour in such cases, the broader question is how the interpreter is supposed to handle @ in a composed rule element that is also quantified. For example, is it supposed to ignore the anchor from the second application (on the same match) onwards?
Best, Michael