Posted to dev@uima.apache.org by "Michael Stenger (Jira)" <de...@uima.apache.org> on 2021/12/29 09:13:00 UTC

[jira] [Created] (UIMA-6404) Ruta: @ with quantifier ignores matches

Michael Stenger created UIMA-6404:
-------------------------------------

             Summary: Ruta: @ with quantifier ignores matches
                 Key: UIMA-6404
                 URL: https://issues.apache.org/jira/browse/UIMA-6404
             Project: UIMA
          Issue Type: Bug
          Components: Ruta
    Affects Versions: 3.1.0ruta, 2.8.1ruta
            Reporter: Michael Stenger
             Fix For: 2.9.0ruta, 3.1.1ruta


Hi.

It seems that combining the start anchor (@) with a min-max quantifier causes the interpreter to miss what I would consider matches when @ is placed on an inner rule element of the quantified composed element, like so:
{code:java}
(W @W W)[2,2];
// or
(W @W W W)[3,4];
// or
(W W @W W)[2,3];
{code}
On the other hand,
{code:java}
(W W W W)[2,2];
{code}
matches passages as expected. I suspect the difference is caused by the changed matching order within the composed rule element when it is applied multiple times.

Minimal Example:

Script:
{noformat}
(W @W W W)[2,2]{-> T1};
(W W @W W)[2,2]{-> T2};{noformat}
Text:
{noformat}
omega alpha beta gamma omega alpha beta gamma omega alpha{noformat}
Expected matches:
 * T1, T2: omega alpha beta gamma omega alpha beta gamma
 * T1, T2: alpha beta gamma omega alpha beta gamma omega
 * T1, T2: beta gamma omega alpha beta gamma omega alpha

Actual matches:
 * T2: beta gamma omega alpha beta gamma omega alpha
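
For reference, the expected matches listed above can be derived with a small Python sketch. It is not Ruta and says nothing about how the interpreter works internally. It assumes that every whitespace-separated token is a W, and that the anchor only changes where matching starts, not which spans qualify, so each rule should cover every run of 4 x 2 = 8 consecutive words. The name expected_spans is made up for this illustration.

```python
def expected_spans(text, words_per_group=4, repetitions=2):
    """Return every run of words_per_group * repetitions consecutive words.

    Assumption: each whitespace-separated token counts as one W, and the
    position of @ inside the group does not change which spans match.
    """
    words = text.split()
    window = words_per_group * repetitions
    # Slide a window of `window` words over the text, one word at a time.
    return [" ".join(words[i:i + window])
            for i in range(len(words) - window + 1)]


TEXT = "omega alpha beta gamma omega alpha beta gamma omega alpha"

for span in expected_spans(TEXT):
    print(span)
```

With the 10-word example text this yields three spans, which are the three expected matches for both T1 and T2.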

Since I could not find anything in the Guide on the intended behaviour in such cases, the broader question is how the interpreter is supposed to handle @ in a composed rule element that is also quantified. For example, is it supposed to ignore the anchor from the second application (on the same match) onwards?

Best, Michael



--
This message was sent by Atlassian Jira
(v8.20.1#820001)