Posted to dev@uima.apache.org by "Peter Klügl (JIRA)" <de...@uima.apache.org> on 2014/07/02 10:20:24 UTC
[jira] [Comment Edited] (UIMA-3927) Problem with optional quantifiers and starting rule element annotation

    [ https://issues.apache.org/jira/browse/UIMA-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049725#comment-14049725 ] 

Peter Klügl edited comment on UIMA-3927 at 7/2/14 8:18 AM:
-----------------------------------------------------------

Ok, there are two problems. First, there is a bug in the reluctant question mark quantifier (??). Second, there is the way reluctant quantifiers work in general: they do not match if they don't have to, and whether they have to is controlled by the next rule element. In your example there is no such next rule element, which results in no match. Actually, the second rule element should also not match at all. Is there a specific reason why you are using the reluctant version of the quantifier? Something like
{noformat}
Token?{REGEXP(Token.posTag.value, "At")} // Article
Token?{REGEXP(Token.posTag.value, "Aj")} // Adjective
@Token{REGEXP(Token.posTag.value, "No")->MARK(Chunk, 1,3)}; // Noun
{noformat}
should work for you since you actually want to match the optional annotations.
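
As a rough analogy only (this is Python's re module, not Ruta, and Ruta's matching with a start anchor works differently), the following sketch shows the general idea that a reluctant optional element consumes nothing unless a following element forces it to. The helper name optional_match and the sample string are made up for illustration.

```python
import re


def optional_match(pattern: str, text: str) -> str:
    """Return the text consumed by `pattern` at the start of `text` ("" if none)."""
    m = re.match(pattern, text)
    return m.group(0) if m else ""


sample = "At Aj No"  # hypothetical tag sequence: article, adjective, noun

# Greedy optional: takes the article whenever it can.
greedy_alone = optional_match(r"(?:At )?", sample)        # "At "

# Reluctant optional with nothing after it: it does not have to match,
# so it consumes nothing.
reluctant_alone = optional_match(r"(?:At )??", sample)    # ""

# Reluctant optional followed by a required element: the following
# element can only match if the optional part is taken, so it is taken.
reluctant_forced = optional_match(r"(?:At )??Aj", sample)  # "At Aj"
```

In this analogy, the last case corresponds to a "next rule element" that forces the reluctant element to match, and the middle case to the situation where there is none.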


> Problem with optional quantifiers and starting rule element annotation
> ----------------------------------------------------------------------
>
>                 Key: UIMA-3927
>                 URL: https://issues.apache.org/jira/browse/UIMA-3927
>             Project: UIMA
>          Issue Type: Bug
>          Components: ruta
>    Affects Versions: 2.2.0ruta
>            Reporter: Prokopis Prokopidis
>            Assignee: Peter Klügl
>
> Hi,
> As the Ruta documentation mentions, "writing rules that contain a first rule element with an optional quantifier is discouraged and will result in ignoring the optional attribute of the quantifier." A solution for overcoming this is to declare a rule element as a starting rule element by adding “@” directly in front of it. Thus, I am using ruta rules like
> {code}
> Token??{REGEXP(Token.posTag.value, "At")} // Article
> Token??{REGEXP(Token.posTag.value, "Aj")} // Adjective
> @Token{REGEXP(Token.posTag.value, "No")->MARK(Chunk, 1,3)}; // Noun
> {code}
> to mark nouns and optional pre-modifiers before them as chunks.
> However, the rule seems to match only Adjective Noun sequences and does not match input like:
> {code}
> anArt|At anAdj|Aj aNoun|No
> {code}
> Thanks for looking into this.


