Posted to dev@uima.apache.org by "Mario Juric (Jira)" <de...@uima.apache.org> on 2019/10/23 10:58:00 UTC

[jira] [Created] (UIMA-6137) Type-based filtering in Ruta rules

Mario Juric created UIMA-6137:
---------------------------------

             Summary: Type-based filtering in Ruta rules
                 Key: UIMA-6137
                 URL: https://issues.apache.org/jira/browse/UIMA-6137
             Project: UIMA
          Issue Type: New Feature
          Components: Ruta
            Reporter: Mario Juric


The visibility concept in Ruta is not type-based but coverage-based: filtered types hide the area they cover from the Ruta rules, i.e. these areas become invisible to the rules.

We have a use case where we only want to hide the types themselves from the rules, not the text area they cover. Other types found in these areas should still be considered by the rules.
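
To make the difference concrete, here is a small toy model in Python. It is only a sketch and not how Ruta implements visibility: the Annotation class, both functions, and the type names Title and Keyword are hypothetical, and "covered" is simplified to "overlaps a filtered annotation".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    type: str
    begin: int
    end: int


def visible_coverage_based(annotations, filtered_types):
    """Hide every annotation that overlaps the span of a filtered annotation."""
    hidden_spans = [(a.begin, a.end) for a in annotations if a.type in filtered_types]

    def overlaps_hidden(a):
        return any(a.begin < end and begin < a.end for begin, end in hidden_spans)

    return [a for a in annotations if not overlaps_hidden(a)]


def visible_type_based(annotations, filtered_types):
    """Hide only the annotations whose own type is filtered (the requested behavior)."""
    return [a for a in annotations if a.type not in filtered_types]


# Hypothetical data: one Keyword lies inside a Title, one lies outside.
annotations = [
    Annotation("Title", 0, 20),
    Annotation("Keyword", 5, 12),
    Annotation("Keyword", 30, 38),
]
filtered = {"Title"}

coverage = visible_coverage_based(annotations, filtered)
type_only = visible_type_based(annotations, filtered)
```

In this model the coverage-based filter also drops the Keyword inside the Title, while the type-based filter drops only the Title and keeps both Keywords.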

We use Ruta as part of a normalization process in which different text areas are marked with annotations associated with the tags in the original content (title, abstract/summary, body, COI, authors, citations etc.), and Ruta is part of the parsing process that produces this view. Using only the content annotations, Ruta is then used to mark up which areas to include in a new view for NLP. This approach gives us maximum traceability of the normalization process.

However, the different types of content annotations can sometimes interfere with the rules in ways beyond our control. Our current solution leads to more awkward rules that are hard to verify, and also to a less performant implementation. In our case the problem would be better solved if we could simply tell Ruta to ignore certain types, i.e. make them invisible to the Ruta rules. Preferably, we want to be able to add and remove filtered types in the script, similar to how it works with the coverage-based type filter.
Please also see this mailing list thread, where a toy example of the problem is discussed:
 
https://lists.apache.org/thread.html/604417ac76ab85fc8d87eef12d4696b89d3257b7a53719518d9f5408@%3Cuser.uima.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)