Posted to dev@uima.apache.org by "Peter Klügl (JIRA)" <de...@uima.apache.org> on 2014/01/08 15:03:53 UTC

[jira] [Commented] (UIMA-3530) UIMA Rute - allow WORDLIST and WORDTABLE files to include not just plain text to be matched but also regular expressions

    [ https://issues.apache.org/jira/browse/UIMA-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865452#comment-13865452 ] 

Peter Klügl commented on UIMA-3530:
-----------------------------------

That is not as simple as it seems in the current implementation, because the dictionaries are internally compiled into a trie, a tree structure of characters. I will try to think of something, and all suggestions are welcome. In the meantime, you could use the simple regexp rules: http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.language.regexprule
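
To illustrate why a character trie does not mix easily with regular expressions, here is a minimal sketch of such a structure. It is not the Ruta implementation; the names TrieNode, build_trie and contains are made up for this example. Every entry is stored one character at a time, so a pattern placed in the word list is treated as a literal string.

```python
class TrieNode:
    """One node of a character trie (hypothetical, not Ruta's class)."""

    def __init__(self):
        self.children = {}      # maps a single character to the next node
        self.is_entry = False   # True if a dictionary entry ends at this node


def build_trie(words):
    """Compile a list of dictionary entries into a character trie."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            # Each character is stored literally, including '(' or '|'.
            node = node.children.setdefault(ch, TrieNode())
        node.is_entry = True
    return root


def contains(root, text):
    """Return True if text is exactly one of the stored entries."""
    node = root
    for ch in text:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_entry


# The third entry looks like a regular expression but is stored as plain text.
trie = build_trie(["walk", "walks", "walk(ed|ing)"])
print(contains(trie, "walks"))         # plain entry
print(contains(trie, "walked"))        # would need regex semantics
print(contains(trie, "walk(ed|ing)"))  # the pattern as a literal string
```

In this sketch "walked" is not found, because the lookup follows one literal character per edge and has no notion of alternation or repetition. Supporting patterns would require either expanding them into plain entries before building the trie or a different matching structure.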

> UIMA Rute - allow WORDLIST and WORDTABLE files to include not just plain text to be matched but also regular expressions 
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: UIMA-3530
>                 URL: https://issues.apache.org/jira/browse/UIMA-3530
>             Project: UIMA
>          Issue Type: Wish
>          Components: ruta
>            Reporter: Dimitris Vassos
>            Priority: Minor
>
> It would greatly speed up and simplify the implementation of dictionary lookups using WORDLIST and WORDTABLE, if instead of just plain text entries in the file we could enter regular expressions.
> Especially for inflectional languages such as Greek or Russian, this feature is invaluable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)