Posted to dev@uima.apache.org by "Peter Klügl (JIRA)" <de...@uima.apache.org> on 2011/08/08 10:56:27 UTC

[jira] [Created] (UIMA-2207) Reimplementation of the TextMarker engine

Reimplementation of the TextMarker engine
-----------------------------------------

                 Key: UIMA-2207
                 URL: https://issues.apache.org/jira/browse/UIMA-2207
             Project: UIMA
          Issue Type: Improvement
          Components: TextMarker
            Reporter: Peter Klügl
            Assignee: Peter Klügl


Some severe flaws need to be removed from the TextMarker rule inference. This requires some major refactoring and reimplementation of internal parts of the engine project.

This improvement:
- fixes the flaw involving two annotations of the same type starting at the same offset.
- introduces quantifiers for sequences of rule elements.
- eases the use of custom seed information.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (UIMA-2207) Reimplementation of the TextMarker engine

Posted by "Peter Klügl (JIRA)" <de...@uima.apache.org>.
     [ https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Klügl updated UIMA-2207:
------------------------------

    Attachment: UIMA-2207-1.patch

First patch added with the new implementation of the rule inference. Took me longer than expected and there were also some delays.

Most parts are implemented, and the remaining bugs will be fixed as more test cases are added or old TextMarker projects are migrated to the new implementation.

What is still missing:
- Dynamic anchoring is not done yet and therefore deactivated.
- Seeding and the separation of inference annotations are not done yet.
- Tooling support for the new language elements is not available yet, e.g., the formatter will delete composed rule elements.
- The explanation component is not yet adapted to the new features.
- The modifier engine has not been extracted yet.

I tested the rule inference on a TextMarker project with about a thousand rules: the memory consumption is still too high, but the processing time is a bit faster than before, even though no optimizations are included yet.

The issue will be closed once the missing pieces have been added in additional patches or moved to separate issues.




       

[jira] [Closed] (UIMA-2207) Reimplementation of the TextMarker engine

Posted by "Peter Klügl (JIRA)" <de...@uima.apache.org>.
     [ https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Klügl closed UIMA-2207.
-----------------------------

    Resolution: Fixed

Closing the issue since the basic implementation is ready. Missing functionality was moved to new issues.



       

[jira] [Commented] (UIMA-2207) Reimplementation of the TextMarker engine

Posted by "Joern Kottmann (JIRA)" <de...@uima.apache.org>.
    [ https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112513#comment-13112513 ] 

Joern Kottmann commented on UIMA-2207:
--------------------------------------

Applied your patch. Let's close this issue and create follow-up issues.

