Posted to dev@uima.apache.org by "Peter Klügl (JIRA)" <de...@uima.apache.org> on 2011/08/08 10:56:27 UTC
[jira] [Created] (UIMA-2207) Reimplementation of the TextMarker engine
Reimplementation of the TextMarker engine
-----------------------------------------
Key: UIMA-2207
URL: https://issues.apache.org/jira/browse/UIMA-2207
Project: UIMA
Issue Type: Improvement
Components: TextMarker
Reporter: Peter Klügl
Assignee: Peter Klügl
Some severe flaws need to be removed from the TextMarker rule inference. This requires some major refactoring and reimplementation of internal parts of the engine project.
This improvement:
- fixes the flaw with two annotations of the same type starting at the same offset.
- introduces quantifiers for sequences of rule elements.
- eases the use of user-provided seed information.
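As an illustration of the first item: the thread does not say how the flaw arose, so the following Python sketch only shows one hypothetical way that two annotations of the same type starting at the same offset can get lost, namely an index that holds a single annotation per (type, begin) key. All names here (Annotation, index_single, index_multi) are invented for illustration and are not TextMarker or UIMA code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    # Hypothetical stand-in for an annotation: a type name plus a text span.
    type_name: str
    begin: int
    end: int


def index_single(annotations):
    """Keep one annotation per (type, begin) key; later entries overwrite earlier ones."""
    index = {}
    for a in annotations:
        index[(a.type_name, a.begin)] = a
    return index


def index_multi(annotations):
    """Keep every annotation that shares a (type, begin) key."""
    index = defaultdict(list)
    for a in annotations:
        index[(a.type_name, a.begin)].append(a)
    return index


# Two annotations of the same type that start at the same offset.
annotations = [
    Annotation("Name", 0, 5),
    Annotation("Name", 0, 11),
]

single = index_single(annotations)  # the shorter annotation is silently dropped
multi = index_multi(annotations)    # both annotations remain reachable
```

Under this assumption, a rule matcher that looks up candidates through the single-valued index would never see the first annotation, while the list-valued index keeps both available for matching.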
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (UIMA-2207) Reimplementation of the TextMarker engine
Posted by "Peter Klügl (JIRA)" <de...@uima.apache.org>.
[ https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Klügl updated UIMA-2207:
------------------------------
Attachment: UIMA-2207-1.patch
A first patch with the new implementation of the rule inference is attached. It took longer than expected, and there were also some delays.
Most parts are implemented; remaining bugs will be fixed as more test cases are added or as old TextMarker projects are migrated to the new implementation.
What is still missing:
- Dynamic anchoring is not done yet and therefore deactivated.
- Seeding and the separation of inference annotations are not done yet.
- Tooling support for the new language elements is not available yet; e.g., the formatter will delete composed rule elements.
- The explanation component is not yet adapted to the new features.
- The modifier engine has not been extracted yet.
I tested the rule inference on a TextMarker project with about a thousand rules. Memory consumption is still too high, but processing is somewhat faster than before, even though no optimizations are included yet.
The issue will be closed once the missing parts are added in further patches or moved to separate issues.
[jira] [Closed] (UIMA-2207) Reimplementation of the TextMarker engine
Posted by "Peter Klügl (JIRA)" <de...@uima.apache.org>.
[ https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Klügl closed UIMA-2207.
-----------------------------
Resolution: Fixed
Closing the issue since the basic implementation is ready. Missing functionality was moved to new issues.
[jira] [Commented] (UIMA-2207) Reimplementation of the TextMarker engine
Posted by "Joern Kottmann (JIRA)" <de...@uima.apache.org>.
[ https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112513#comment-13112513 ]
Joern Kottmann commented on UIMA-2207:
--------------------------------------
Applied your patch. Let's close this issue and create follow-up issues.