Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <de...@uima.apache.org> on 2010/06/24 00:23:50 UTC

[jira] Commented: (UIMA-1818) Provide simple mechanism to capture all CASes input to specified delegate

    [ https://issues.apache.org/jira/browse/UIMA-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881925#action_12881925 ] 

Marshall Schor commented on UIMA-1818:
--------------------------------------

Sounds like a valuable debugging aid.

Is the idea that *every* CAS that comes through a particular specified annotator would be saved to the file system?
* If so, maybe add a parameter to control how many are saved, or how frequently to sample?
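
To make the sampling idea concrete, here is a minimal sketch of a counter-based sampler. The class name CasLogSampler and both parameters (interval, maxLogged) are hypothetical; nothing like them is defined in the issue, and this is only one way such a control could behave.

```java
/** Hypothetical helper (not part of UIMA): decides whether the next CAS should be logged. */
final class CasLogSampler {
    private final long interval;   // log every interval-th CAS; 1 means every CAS
    private final long maxLogged;  // stop logging once this many CASes were logged
    private long seen = 0;
    private long logged = 0;

    CasLogSampler(long interval, long maxLogged) {
        if (interval < 1 || maxLogged < 0) {
            throw new IllegalArgumentException("interval must be >= 1 and maxLogged >= 0");
        }
        this.interval = interval;
        this.maxLogged = maxLogged;
    }

    /** Call once per CAS reaching the delegate; synchronized so concurrent callers share the counters. */
    synchronized boolean shouldLog() {
        long n = seen++;
        if (logged >= maxLogged || n % interval != 0) {
            return false;
        }
        logged++;
        return true;
    }

    public static void main(String[] args) {
        CasLogSampler sampler = new CasLogSampler(3, 2);
        for (int i = 0; i < 7; i++) {
            System.out.println("CAS " + i + " logged: " + sampler.shouldLog());
        }
    }
}
```

With an interval of 3 and a cap of 2, the first and fourth CASes are logged and everything else is skipped.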

The "COMPONENT_ARRAY" delegate keys need the x/y/z syntax for non-UIMA-AS cases too, where an aggregate contains another aggregate, and so on.  This is already a convention in UIMA, so it would be good to keep using it for both UIMA-AS and non-UIMA-AS cases.

Would it be valuable to have a spec that says, for each delegate, whether the logging happens before or after the AnalysisEngine? For instance, the spec could be someAggName/somePrimName:before:after (requesting both).  "before" could be the default.

Would it be valuable to dump only the changed data (à la "delta CAS")?  (Possible syntax: add a :delta modifier.)
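
A rough sketch of how a spec such as someAggName/somePrimName:before:after:delta might be parsed. The class CasLogSpec and its fields are hypothetical, and the sketch assumes that delegate keys never contain ':' and that "before" applies when neither "before" nor "after" is given.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical parsed form of one delegate spec, e.g. "someAggName/somePrimName:before:after". */
final class CasLogSpec {
    final List<String> keyPath;  // nested delegate keys, outermost aggregate first
    final boolean before;        // log the CAS before the delegate processes it
    final boolean after;         // log the CAS after the delegate processes it
    final boolean delta;         // log only changed data (the suggested :delta modifier)

    private CasLogSpec(List<String> keyPath, boolean before, boolean after, boolean delta) {
        this.keyPath = keyPath;
        this.before = before;
        this.after = after;
        this.delta = delta;
    }

    static CasLogSpec parse(String spec) {
        String[] parts = spec.trim().split(":");
        if (parts[0].isEmpty()) {
            throw new IllegalArgumentException("missing delegate key in: " + spec);
        }
        boolean before = false, after = false, delta = false;
        for (int i = 1; i < parts.length; i++) {
            switch (parts[i]) {
                case "before": before = true; break;
                case "after":  after = true;  break;
                case "delta":  delta = true;  break;
                default:
                    throw new IllegalArgumentException("unknown modifier '" + parts[i] + "' in: " + spec);
            }
        }
        if (!before && !after) {
            before = true;  // assumed default when no position is given
        }
        return new CasLogSpec(Arrays.asList(parts[0].split("/")), before, after, delta);
    }

    public static void main(String[] args) {
        CasLogSpec s = CasLogSpec.parse("someAggName/somePrimName:before:after");
        System.out.println(s.keyPath + " before=" + s.before + " after=" + s.after + " delta=" + s.delta);
    }
}
```

Keeping the modifiers after the key path means the existing x/y/z key convention stays untouched.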

It would be good if the output was consumable by the CAS Viewer, too :-).

> Provide simple mechanism to capture all CASes input to specified delegate
> -------------------------------------------------------------------------
>
>                 Key: UIMA-1818
>                 URL: https://issues.apache.org/jira/browse/UIMA-1818
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Async Scaleout
>            Reporter: Eddie Epstein
>            Assignee: Eddie Epstein
>
> The existing approach to capturing CASes sent to a component is to insert a new CAS-serializer-annotator just before it in the flow, or to modify the component itself to serialize CASes. Both approaches require modifications to existing code and/or component descriptors, and are somewhat time-consuming and error-prone.
> A much simpler approach is to just "turn on" CAS logging for a particular component using Java properties before starting the process, or to turn CAS logging on/off for an already running process using JMX operations.
> This issue covers using Java properties to turn on CAS logging for any delegate of an asynchronous aggregate.
> CAS logging would be controlled by the following properties:
> UIMA_CASLOG_BASE_DIRECTORY - optional; this is the directory under which other directories with XmiCas files will be created. If not specified, the process's current directory will be the base.
> UIMA_CASLOG_COMPONENT_ARRAY - This is a space-separated list of delegate keys. If a delegate is nested inside a co-located async aggregate, the name would include the key name of the aggregate, e.g. "someAggName/someDelName". The XmiCas files will then be written into $UIMA_CASLOG_BASE_DIRECTORY/someAggName/someDelName/
> UIMA_CASLOG_TYPE_NAME - optional; this is the name of a FeatureStructure in the CAS containing a unique string to use to name each XmiCas file. If not specified, the XmiCas file name will be NNN.xmi, where NNN is the time in microseconds since the component was initialized.
> UIMA_CASLOG_FEATURE_NAME - required if TYPE_NAME is specified, otherwise unused; this parameter gives the string feature to use. An example of type and feature names to use would be "org.apache.uima.examples.SourceDocumentInformation" and "uri".
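
For illustration only, the directory layout described by the first two properties could be computed roughly as follows. The property names come from the issue text; the class CasLogDirs and its resolve method are hypothetical and are not the UIMA-AS implementation.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: maps each delegate key to the directory its XmiCas files would go to. */
final class CasLogDirs {

    static Map<String, Path> resolve(String baseDir, String componentArray) {
        // Fall back to the process's current directory when no base is given.
        Path base = (baseDir == null || baseDir.trim().isEmpty())
                ? Paths.get(System.getProperty("user.dir"))
                : Paths.get(baseDir.trim());
        Map<String, Path> dirs = new LinkedHashMap<>();
        if (componentArray == null) {
            return dirs;  // no delegates selected, so nothing to log
        }
        for (String key : componentArray.trim().split("\\s+")) {
            if (key.isEmpty()) {
                continue;
            }
            Path dir = base;
            for (String segment : key.split("/")) {
                dir = dir.resolve(segment);  // someAggName/someDelName becomes nested directories
            }
            dirs.put(key, dir);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // e.g. java -DUIMA_CASLOG_COMPONENT_ARRAY="someAggName/someDelName" CasLogDirs
        Map<String, Path> dirs = resolve(
                System.getProperty("UIMA_CASLOG_BASE_DIRECTORY"),
                System.getProperty("UIMA_CASLOG_COMPONENT_ARRAY"));
        dirs.forEach((key, dir) -> System.out.println(key + " -> " + dir));
    }
}
```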
