Posted to user@uima.apache.org by jefferyyuan <je...@gmail.com> on 2014/05/29 06:11:37 UTC

UIMA Dynamic Regex

One requirement of our text mining project is to let users specify a
regex and a corresponding type; UIMA should then parse the text and return
the matches found.


I am wondering whether we can create an annotator (maybe by modifying or
extending com.commvault.uima.annotator.regex.impl.RegExAnnotator) that accepts
the regex and corresponding type the user specified, and runs the regex match
against the text.


But I checked the CasAnnotator_ImplBase class, especially the
process(CAS aCAS) method, and didn't find any way to pass and read extra
parameters.

Thanks in advance for any help.


Re: UIMA Dynamic Regex

Posted by Richard Eckart de Castilho <re...@apache.org>.
Parameters are not passed directly as method arguments in plain UIMA.
To access parameters in a component, you obtain the UimaContext
from the component and from it you can get the parameter values.
Here is an excerpt showing how InlineXmlCasConsumer accesses a parameter value:

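  public static final String PARAM_OUTPUTDIR = "OutputDirectory";

  public void initialize() throws ResourceInitializationException {
    // Look up the parameter by name; the value comes from the descriptor.
    mOutputDir = new File(((String) getConfigParameterValue(PARAM_OUTPUTDIR)).trim());
    // ... (rest of the method is not part of this excerpt)
  }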

An alternative way to access parameters is provided by the uimaFIT library [1].
E.g. you could define a parameter field like this:

  public static final String PARAM_OUTPUTDIR = "OutputDirectory";
  @ConfigurationParameter(name=PARAM_OUTPUTDIR, mandatory=true)
  private File outputDirectory;

The parameter value is set in the analysis engine description, which is
typically an XML file created through the component description editor in Eclipse.

It is also possible to generate such a description programmatically. The
uimaFIT library offers a convenient API to do so, wrapping the more verbose
UIMA API. E.g. you could configure a component like this:

  AnalysisEngineFactory.createEngineDescription(MyComponent.class,
    MyComponent.PARAM_OUTPUTDIR, new File("/data/targetfolder"));

or

  AnalysisEngineFactory.createEngineDescription(MyComponent.class,
    MyComponent.PARAM_OUTPUTDIR, "/data/targetfolder");

If you like, you can use uimaFIT to configure plain old non-fit UIMA 
components like RegExAnnotator, but you'll have to set all parameters
explicitly as uimaFIT does not know default parameter values unless a
component declares the @ConfigurationParameter Java annotations.
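
For the dynamic regex use case itself, here is a rough sketch of the matching
step, kept free of UIMA classes so it runs with the JDK alone. It assumes the
user's regexes and type names arrive as two parallel String arrays, which is
how multi-valued String parameters would look after being read with
getConfigParameterValue. The class DynamicRegexSketch, its Match holder and
the type names "example.Phone" and "example.Email" are made up for
illustration; they are not part of UIMA or RegExAnnotator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of UIMA or RegExAnnotator.
final class DynamicRegexSketch {

  // One hit: the user-supplied type name plus the matched character span.
  static final class Match {
    final String typeName;
    final int begin;
    final int end;
    final String coveredText;

    Match(String typeName, int begin, int end, String coveredText) {
      this.typeName = typeName;
      this.begin = begin;
      this.end = end;
      this.coveredText = coveredText;
    }
  }

  // regexes[i] is matched against the text; every hit is tagged with typeNames[i].
  static List<Match> findMatches(String text, String[] regexes, String[] typeNames) {
    if (regexes.length != typeNames.length) {
      throw new IllegalArgumentException("regexes and typeNames must have the same length");
    }
    List<Match> result = new ArrayList<>();
    for (int i = 0; i < regexes.length; i++) {
      Matcher m = Pattern.compile(regexes[i]).matcher(text);
      while (m.find()) {
        result.add(new Match(typeNames[i], m.start(), m.end(), m.group()));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String text = "Call 555-1234 or mail bob@example.org";
    String[] regexes = {"\\d{3}-\\d{4}", "[\\w.]+@[\\w.]+"};
    String[] typeNames = {"example.Phone", "example.Email"};
    for (Match m : findMatches(text, regexes, typeNames)) {
      System.out.println(m.typeName + " [" + m.begin + "," + m.end + ") " + m.coveredText);
    }
  }
}
```

In an actual annotator, the two arrays would be read once during
initialization, and process(CAS aCAS) would turn each hit into an annotation
of the named type instead of collecting it in a list; that wiring is left out
here.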

Cheers,

-- Richard

[1] http://uima.apache.org/uimafit.html
