Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2016/10/18 21:28:39 UTC

UIMA shared external resources with config parameters

Several Jiras talk about wanting external shared resources that can be
configured using the standard UIMA configuration settings and parameters.

For example: https://issues.apache.org/jira/browse/UIMA-2979.

Currently, there is one UIMA external resource descriptor that supports this,
which is the configurableDataResourceSpecifier.  It has a resourceMetaData
sub-element, which, in turn, has the configurationParameters and
configurationParameterSettings elements.

There is an alternate, simpler approach to configuring external resources, now
available (as of UIMA 2.9.0) - using direct access to the External Configuration
parameters.  See section 2.4.3.5 in the UIMA Reference book.  The idea is that
each External Resource would define its own set of key-value parameter settings,
and retrieve them at run time.
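
As a rough illustration of that idea (a sketch under assumptions, not UIMA
code): the resource below owns its parameter keys and reads them when it is
created. java.util.Properties stands in for the external settings source, and
the class name ThesaurusResource and the keys thesaurus.path and
thesaurus.caseSensitive are hypothetical. In UIMA itself the lookup would go
through the external settings access described in section 2.4.3.5.

```java
import java.util.Properties;

// Hypothetical resource that defines its own keys and reads them at run time.
// java.util.Properties is a stand-in for the external settings; not UIMA API.
class ThesaurusResource {
    static final String KEY_PATH = "thesaurus.path";
    static final String KEY_CASE_SENSITIVE = "thesaurus.caseSensitive";

    final String path;
    final boolean caseSensitive;

    ThesaurusResource(Properties settings) {
        String configuredPath = settings.getProperty(KEY_PATH);
        if (configuredPath == null) {
            throw new IllegalArgumentException("Missing setting: " + KEY_PATH);
        }
        this.path = configuredPath;
        // Optional key: falls back to "false" when the setting is absent.
        this.caseSensitive =
                Boolean.parseBoolean(settings.getProperty(KEY_CASE_SENSITIVE, "false"));
    }

    public static void main(String[] args) {
        Properties settings = new Properties();
        settings.setProperty(KEY_PATH, "/data/thesaurus.txt");
        settings.setProperty(KEY_CASE_SENSITIVE, "true");

        ThesaurusResource resource = new ThesaurusResource(settings);
        System.out.println(resource.path + " caseSensitive=" + resource.caseSensitive);
    }
}
```

The resource, not the framework, decides which keys exist, which are
mandatory, and what the defaults are.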

The normal UIMA Configuration parameter mechanism is overlaid with a lot of
complex functionality.  Examples include support for overrides in UIMA
Aggregates; grouping of sets of alternative values, with some parameters common
to all groups and some not (where the group can be picked programmatically at
run time, e.g., by the document language); extra metadata (such as type and
flags, e.g. mandatory); and so forth.

This extra complexity was created for use cases where these configuration
parameters were interpreted with respect to a UIMA Context (that is, where the
parameters were associated with a particular position within an aggregation
hierarchy for a pipeline).  I think that some of these mechanisms probably don't
work correctly, if applied to the nested configuration parameter specs inside
the resourceMetaData of a configurableDataResourceSpecifier.

I'm thinking it would not be worth the trouble to add full support for UIMA
Configuration Parameters and settings in External Resources, because I think
the (now available) External Configuration capabilities are sufficient for
this, and fit better with the basic idea of External Resources as shared
objects for a pipeline (or set of pipelines).

I'd be interested to hear other views, especially on whether you think External
Configuration is adequate or insufficient for this.

-Marshall


Re: UIMA shared external resources with config parameters

Posted by Marshall Schor <ms...@schor.com>.
Hi Richard,

To clarify one point, I *do not want to deprecate* the existing parameters
mechanism in external resources in favor of the external override mechanism.

I'm sorry if my post implied that position :-(.

I was trying to say 2 things:

* external override mechanism is a (weak) alternative, which could be used *for
now*, although it has the issues you state below.

* using the existing parameters mechanism for external resources would take some
detailed thought and, perhaps, design, because I'm guessing it doesn't quite
work now (is this wrong? does it work for you?). 

I thought it wouldn't quite work because part of the implementation of
configuration management depends on the UIMA Context, which already has
(different) configuration parameters and settings.

I thought this would need to be "architected" with some careful thinking, and
that, in the meantime, this other approach could make do for now.

Re: your last question: Can/should the setting PARAM_SIZE ...

I think I'm missing something, because I would just say it could be substituted
using the external override mechanism (obviously a simple answer, probably not
addressing the real issue...)

-Marshall


On 10/23/2016 6:43 AM, Richard Eckart de Castilho wrote:
>> On 19.10.2016, at 20:10, Marshall Schor <ms...@schor.com> wrote:
>>
>> 1) Specifying "object A" to a component.  My thinking did not go beyond what is
>> done today for external shared resources.  UIMA provides an
>> ExternalResourceDescription as part of a component; this is eventually fed to
>> UIMA's "produceResource" methods to produce an instance of the resource.
>>
>> So, I was thinking that you would specify "object A to a component" by just
>> including the external resource description as part of the component's metadata.
>>
>> 2) How to specify an "object B" as a parameter to "object A":
>> 2a) Object A gets to define a key (or keys) for its parameters.  Let's say uses
>> "myObjB" as the key.
>> Object A gets to decide how to interpret the value for this key coming from the
>> external settings file.  (Not architected by UIMA).
>> 2b) At a time chosen by Object A, when object A is "running", it reads the value
>> of the key "myObjB" from the external settings file, and then interprets this in
>> any way it chooses, and then uses that to define Object B (again, this would be
>> arbitrary, not architected by UIMA)
> I would still like to see an example of how these parameters that are *not*
> "external resource parameters" but "overrides" are specified in code.
>
> For external resources, based on the current "external resource parameters"
> mechanism, uimaFIT defines a convenient way of composing resources and 
> components, specifically providing parameter values *locally* to each
> single declared resource, i.e. there is *no chance for conflict* between
> e.g. multiple instances of the same resource type being used in a pipeline.
> Below is an example how external resources (even nested ones) can be bound
> to an analysis engine. The parameter values of the external resources are
> provided locally for each resource. Mind that calling createExternalResourceDescription
> twice for the same class creates two distinct external resource instances of that
> class which can be bound independently.
>
> ----
> createEngineDescription(ExtractFeaturesConnector.class,
>         ExtractFeaturesConnector.PARAM_OUTPUT_DIRECTORY, outputPath,
>         ExtractFeaturesConnector.PARAM_DATA_WRITER_CLASS, WekaDataWriter.class,
>         ExtractFeaturesConnector.PARAM_LEARNING_MODE, Constants.LM_SINGLE_LABEL,
>         ExtractFeaturesConnector.PARAM_FEATURE_MODE, Constants.FM_DOCUMENT,
>         ExtractFeaturesConnector.PARAM_ADD_INSTANCE_ID, true,
>         ExtractFeaturesConnector.PARAM_FEATURE_FILTERS, new String[] {},
>         ExtractFeaturesConnector.PARAM_IS_TESTING, false,
>         ExtractFeaturesConnector.PARAM_FEATURE_EXTRACTORS,
> ==>     asList(createExternalResourceDescription(EmoticonRatio.class,
>                 EmoticonRatio.PARAM_UNIQUE_EXTRACTOR_NAME, "123"),
> ==>             createExternalResourceDescription(NumberOfHashTags.class,
>                         NumberOfHashTags.PARAM_UNIQUE_EXTRACTOR_NAME, "1234"))));
> ---
>
> My understanding is that you want to deprecate the existing "parameters" mechanism
> in external resources in favor of the "external override" mechanism. Hence,
> I would like to know how to implement creating resource descriptions, binding
> them, and setting their parameters would be done programmatically relying just
> on the "override" mechanism and not on the existing "parameters" mechanism.
>
> More on this in the last section below.
>
>> 3) how to set non-String parameters?  Both the external settings and the normal
>> UIMA configuration parameter settings (I'm thinking of the XML descriptor)
>> represent these as strings.  So the number 1.0 is represented as the string
>> "1.0", and the code that gets configuration parameter settings is responsible
>> for type conversions, for instance, converting the string to the declared
>> configuration parameter type.
>>
>> For accessing directly external settings, there is no architected place for
>> specifying the "type" of the parameter, other than the configuration
>> declarations (which could be used for simple UIMA types only);  the external
>> settings API returns just the string (or an array of strings, which is
>> supported) to the caller, and it's up to the caller to then do whatever
>> interpretation of this string value is desired (not architected by UIMA).
> The ConfigurableDataResourceSpecifier uses a ResourceMetaData object for
> parameters. ResourceMetaData supports non-String parameter values via
> ConfigurationParameterDeclarations and ConfigurationParameterSettings.
> Types of parameters are declared in ConfigurationParameterDeclarations 
> and the framework handles the conversion between external String form and
> internal parameter values. It is not up to the component or resource to
> implement a conversion mechanism for each parameter.
>
>> 4) re: disambiguating parameters for multiple instances of a shared resource. 
>> UIMA today has the ability to have multiple instances of a shared resource, e.g.
>> a "dictionary" that is parameterized by "language";
>> multiple instances of these can be loaded.  The "get resource" api for this
>> includes specifying the parameter(s) to select the proper one, and each instance
>> that is created gets a initial "load" call whose argument can identify the instance.
>>
>> So, (not architected by UIMA) the implementation could, for example, define a
>> set of "keys": e.g.
>> my_thesaurus_en, my_thesaurus_de, ...  for some parameters that are dependent on
>> a language code. 
>>
>> Beyond this, External Resources doesn't support multiple instances, and I had
>> not considered extending this (as part of this discussion, which was about how
>> to read configuration parameters).
> If I understand you correctly, you want that the implementer of a resource
> defines some naming convention to ensure that override names can be manually
> associated with resources configured in specific ways, e.g. (pseudocode)
>
> ----
> setOverride "de_dictionary" = "german.lexicon"
> setOverride "en_dictionary" = "english.lexicon"
>
> class DictionaryResource {
>   def initialize(UimaContext ctx) {
>     def lang = ctx.getParameter("lang");
>     def lexicon = ctx.getOverride("${lang}_dictionary");
>     loadLexicon(lexicon);
>   }
> }
> ----
>
> If I have understood it correctly, that looks like a nice option
> of working e.g. with multi-language scenarios.
>
> We have used external resources more in the context of machine
> learning, specifically to model feature extractors. Here, we
> define multiple instances of external resources, e.g. to
> obtain n-grams of different sizes.
>
> ----
>
> // Defining two instances of the NGramExtractorResource with
> // different parameters.
> def unigrams = createResource(NGramExtractorResource.class, 
>   NGramExtractorResource.PARAM_SIZE, 1);
> def bigrams = createResource(NGramExtractorResource.class, 
>   NGramExtractorResource.PARAM_SIZE, 2);
>
> def analysisEngine = createEngine(Analyzer.class,
>   Analyzer.KEY_EXTRATORS, asList(unigrams, bigrams));
> ----
>
> To that end, uimaFIT introduces a custom external resource type
> "ResourceList" (extends Resource_ImplBase) [1] which is implicitly
> created in the call above. So the "unigrams" and "bigrams" bind to the
> implicitly created "resource list" and the "resource list" binds to the
> analysis engine.
>
> Can/should the setting of PARAM_SIZE in the example above be substituted
> using the "external override" mechanism?
>
> Cheers,
>
> -- Richard
>
> [1] https://svn.apache.org/repos/asf/uima/uimafit/trunk/uimafit-core/src/main/java/org/apache/uima/fit/internal/ResourceList.java


Re: UIMA shared external resources with config parameters

Posted by Richard Eckart de Castilho <re...@apache.org>.
> On 19.10.2016, at 20:10, Marshall Schor <ms...@schor.com> wrote:
> 
> 1) Specifying "object A" to a component.  My thinking did not go beyond what is
> done today for external shared resources.  UIMA provides an
> ExternalResourceDescription as part of a component; this is eventually fed to
> UIMA's "produceResource" methods to produce an instance of the resource.
> 
> So, I was thinking that you would specify "object A to a component" by just
> including the external resource description as part of the component's metadata.
> 
> 2) How to specify an "object B" as a parameter to "object A":
> 2a) Object A gets to define a key (or keys) for its parameters.  Let's say uses
> "myObjB" as the key.
> Object A gets to decide how to interpret the value for this key coming from the
> external settings file.  (Not architected by UIMA).
> 2b) At a time chosen by Object A, when object A is "running", it reads the value
> of the key "myObjB" from the external settings file, and then interprets this in
> any way it chooses, and then uses that to define Object B (again, this would be
> arbitrary, not architected by UIMA)

I would still like to see an example of how these parameters that are *not*
"external resource parameters" but "overrides" are specified in code.

For external resources, based on the current "external resource parameters"
mechanism, uimaFIT defines a convenient way of composing resources and 
components, specifically providing parameter values *locally* to each
single declared resource, i.e. there is *no chance for conflict* between
e.g. multiple instances of the same resource type being used in a pipeline.
Below is an example of how external resources (even nested ones) can be bound
to an analysis engine. The parameter values of the external resources are
provided locally for each resource. Note that calling createExternalResourceDescription
twice for the same class creates two distinct external resource instances of that
class, which can be bound independently.

----
createEngineDescription(ExtractFeaturesConnector.class,
        ExtractFeaturesConnector.PARAM_OUTPUT_DIRECTORY, outputPath,
        ExtractFeaturesConnector.PARAM_DATA_WRITER_CLASS, WekaDataWriter.class,
        ExtractFeaturesConnector.PARAM_LEARNING_MODE, Constants.LM_SINGLE_LABEL,
        ExtractFeaturesConnector.PARAM_FEATURE_MODE, Constants.FM_DOCUMENT,
        ExtractFeaturesConnector.PARAM_ADD_INSTANCE_ID, true,
        ExtractFeaturesConnector.PARAM_FEATURE_FILTERS, new String[] {},
        ExtractFeaturesConnector.PARAM_IS_TESTING, false,
        ExtractFeaturesConnector.PARAM_FEATURE_EXTRACTORS,
==>     asList(createExternalResourceDescription(EmoticonRatio.class,
                EmoticonRatio.PARAM_UNIQUE_EXTRACTOR_NAME, "123"),
==>             createExternalResourceDescription(NumberOfHashTags.class,
                        NumberOfHashTags.PARAM_UNIQUE_EXTRACTOR_NAME, "1234"))));
----

My understanding is that you want to deprecate the existing "parameters" mechanism
in external resources in favor of the "external override" mechanism. Hence,
I would like to know how creating resource descriptions, binding them, and
setting their parameters would be done programmatically, relying just
on the "override" mechanism and not on the existing "parameters" mechanism.

More on this in the last section below.

> 3) how to set non-String parameters?  Both the external settings and the normal
> UIMA configuration parameter settings (I'm thinking of the XML descriptor)
> represent these as strings.  So the number 1.0 is represented as the string
> "1.0", and the code that gets configuration parameter settings is responsible
> for type conversions, for instance, converting the string to the declared
> configuration parameter type.
> 
> For accessing directly external settings, there is no architected place for
> specifying the "type" of the parameter, other than the configuration
> declarations (which could be used for simple UIMA types only);  the external
> settings API returns just the string (or an array of strings, which is
> supported) to the caller, and it's up to the caller to then do whatever
> interpretation of this string value is desired (not architected by UIMA).

The ConfigurableDataResourceSpecifier uses a ResourceMetaData object for
parameters. ResourceMetaData supports non-String parameter values via
ConfigurationParameterDeclarations and ConfigurationParameterSettings.
Types of parameters are declared in ConfigurationParameterDeclarations 
and the framework handles the conversion between external String form and
internal parameter values. It is not up to the component or resource to
implement a conversion mechanism for each parameter.
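
To make the contrast concrete, here is a simplified sketch of
declaration-driven conversion. It is not the UIMA implementation: the class
DeclaredParams, its Type enum and the apply method are hypothetical stand-ins
for the roles played by ConfigurationParameterDeclarations and
ConfigurationParameterSettings, and the enum assumes the simple parameter
types String, Boolean, Integer and Float.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for declaration-driven conversion; not UIMA code.
class DeclaredParams {
    // Assumed set of simple parameter types.
    enum Type { STRING, BOOLEAN, INTEGER, FLOAT }

    // Converts one external string form into the declared internal type.
    static Object convert(Type type, String raw) {
        switch (type) {
            case BOOLEAN: return Boolean.valueOf(raw);
            case INTEGER: return Integer.valueOf(raw);
            case FLOAT:   return Float.valueOf(raw);
            default:      return raw;
        }
    }

    // Converts every setting according to its declaration; settings without
    // a declaration are rejected instead of being passed through as strings.
    static Map<String, Object> apply(Map<String, Type> declarations,
                                     Map<String, String> settings) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            Type type = declarations.get(entry.getKey());
            if (type == null) {
                throw new IllegalArgumentException("Undeclared parameter: " + entry.getKey());
            }
            typed.put(entry.getKey(), convert(type, entry.getValue()));
        }
        return typed;
    }

    public static void main(String[] args) {
        Map<String, Type> declarations = new LinkedHashMap<>();
        declarations.put("size", Type.INTEGER);
        declarations.put("threshold", Type.FLOAT);

        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("size", "2");
        settings.put("threshold", "1.0");

        System.out.println(apply(declarations, settings));
    }
}
```

With this shape the resource only ever sees typed values; the conversion
logic lives in one place rather than in every resource.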

> 4) re: disambiguating parameters for multiple instances of a shared resource. 
> UIMA today has the ability to have multiple instances of a shared resource, e.g.
> a "dictionary" that is parameterized by "language";
> multiple instances of these can be loaded.  The "get resource" api for this
> includes specifying the parameter(s) to select the proper one, and each instance
> that is created gets a initial "load" call whose argument can identify the instance.
> 
> So, (not architected by UIMA) the implementation could, for example, define a
> set of "keys": e.g.
> my_thesaurus_en, my_thesaurus_de, ...  for some parameters that are dependent on
> a language code. 
> 
> Beyond this, External Resources doesn't support multiple instances, and I had
> not considered extending this (as part of this discussion, which was about how
> to read configuration parameters).

If I understand you correctly, you want the implementer of a resource
to define some naming convention to ensure that override names can be manually
associated with resources configured in specific ways, e.g. (pseudocode)

----
setOverride "de_dictionary" = "german.lexicon"
setOverride "en_dictionary" = "english.lexicon"

class DictionaryResource {
  def initialize(UimaContext ctx) {
    def lang = ctx.getParameter("lang");
    def lexicon = ctx.getOverride("${lang}_dictionary");
    loadLexicon(lexicon);
  }
}
----

If I have understood it correctly, that looks like a nice option
for working with, e.g., multi-language scenarios.

We have used external resources more in the context of machine
learning, specifically to model feature extractors. Here, we
define multiple instances of external resources, e.g. to
obtain n-grams of different sizes.

----

// Defining two instances of the NGramExtractorResource with
// different parameters.
def unigrams = createResource(NGramExtractorResource.class, 
  NGramExtractorResource.PARAM_SIZE, 1);
def bigrams = createResource(NGramExtractorResource.class, 
  NGramExtractorResource.PARAM_SIZE, 2);

def analysisEngine = createEngine(Analyzer.class,
  Analyzer.KEY_EXTRACTORS, asList(unigrams, bigrams));
----

To that end, uimaFIT introduces a custom external resource type
"ResourceList" (extends Resource_ImplBase) [1] which is implicitly
created in the call above. So the "unigrams" and "bigrams" bind to the
implicitly created "resource list" and the "resource list" binds to the
analysis engine.

Can/should the setting of PARAM_SIZE in the example above be substituted
using the "external override" mechanism?

Cheers,

-- Richard

[1] https://svn.apache.org/repos/asf/uima/uimafit/trunk/uimafit-core/src/main/java/org/apache/uima/fit/internal/ResourceList.java

Re: UIMA shared external resources with config parameters

Posted by Marshall Schor <ms...@schor.com>.
1) Specifying "object A" to a component.  My thinking did not go beyond what is
done today for external shared resources.  UIMA provides an
ExternalResourceDescription as part of a component; this is eventually fed to
UIMA's "produceResource" methods to produce an instance of the resource.

So, I was thinking that you would specify "object A to a component" by just
including the external resource description as part of the component's metadata.

2) How to specify an "object B" as a parameter to "object A":
2a) Object A gets to define a key (or keys) for its parameters.  Let's say it
uses "myObjB" as the key.
Object A gets to decide how to interpret the value for this key coming from the
external settings file.  (Not architected by UIMA).
2b) At a time chosen by Object A, when object A is "running", it reads the value
of the key "myObjB" from the external settings file, and then interprets this in
any way it chooses, and then uses that to define Object B (again, this would be
arbitrary, not architected by UIMA)

3) how to set non-String parameters?  Both the external settings and the normal
UIMA configuration parameter settings (I'm thinking of the XML descriptor)
represent these as strings.  So the number 1.0 is represented as the string
"1.0", and the code that gets configuration parameter settings is responsible
for type conversions, for instance, converting the string to the declared
configuration parameter type.

For accessing external settings directly, there is no architected place for
specifying the "type" of the parameter, other than the configuration
declarations (which could be used for simple UIMA types only);  the external
settings API returns just the string (or an array of strings, which is
supported) to the caller, and it's up to the caller to then do whatever
interpretation of this string value is desired (not architected by UIMA).
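
A minimal sketch of what that caller-side interpretation could look like.
The class SettingsReader and the keys are hypothetical, and the two maps are
stand-ins for a settings source that returns either a string or a string
array per key; this is not the UIMA external settings API.

```java
import java.util.HashMap;
import java.util.Map;

// Caller-side interpretation of string-valued settings (illustrative only).
class SettingsReader {
    private final Map<String, String> values;
    private final Map<String, String[]> arrays;

    SettingsReader(Map<String, String> values, Map<String, String[]> arrays) {
        this.values = values;
        this.arrays = arrays;
    }

    // The caller, not the framework, decides that this key holds a float.
    float getFloat(String key, float defaultValue) {
        String raw = values.get(key);
        return raw == null ? defaultValue : Float.parseFloat(raw.trim());
    }

    // The caller decides that each element of this array is an int.
    int[] getIntArray(String key) {
        String[] raw = arrays.get(key);
        if (raw == null) {
            return new int[0];
        }
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Integer.parseInt(raw[i].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("weight", "1.0");
        Map<String, String[]> arrays = new HashMap<>();
        arrays.put("ngramSizes", new String[] {"1", "2"});

        SettingsReader reader = new SettingsReader(values, arrays);
        System.out.println(reader.getFloat("weight", 0.0f));
        System.out.println(reader.getIntArray("ngramSizes").length);
    }
}
```

A malformed value surfaces as a NumberFormatException at the point where the
resource reads it, since nothing declares the type ahead of time.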

4) re: disambiguating parameters for multiple instances of a shared resource. 
UIMA today has the ability to have multiple instances of a shared resource, e.g.
a "dictionary" that is parameterized by "language";
multiple instances of these can be loaded.  The "get resource" API for this
includes specifying the parameter(s) to select the proper one, and each instance
that is created gets an initial "load" call whose argument can identify the instance.

So, (not architected by UIMA) the implementation could, for example, define a
set of "keys": e.g.
my_thesaurus_en, my_thesaurus_de, ...  for some parameters that are dependent on
a language code. 
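
A small sketch of such a key convention, assuming the my_thesaurus_ prefix
from the example above. The class ThesaurusSettings is hypothetical and the
Map stands in for the external settings; nothing here is UIMA API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-language lookup based on a key naming convention.
class ThesaurusSettings {
    static final String KEY_PREFIX = "my_thesaurus_";

    private final Map<String, String> settings;
    // One resolved entry per language, mirroring one instance per parameter value.
    private final Map<String, String> resolved = new HashMap<>();

    ThesaurusSettings(Map<String, String> settings) {
        this.settings = settings;
    }

    // Builds the settings key for a language code, e.g. "en" -> "my_thesaurus_en".
    static String keyFor(String language) {
        return KEY_PREFIX + language;
    }

    // Resolves the configured location once per language and caches it.
    String locationFor(String language) {
        return resolved.computeIfAbsent(language, lang -> {
            String value = settings.get(keyFor(lang));
            if (value == null) {
                throw new IllegalArgumentException("No setting for key " + keyFor(lang));
            }
            return value;
        });
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("my_thesaurus_en", "english.thesaurus");
        settings.put("my_thesaurus_de", "german.thesaurus");

        ThesaurusSettings thesauri = new ThesaurusSettings(settings);
        System.out.println(thesauri.locationFor("de"));
    }
}
```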

Beyond this, External Resources doesn't support multiple instances, and I had
not considered extending this (as part of this discussion, which was about how
to read configuration parameters).

If you're thinking this is pretty minimal, I agree...  If something more
substantial is desired, this would take some more thinking, and, as you've
suggested, might even mean incorporating other mini-framework approaches (e.g.
from Spring).

-Marshall

On 10/19/2016 1:26 PM, Richard Eckart de Castilho wrote:
> Ok, but I mean actually in code. How would it look like?
>
> - how to specify one "object B" as a parameter to another "object A" and finally "object A" to a component?
> - how to set non-String parameters?
> - while setting parameters, how to disambiguate between multiple instances of a such a shared resource?
>
> Cheers,
>
> -- Richard


Re: UIMA shared external resources with config parameters

Posted by Richard Eckart de Castilho <re...@apache.org>.
Ok, but I mean actually in code. What would it look like?

- how to specify one "object B" as a parameter to another "object A" and finally "object A" to a component?
- how to set non-String parameters?
- while setting parameters, how to disambiguate between multiple instances of such a shared resource?

Cheers,

-- Richard

> On 19.10.2016, at 19:22, Marshall Schor <ms...@schor.com> wrote:
> 
> re: how to realize the scenarios illustrated in [1] could be implemented
> using the External Configuration parameters? 
> 
> Here's the high level thoughts:
> 
> 1) Implementor designs several "objects" that require configuration.  They decide on the configuration parameters, giving each of them unique names (perhaps using a kind of package-naming-scheme).
> 
> 2) Key assumption:  These objects are to be UIMA Shared External Resources.  This means that
> 2a) there is one instance shared among many things
> 2b) there might be multiple instances if "parameterized" versions of the 
>     external resource are being used (e.g. dictionary, parameterized by language)
> 
> 3) When a user puts a system together, it consists of a pipeline and a set of these external resources, and an external configuration settings (usually a file, but could be a computed Java Object, in general).
> 
> 4) The framework runs; the external resources are created, some at initialization time (for the pipeline component that is declaring it), some at getResource( parameter) time, for parameterized resources (where, for instance you don't know the language until you get a CAS with a document and determine it).
> 
> 5) When the resource code starts running, it sees that it needs to read its configuration parameters, so it gets the external settings object and reads whatever parameters it wants to, using its unique key names.
> 
> --------------
> Missing from this scenario is one of the things UIMA (but not uimaFIT) can offer as a "feature" - the externalization into some metadata information about configuration parameters.  Note that this could be added - the external resource could make use of the resourcMetaData xml element to specify configuration parameter definitions.  The UIMA framework currently ignores these, but other tooling could make use of this, if it seems desirable.
> 
> 
> Hoping that I've not overlooked some basic flaw,
> 
> -Marshall


Re: UIMA shared external resources with config parameters

Posted by Marshall Schor <ms...@schor.com>.
re: how the scenarios illustrated in [1] could be implemented
using the External Configuration parameters?

Here are the high-level thoughts:

1) Implementor designs several "objects" that require configuration.  They decide on the configuration parameters, giving each of them unique names (perhaps using a kind of package-naming-scheme).

2) Key assumption:  These objects are to be UIMA Shared External Resources.  This means that
 2a) there is one instance shared among many things
 2b) there might be multiple instances if "parameterized" versions of the 
     external resource are being used (e.g. dictionary, parameterized by language)

3) When a user puts a system together, it consists of a pipeline, a set of these external resources, and a set of external configuration settings (usually a file, but in general it could be a computed Java Object).

4) The framework runs; the external resources are created, some at initialization time (for the pipeline component that declares them), some at getResource(parameter) time for parameterized resources (where, for instance, you don't know the language until you get a CAS with a document and determine it).

5) When the resource code starts running, it sees that it needs to read its configuration parameters, so it gets the external settings object and reads whatever parameters it wants to, using its unique key names.
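
One way to picture steps 1 and 5 together, as a sketch under assumptions: each
resource prefixes its keys with its own class name (a package-style naming
scheme), so two resource types can use the same parameter name against one
shared settings source. The classes NamespacedResource, TokenizerResource and
TaggerResource are hypothetical, and the Map stands in for the external
settings object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: keys are namespaced by the concrete class name.
abstract class NamespacedResource {
    private final Map<String, String> settings;

    NamespacedResource(Map<String, String> settings) {
        this.settings = settings;
    }

    // Builds the unique key for one of this resource's parameters.
    String key(String name) {
        return getClass().getName() + "." + name;
    }

    // Reads a mandatory setting using the namespaced key.
    String setting(String name) {
        String value = settings.get(key(name));
        if (value == null) {
            throw new IllegalStateException("Missing setting: " + key(name));
        }
        return value;
    }
}

class TokenizerResource extends NamespacedResource {
    final String model;

    TokenizerResource(Map<String, String> settings) {
        super(settings);
        this.model = setting("model");
    }
}

class TaggerResource extends NamespacedResource {
    final String model;

    TaggerResource(Map<String, String> settings) {
        super(settings);
        this.model = setting("model");
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put(TokenizerResource.class.getName() + ".model", "tokenizer.bin");
        settings.put(TaggerResource.class.getName() + ".model", "tagger.bin");

        // Both resources ask for "model" but read different keys.
        System.out.println(new TokenizerResource(settings).model);
        System.out.println(new TaggerResource(settings).model);
    }
}
```

This separates resource types only; two instances of the same class would
still read the same keys, which is the disambiguation question raised
elsewhere in this thread.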

--------------
Missing from this scenario is one of the things UIMA (but not uimaFIT) can offer as a "feature": the externalization of information about configuration parameters into metadata.  Note that this could be added: the external resource could use the resourceMetaData XML element to specify configuration parameter definitions.  The UIMA framework currently ignores these, but other tooling could make use of them, if that seems desirable.


Hoping that I've not overlooked some basic flaw,

-Marshall

On 10/19/2016 4:17 AM, Richard Eckart de Castilho wrote:
> Hi Marshall,
>
>> On 18.10.2016, at 23:28, Marshall Schor <ms...@schor.com> wrote:
>>
>> Several Jiras talk about wanting external shared resources that can be
>> configured using the standard UIMA configuration settings and parameters.
>>
>> For example: https://issues.apache.org/jira/browse/UIMA-2979.
>>
>> Currently, there is one UIMA external resource descriptor that supports this,
>> which is the configurableDataResourceSpecifier.  It has a resourceMetaData
>> sub-element, which, in turn, has the configurationParameters and
>> configurationParameterSettings elements.
>>
>> There is an alternate, simpler approach to configuring external resources, now
>> available (as of UIMA 2.9.0) - using direct access to the External Configuration
>> parameters.  See section 2.4.3.5 in the UIMA Reference book.  The idea is that
>> each External Resource would define its own set of key-value parameter settings,
>> and retrieve them at run time.
> Could you comment how to realize the scenarios illustrated in [1] could be implemented
> using the External Configuration parameters? The point in these scenarios is
> having complex (potentially nested) objects that are composed and used to customize
> the behavior of one or more components. The objects take parameters and contain logic.
>
> Best,
>
> -- Richard
>
> [1] https://issues.apache.org/jira/browse/UIMA-2903?focusedCommentId=13708539&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13708539
>
>


Re: UIMA shared external resources with config parameters

Posted by Richard Eckart de Castilho <re...@apache.org>.
Hi Marshall,

> On 18.10.2016, at 23:28, Marshall Schor <ms...@schor.com> wrote:
> 
> Several Jiras talk about wanting external shared resources that can be
> configured using the standard UIMA configuration settings and parameters.
> 
> For example: https://issues.apache.org/jira/browse/UIMA-2979.
> 
> Currently, there is one UIMA external resource descriptor that supports this,
> which is the configurableDataResourceSpecifier.  It has a resourceMetaData
> sub-element, which, in turn, has the configurationParameters and
> configurationParameterSettings elements.
> 
> There is an alternate, simpler approach to configuring external resources, now
> available (as of UIMA 2.9.0) - using direct access to the External Configuration
> parameters.  See section 2.4.3.5 in the UIMA Reference book.  The idea is that
> each External Resource would define its own set of key-value parameter settings,
> and retrieve them at run time.

Could you comment on how the scenarios illustrated in [1] could be implemented
using the External Configuration parameters? The point in these scenarios is
having complex (potentially nested) objects that are composed and used to customize
the behavior of one or more components. The objects take parameters and contain logic.

Best,

-- Richard

[1] https://issues.apache.org/jira/browse/UIMA-2903?focusedCommentId=13708539&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13708539


Re: UIMA shared external resources with config parameters

Posted by Peter Klügl <pe...@averbis.com>.
Hi,


On 18.10.2016 at 23:28, Marshall Schor wrote:
> ...
>
> I'm thinking it would not be worth the trouble to add support to allow full
> support for UIMA Configuration Parameters and settings in External Resources,
> because I think the (now available) External Configuration capabilities provide
> sufficient capability for this, and fit better with the basic idea of External
> Resources as being shared objects for a pipeline (or set of pipelines).
>
> I'd be interested to hear other views, especially if you think External
> Resources is adequate, or insufficient for this.
>

I don't know yet - I have to take a closer look first.

Does this mean that the parameters still remain string-only?

Peter