Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2008/11/17 21:58:01 UTC

[DISCUSS] Changing the design for Pear wrappers and Pear-specified "environment variables"

Pears are UIMA analysis components that are wrapped/packaged to include
the classpath & data path needed to run them.  A Pear can be run as a
component in an Aggregate.

When it is run, the current Java Classpath and data path are switched
from their current values to the values specified in the Pear, while the
Annotator is running, and then switched back.

One of the things that the Pear specification allows is the setting of
"environment variables".  This capability was initially put in for
legacy applications, many of which were written in non-Java languages,
to allow them to run unmodified, if they depended on some environmental
variables to be set.  The initial "runner" for these kinds of Pears was
thought to be special repository code, which might use a web interface
to allow users to specify analysis components to run against some data. 
UIMA itself did not specify how these environmental variable settings
were to be used.

The current Pear Wrapper implementation, which runs Pears within a UIMA
pipeline, takes these environmental variable setting specifications from
the Pear and, because in Java there is no way to "set" environmental
variables, instead uses them to set Java system properties (of the same
name).  So, if the Pear specification had a setting LDPATH with the
value "c:\a\b\c", the Pear Wrapper would set a Java System Property
"LDPATH" to the string value "c:\a\b\c".  Except for values of CLASSPATH
and DATAPATH, all the other environment variables specified in the Pear
are used to set System Properties.

If a system property being set already exists and has a value different
from the value being set, a WARNING message is logged, but the code
currently overrides the previous setting.
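
As a rough illustration of the behavior described above, here is a minimal
sketch. It is not the actual Pear Wrapper code: the class name PearEnvSketch
and the method applyPearEnv are made up for this example, and the Pear
settings are assumed to arrive as a simple name/value map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Logger;

// Illustrative only: class and method names are hypothetical, not UIMA API.
class PearEnvSketch {
    private static final Logger LOG = Logger.getLogger(PearEnvSketch.class.getName());

    // Copies Pear-specified "environment variables" into Java system properties.
    static void applyPearEnv(Map<String, String> pearEnv) {
        for (Map.Entry<String, String> entry : pearEnv.entrySet()) {
            String name = entry.getKey();
            String value = entry.getValue();
            // CLASSPATH and DATAPATH are handled separately and never become properties.
            if (name.equals("CLASSPATH") || name.equals("DATAPATH")) {
                continue;
            }
            String previous = System.getProperty(name);
            if (previous != null && !previous.equals(value)) {
                LOG.warning("Overriding system property " + name
                        + ": '" + previous + "' -> '" + value + "'");
            }
            // The existing value is replaced even when it differs.
            System.setProperty(name, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> pearEnv = new LinkedHashMap<>();
        pearEnv.put("LDPATH", "c:\\a\\b\\c");
        pearEnv.put("CLASSPATH", "some.jar"); // skipped, not turned into a property
        applyPearEnv(pearEnv);
        System.out.println(System.getProperty("LDPATH"));
    }
}
```

Note that System.setProperty affects the whole JVM, so the value stays
visible to every other component unless something resets it.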

This behavior seems dangerous - it allows any Pear to override arbitrary
Java System properties.  If there is not a good use case for it, I think
it should be removed from the Pear Wrapper behavior, or at least it
should be put under the control of a "permission" value which would by
default be set to not allow this.  Would this break any existing code if
we did this? 
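
To make the "permission" idea concrete, here is one possible shape for it,
again only a sketch. The switch name example.pear.allowSystemProperties is
invented for this example and is not an existing UIMA setting; the class
and method names are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these names exist in UIMA.
class GatedPearEnvSketch {
    // Hypothetical switch name, invented for this example.
    static final String PERMISSION_PROPERTY = "example.pear.allowSystemProperties";

    // Returns the names of the properties that were actually set.
    static List<String> applyPearEnv(Map<String, String> pearEnv) {
        List<String> applied = new ArrayList<>();
        // Boolean.getBoolean is false when the property is unset, so the default is "deny".
        if (!Boolean.getBoolean(PERMISSION_PROPERTY)) {
            return applied;
        }
        for (Map.Entry<String, String> entry : pearEnv.entrySet()) {
            String name = entry.getKey();
            // CLASSPATH and DATAPATH are never turned into system properties.
            if (name.equals("CLASSPATH") || name.equals("DATAPATH")) {
                continue;
            }
            System.setProperty(name, entry.getValue());
            applied.add(name);
        }
        return applied;
    }

    public static void main(String[] args) {
        Map<String, String> pearEnv = new LinkedHashMap<>();
        pearEnv.put("LDPATH", "c:\\a\\b\\c");
        System.out.println("without permission: " + applyPearEnv(pearEnv));
        System.setProperty(PERMISSION_PROPERTY, "true");
        System.out.println("with permission: " + applyPearEnv(pearEnv));
    }
}
```

With a gate like this, a Pear that depends on such a property would fail
to find its setting unless the deployer explicitly opts in.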

-Marshall

Re: [DISCUSS] Changing the design for Pear wrappers and Pear-specified "environment variables"

Posted by Michael Baessler <mb...@michael-baessler.de>.
Thomas Hampp wrote:
> Marshall Schor <ms...@schor.com> wrote on 17.11.2008 21:58:01:
> 
> ...
>> The current Pear Wrapper implementation, which runs Pears within a UIMA
>> pipeline, takes these environmental variable setting specifications from
>> the Pear and, because in Java there is no way to "set" environmental
>> variables, instead uses them to set Java system properties (of the same
>> name).  So, if the Pear specification had a setting LDPATH with the
>> value "c:\a\b\c", the Pear Wrapper would set a Java System Property
>> "LDPATH" to the string value "c:\a\b\c".  Except for values of CLASSPATH
>> and DATAPATH, all the other environment variables specified in the Pear
>> are used to set System Properties.
>>
>> If a system property being set already exists and has a value different
>> from the value being set, a WARNING message is logged, but the code
>> currently overrides the previous setting.
>>
>> This behavior seems dangerous - it allows any Pear to override arbitrary
>> Java System properties.  If there is not a good use case for it, I think
>> it should be removed from the Pear Wrapper behavior, or at least it
>> should be put under the control of a "permission" value which would by
>> default be set to not allow this.  Would this break any existing code if
>> we did this? 
>>
>> -Marshall
> 
> This feature is indeed used and required to run some IBM-internal
> annotators.
> 
> Our use case is based on existing "legacy" technologies that are being
> wrapped in UIMA but cannot (yet) be fully reengineered to meet UIMA
> conventions. The existing library we are using uses a system property
> to locate its resources. Removing this option would break each
> deployment of this wrapped annotator.
> 
> Of course using a system property is a limiting design choice. The problem 
> is not so much with name clashes. A well chosen name including a namespace 
> makes this as (un-)likely as name clashes in Java code. But the main 
> problem is that the library can not be instantiated twice within the same 
> JVM with different configuration. Future versions of this technology will 
> remove the dependency.
> 
> I am not sure how much sense a "permission property" makes. As I see it 
> the real risk is not so much in harmful interaction with other code in the 
> JVM (most names are chosen with a good enough name space). And setting the 
> permission to false will just break the annotator that relies on the 
> property.
> 
> I think use of this feature should be clearly discouraged. The PEAR code 
> already issues a warning. Maybe the documentation can be improved.
> 
> So definitely this should not be used for new code. But wrapping existing 
> technology is common. And sometimes the wrapping party does not have full 
> source code control, or it may not be permissible to change the code 
> (license issues) or the effort of a code change can not be contained. 
> Since open, flexible, heterogeneous integration is one of UIMA's core
> features, that should also include "legacy code" integration features like
> this one.
> 
> 
> -Thomas


I absolutely agree with Thomas. Since we allow annotators to work with
system properties / env variables and allow them to be specified in the
PEAR install descriptor, we should also support this feature in the UIMA
PEAR runtime. Otherwise the PEAR runtime has limitations and does not
fully support all PEAR packages. Maybe we can think of a
configuration parameter for UIMA that turns this feature on/off in the
future. So, for example, by default these additional variables are not
recognized in the PEAR runtime until a special parameter is available.
But maybe this makes things more complicated.

-- Michael



Re: [DISCUSS] Changing the design for Pear wrappers and Pear-specified "environment variables"

Posted by Thomas Hampp <th...@de.ibm.com>.
Marshall Schor <ms...@schor.com> wrote on 17.11.2008 21:58:01:

...
> The current Pear Wrapper implementation, which runs Pears within a UIMA
> pipeline, takes these environmental variable setting specifications from
> the Pear and, because in Java there is no way to "set" environmental
> variables, instead uses them to set Java system properties (of the same
> name).  So, if the Pear specification had a setting LDPATH with the
> value "c:\a\b\c", the Pear Wrapper would set a Java System Property
> "LDPATH" to the string value "c:\a\b\c".  Except for values of CLASSPATH
> and DATAPATH, all the other environment variables specified in the Pear
> are used to set System Properties.
> 
> If a system property being set already exists and has a value different
> from the value being set, a WARNING message is logged, but the code
> currently overrides the previous setting.
> 
> This behavior seems dangerous - it allows any Pear to override arbitrary
> Java System properties.  If there is not a good use case for it, I think
> it should be removed from the Pear Wrapper behavior, or at least it
> should be put under the control of a "permission" value which would by
> default be set to not allow this.  Would this break any existing code if
> we did this? 
> 
> -Marshall

This feature is indeed used and required to run some IBM-internal
annotators.

Our use case is based on existing "legacy" technologies that are being
wrapped in UIMA but cannot (yet) be fully reengineered to meet UIMA
conventions. The existing library we are using uses a system property
to locate its resources. Removing this option would break each
deployment of this wrapped annotator.

Of course using a system property is a limiting design choice. The problem 
is not so much with name clashes. A well chosen name including a namespace 
makes this as (un-)likely as name clashes in Java code. But the main 
problem is that the library can not be instantiated twice within the same 
JVM with different configuration. Future versions of this technology will 
remove the dependency.

I am not sure how much sense a "permission property" makes. As I see it 
the real risk is not so much in harmful interaction with other code in the 
JVM (most names are chosen with a good enough name space). And setting the 
permission to false will just break the annotator that relies on the 
property.

I think use of this feature should be clearly discouraged. The PEAR code 
already issues a warning. Maybe the documentation can be improved.

So definitely this should not be used for new code. But wrapping existing 
technology is common. And sometimes the wrapping party does not have full 
source code control, or it may not be permissible to change the code 
(license issues) or the effort of a code change can not be contained. 
Since open, flexible, heterogeneous integration is one of UIMA's core
features, that should also include "legacy code" integration features like
this one.


-Thomas