Posted to dev@uima.apache.org by Lev Kozakov <le...@gmail.com> on 2007/05/01 18:14:40 UTC

UIMA PEAR utilization issues

Recently, Michael proposed adding 'UIMA PEAR runtime' capabilities to
automatically install PEARs and run the encapsulated analytics. In
connection with that discussion, I would like to share some experience
with using PEARs in UIMA and start a broader discussion on this topic.
The issues I describe below were identified in the course of
developing the 2nd generation of PEARs (in 2005-2006).

The PEAR format was introduced as a convenient vehicle for packaging
and distributing UIMA analytics and other resources. A PEAR package
must include an installation descriptor file containing the PEAR
identification, a reference to the main UIMA descriptor, runtime
settings and other information.
A PEAR package has 3 different states: (1) ready for packing/archiving,
(2) packed as a zip (*.pear) archive and (3) installed in a local file
system. In state 1, the package is unpacked but may not be usable
for UIMA deployment, because its installation descriptor, as well as
other descriptor/configuration files, may contain $main_root
expressions. In state 3, the package is ready for UIMA deployment.
The transition from state 1 to state 2 (zipping) does not modify the
package contents, while the transition from state 2 to state 3
(installation) may irreversibly modify the package contents by
localizing the installation descriptor and other
descriptor/configuration files (replacing $main_root with an absolute
path). The installation/localization step is necessary for using a
PEAR in UIMA.
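
To make the localization step concrete, here is a minimal sketch of
what replacing $main_root amounts to. It is not the actual PEAR
installer code: the class and method names are hypothetical, and only
the $main_root macro itself comes from the description above.

```java
import java.nio.file.Path;

// Hypothetical helper, for illustration only (not part of the UIMA API).
final class PearLocalizerSketch {
    // The macro named in the post; it is replaced during installation.
    static final String MAIN_ROOT = "$main_root";

    /**
     * Returns the descriptor text with every literal $main_root
     * occurrence replaced by the absolute installation directory.
     */
    static String localize(String descriptorText, Path installDir) {
        String root = installDir.toAbsolutePath().normalize()
                .toString().replace('\\', '/');
        // String.replace does literal (non-regex) replacement,
        // so the '$' needs no escaping.
        return descriptorText.replace(MAIN_ROOT, root);
    }
}
```

Calling localize("$main_root/lib/annotator.jar", installDir) yields a
path that is valid only on the machine where the PEAR was installed,
which is why the transition to state 3 cannot be undone without
keeping the original files.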

As you can see from the previous paragraph, the current PEAR format
has the following issues:
 1. The installation descriptor file contains several different kinds
of data: component identification, runtime settings, etc. In the 2nd
generation of PEAR we proposed separating the component identification
part from the runtime settings. There are multiple reasons for doing
this, but I would like to mention only one: the component
identification is not modifiable, while the runtime settings may
contain $main_root expressions and will be modified during the
localization step.
 2. The presence of $main_root expressions in component
descriptor/configuration files makes the PEAR package (in state 1)
unusable for UIMA deployment. As a result, the package cannot be
validated with standard UIMA tooling before it is packed. To get rid
of this limitation we proposed completely removing $main_root
expressions from the package files, including the runtime settings.
This requires using only relative paths or import by name in component
descriptors, and modifying the PEAR API to convert relative paths into
absolute paths for the runtime settings.
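
As a rough sketch of the second proposal, the conversion from relative
to absolute paths could happen at load time, leaving the package files
untouched. The class and method names below are hypothetical and do
not correspond to the existing PEAR API. Rejecting entries that escape
the installation root is an extra assumption, not part of the proposal.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of the UIMA API).
final class RelativePathResolverSketch {
    /**
     * Resolves package-relative entries (e.g. classpath items) against
     * the installation root without modifying any package file.
     */
    static List<String> resolve(Path installRoot, List<String> relativeEntries) {
        Path root = installRoot.toAbsolutePath().normalize();
        List<String> resolved = new ArrayList<>();
        for (String entry : relativeEntries) {
            Path p = Paths.get(entry);
            if (p.isAbsolute()) {
                throw new IllegalArgumentException("absolute path not allowed: " + entry);
            }
            Path abs = root.resolve(p).normalize();
            // Assumption: entries must stay inside the installed package.
            if (!abs.startsWith(root)) {
                throw new IllegalArgumentException("path escapes package root: " + entry);
            }
            resolved.add(abs.toString());
        }
        return resolved;
    }
}
```

Because nothing is rewritten on disk, the same unpacked directory
could serve both as the state 1 package and as the installed package.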

Yet another point for discussion is the necessity of installing the
same PEAR again and again for each instantiation of a UIMA component,
as stated in the 'UIMA PEAR runtime' proposal. In the 2nd generation
of PEAR we proposed a kind of local registry, which keeps track of
locally installed PEARs.
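
One possible shape for such a registry is sketched below. This is an
in-memory illustration under my own assumptions (a map from component
ID to installation directory, with hypothetical names); it is not the
design from the 2nd generation work, and a real registry would need to
be persisted.

```java
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry, for illustration only (not part of the UIMA API).
final class LocalPearRegistrySketch {
    // Component ID -> directory where that PEAR is installed.
    private final Map<String, Path> installed = new ConcurrentHashMap<>();

    /**
     * Returns the installation directory for the component, running the
     * supplied installer only if the component is not yet registered.
     */
    Path getOrInstall(String componentId, Function<String, Path> installer) {
        return installed.computeIfAbsent(componentId, installer);
    }

    boolean isInstalled(String componentId) {
        return installed.containsKey(componentId);
    }
}
```

With this in place, a second instantiation of the same component would
reuse the registered directory instead of installing the PEAR again.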

In general, I would like to start a discussion on possible ways of
improving the processes and API for packaging, managing and deploying
analytics in UIMA.

-- Lev