
Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2011/07/20 00:51:52 UTC

OSGi versions of Add-on Annotators

Moving this from the RC4 release discussion to a new thread ...

I've now tried the following:

Change the build instructions so

a) the dependency goal doesn't unpack the jars
b) the OSGi build instruction doesn't say to "inline" the jars.

The result: it builds with no error messages, and the resulting bundle contains
lots of Jars at the top level, plus a META-INF directory, and nothing else.
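
Changes a) and b) above could look roughly like the following in the
maven-bundle-plugin configuration.  This is only a sketch, not the actual pom
from the add-ons build; the scope filter is an assumption.

```xml
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- Embed each dependency as a whole Jar (inline=false) instead of
           unpacking its contents into the bundle. -->
      <Embed-Dependency>*;scope=compile|runtime;inline=false</Embed-Dependency>
      <!-- Also embed transitive dependencies; this is what pulls in the
           long list reported by mvn dependency:tree. -->
      <Embed-Transitive>true</Embed-Transitive>
    </instructions>
  </configuration>
</plugin>
```

With these instructions the plugin places the embedded Jars at the bundle root
by default and adds them to Bundle-ClassPath, so files with the same name in
different Jars (LICENSE, NOTICE, plugin.xml) no longer overlay one another.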

I tried this with the ConfigurableFeatureExtractor-osgi project.  The Jars that
get included are all the ones shown by running mvn dependency:tree in the
project directory.  There are many Jars, including all of Ant, the Ant-launcher
Jar, a bunch of Eclipse Jars including things like org.eclipse.core.jobs, the
JUnit Jar, and more.  Many of these are unnecessary, I think, and including them
in the distribution means extra work for us to verify that the appropriate
LICENSE/NOTICE files are created.

It seems very unlikely that the CFE normally needs all of these to run.  Running
mvn dependency:tree on the non-OSGi CFE does not show the Ant dependency - I'll
have to track down why those are different.  Looking at the 2.3.0 release, the
non-OSGi CFE had many fewer included Jars.

I think we can fix the build instructions to exclude the unneeded Jars.
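
One way to do that exclusion is the artifactId filter of the Embed-Dependency
instruction.  A sketch, where the artifactIds listed are assumptions based on
the Jars named above and would need to be checked against the real dependency
tree:

```xml
<instructions>
  <!-- Embed compile/runtime dependencies as Jars, except the artifactIds
       after the "!" (assumed names for the Ant, Ant-launcher and JUnit Jars). -->
  <Embed-Dependency>*;scope=compile|runtime;inline=false;artifactId=!ant|ant-launcher|junit</Embed-Dependency>
  <Embed-Transitive>true</Embed-Transitive>
</instructions>
```

Fixing the scope of the offending dependencies in the pom (for example, test
scope for JUnit) would have the same effect for those Jars, since only compile
and runtime scopes are embedded here.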

I don't have any setup for testing the OSGi packaged artifacts - if anyone else
does, let's figure out how to test these - either by collaborating or by helping
me learn how to set up something locally to test this packaging result.

-Marshall







On 7/19/2011 3:44 PM, Marshall Schor wrote:
> Thanks, Richard.
>
> I think you are right - some of the dependencies (for example, the
> AlchemyApiAnnotator depends on Apache commons-digester, etc.) don't have OSGi
> packagings.
>
> The build strategy for the OSGi modules currently gets all the dependencies and
> unpacks them into .../target/classes directory, where a later step "jars" them up.
>
> This approach overlays files being unzipped, with later versions.  Some examples
> where this might be an issue:
> There is at the top level a license directory, containing one "LICENSE" file.
> There is at the top level a "plugin.xml" file.
> There is at the top level a META-INF dir, with LICENSE and NOTICE files among
> other things.
>
> Perhaps it would be better to package the dependencies that are not OSGi in a
> way that doesn't need to unpack, and then potentially overlay, files.
>
> It seems that OSGi and the bundle plugin support this, via the Embed-Dependency
> instruction.  Is there a reason we're not using that, instead of the "unpacking"
> approach?
>
> -Marshall
>
> On 7/19/2011 11:17 AM, Richard Eckart de Castilho wrote:
>> I wanted to package the DKPro Core UIMA modules as OSGi bundles. These have lots of dependencies on various JARs that are not available as OSGi bundles and sometimes not even available in public Maven repositories - this is why we set up a public repository of our own for the moment. It may be less of an issue for the UIMA sandbox, as the individual components may not depend on third-party libraries.
>>
>> Looking at the Add Ons repository, I would suspect that Tika, Solr, Rhino, BeanShell and maybe some of the Apache Commons JARs may not be OSGi bundles.
>>
>> I guess you aim for a mixed setup where some dependencies (namely UIMA) are imported via package-imports and others (namely the above) are packaged inside the bundles?
>>
>> Cheers,
>>
>> Richard
>>
>> Am 19.07.2011 um 17:08 schrieb Marshall Schor:
>>
>>> I suspect that the Jars are now available as OSGi bundles; do you know of
>>> specific ones that are not?
>>>
>>> Thanks. -Marshall
>>>
>>> On 7/19/2011 10:24 AM, Richard Eckart de Castilho wrote:
>>>> Hi Marshall,
>>>>
>>>> I am very interested in this. Some time back I mostly gave up on packaging UIMA components as OSGi bundles because of this. If you do not bundle all jars (*yikes*) and use package imports instead, the question is: where do the dependencies come from? Who prepares the bundles and who installs them? Many JARs are not available as OSGi bundles.
>>>>
>>>> -- Richard
>>>>
>>>> Am 19.07.2011 um 16:13 schrieb Marshall Schor:
>>>>
>>>>> I'll take a look at the OSGi build.
>>>>>
>>>>> -Marshall
>>>>>
>>>>> On 7/17/2011 12:16 PM, Marshall Schor wrote:
>>>>>> Since this is (I think) the first time we're releasing the OSGi packaging of the
>>>>>> annotators, I think some work on their license/notice files might be needed,
>>>>>> because:
>>>>>>
>>>>>> - there are duplicate License files - one at the top level, and one in the
>>>>>> META-INF directory
>>>>>> - both of these are the plain vanilla license files.  For projects which are
>>>>>> incorporating other libraries which are under licenses other than the Apache
>>>>>> v2.0 license, those licenses have to be included.
>>>>>> - the NOTICE file is present in the META-INF directory, but is the plain one,
>>>>>> rather than the project specific one.
>>>>>>
>>>>>> Finally, I wonder if the OSGi packaging strategy is correct - in that it
>>>>>> "bundles" every dependency into the OSGi file.  This certainly makes the file
>>>>>> easier to use, but if a user uses 2 OSGi components from UIMA, won't there be a
>>>>>> lot of unnecessary duplication (or does OSGi notice this and avoid it somehow)? 
>>>>>> I'm not sure of an alternative, but I do recall that OSGi allows for
>>>>>> dependencies on other packages; perhaps that could be useful?
>>>>>>
>>>>>> -Marshall
>>>>>>
>>>>>> On 7/15/2011 8:29 AM, Tommaso Teofili wrote:
>>>>>>> Hi all,
>>>>>>> I've prepared the new RC (4) for UIMA Addons release.
>>>>>>>
>>>>>>> The following is a list of issues addressed in this release:
>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310570&version=12316093
>>>>>>>
>>>>>>> The source zip and binary files are available here:
>>>>>>> http://people.apache.org/~tommaso/uima-addons-2.3.1-rc4
>>>>>>>
>>>>>>> SVN Tag Checkout:
>>>>>>> svn co
>>>>>>> http://svn.apache.org/repos/asf/uima/addons/tags/uima-addons-2.3.1-rc4/
>>>>>>>
>>>>>>> Please cast your vote for UIMA Addons 2.3.1 release:
>>>>>>>
>>>>>>> [ ] +1 Approve the release
>>>>>>> [ ] -1 Veto the release (please provide specific comments)
>>>>>>> [ ] 0   Don't care
>>>>>>>
>>>>>>> Regards,
>>>>>>> Tommaso
>>>>>>>
>> Richard Eckart de Castilho
>>

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

2011/7/25 Marshall Schor <ms...@schor.com>

> Well, something is going on which I don't understand, if the OSGi packaged
> annotators are working as currently packaged, that is, without including the
> UIMA Jars inside every annotator packaging, but rather having them provided in a
> separate OSGi bundle.
>
> This is because of the following thought experiment:
>
> The UIMA bundle has code which loads things using a configuration file (the XML
> descriptor for the Annotator, for instance).  A typical piece of "application"
> code, running in a 3rd bundle (the other 2, being the UIMA framework, and the
> Annotator bundle), would invoke UIMA bundle things to read the XML descriptor,
> and then do some kind of UIMA bundle call to "instantiate" the analysis engine.
> This operation involves having the UIMA framework code (in its bundle) loading
> the classes (out of the Annotator bundle), but without the Eclipse-buddy kind of
> thing, the UIMA bundle code can't "see" the Annotator bundle classes.
>
> Tommaso, can you describe what the application code is doing differently from my
> thought experiment above, which makes things work (or where my thinking is
> off-base)?
>

To make things work in the past without the Eclipse-buddy policy, I created a
'utils' bundle which takes care of such class loading issues (i.e. allowing
the use of delegate classloaders to get rid of the problem you described
above). However, this was prior to the change in addons-osgi-runtime which
removed the uimaj-ep-runtime from the annotators' dependencies, so this
should be tested again with the new configuration.
I do think it's better to clean things up first, as you're proposing in the
'redo of the OSGi packaging poms' thread, before doing other tests which
could suddenly be invalidated by other changes :-)
Tommaso



>
> -Marshall
>
> On 7/25/2011 5:35 AM, Tommaso Teofili wrote:
> > 2011/7/25 Jörn Kottmann <ko...@gmail.com>
> >
> >> On 7/22/11 8:37 PM, Marshall Schor wrote:
> >>
> >>> I found a few things, concerning Eclipse-buddyPolicy on the web.  [1] describes
> >>> it, [2] says it isn't implemented in Felix and won't be, because it causes
> >>> problems (with GC and other things), [3] is a general discussion of workarounds.
> >>>
> >>> We actually put one of these kinds of workarounds into UIMA for logging [4],
> >>> although not in an OSGi context.  However [5] notes that that workaround has
> >>> issues, and that the proposal for adding this to OSGi itself, "fell out of the
> >>> specification for OSGi R4 V4.2 core specification."
> >>>
> >>> I think this needs more careful thinking:-)  ...
> >>>
> >> So I guess the best option for us now, is to not release any
> >> OSGi annotators, right?
> >>
> >>
> > The current addons-osgi-runtime addons are working with Felix in Clerezza at
> > the moment (made some tests in the weekend)
> > With regards to Marshall's and Jörn's comments I think we should drop the
> > Eclipse-BuddyPolicy header, I remember the OSGi versions of annotators
> > worked without it in the past.
> > I can spend some more time on testing them removing the Eclipse buddy-policy
> > stuff.
> > After the amount of work spent so far with them I would be happy to put them
> > inside the release (if there's consensus obviously); then we can refine
> > things iteratively along future releases.
> > Regards,
> > Tommaso
> >
>

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

Well, something is going on which I don't understand, if the OSGi packaged
annotators are working as currently packaged, that is, without including the
UIMA Jars inside every annotator packaging, but rather having them provided in a
separate OSGi bundle.

This is because of the following thought experiment:

The UIMA bundle has code which loads things using a configuration file (the XML
descriptor for the Annotator, for instance).  A typical piece of "application"
code, running in a 3rd bundle (the other 2, being the UIMA framework, and the
Annotator bundle), would invoke UIMA bundle things to read the XML descriptor,
and then do some kind of UIMA bundle call to "instantiate" the analysis engine. 
This operation involves having the UIMA framework code (in its bundle) loading
the classes (out of the Annotator bundle), but without the Eclipse-buddy kind of
thing, the UIMA bundle code can't "see" the Annotator bundle classes.
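
For reference, the "Eclipse-buddy kind of thing" is a pair of Equinox-specific
manifest headers.  A minimal sketch of how they might be set through the
maven-bundle-plugin instructions follows; the symbolic name
org.apache.uima.runtime is an assumption about how the framework bundle is
named, and these headers have no effect in containers that don't implement
buddy loading.

```xml
<!-- In the UIMA framework bundle's pom: allow registered buddies to
     supply classes that this bundle cannot otherwise see. -->
<instructions>
  <Eclipse-BuddyPolicy>registered</Eclipse-BuddyPolicy>
</instructions>

<!-- In each annotator bundle's pom: register as a buddy of the framework
     bundle. "org.apache.uima.runtime" is an assumed Bundle-SymbolicName. -->
<instructions>
  <Eclipse-RegisterBuddy>org.apache.uima.runtime</Eclipse-RegisterBuddy>
</instructions>
```

Without something like this (or an application-supplied class loader), the
framework bundle only sees the packages it imports, which is the visibility
problem described above.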

Tommaso, can you describe what the application code is doing differently from my
thought experiment above, which makes things work (or where my thinking is off-base)?

-Marshall

On 7/25/2011 5:35 AM, Tommaso Teofili wrote:
> 2011/7/25 Jörn Kottmann <ko...@gmail.com>
>
>> On 7/22/11 8:37 PM, Marshall Schor wrote:
>>
>>> I found a few things, concerning Eclipse-buddyPolicy on the web.  [1] describes
>>> it, [2] says it isn't implemented in Felix and won't be, because it causes
>>> problems (with GC and other things), [3] is a general discussion of workarounds.
>>>
>>> We actually put one of these kinds of workarounds into UIMA for logging [4],
>>> although not in an OSGi context.  However [5] notes that that workaround has
>>> issues, and that the proposal for adding this to OSGi itself, "fell out of the
>>> specification for OSGi R4 V4.2 core specification."
>>>
>>> I think this needs more careful thinking:-)  ...
>>>
>> So I guess the best option for us now, is to not release any
>> OSGi annotators, right?
>>
>>
> The current addons-osgi-runtime addons are working with Felix in Clerezza at
> the moment (made some tests in the weekend)
> With regards to Marshall's and Jörn's comments I think we should drop the
> Eclipse-BuddyPolicy header, I remember the OSGi versions of annotators
> worked without it in the past.
> I can spend some more time on testing them removing the Eclipse buddy-policy
> stuff.
> After the amount of work spent so far with them I would be happy to put them
> inside the release (if there's consensus obviously); then we can refine
> things iteratively along future releases.
> Regards,
> Tommaso
>

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

2011/7/25 Jörn Kottmann <ko...@gmail.com>

> On 7/25/11 11:35 AM, Tommaso Teofili wrote:
>
>> The current addons-osgi-runtime addons are working with Felix in Clerezza at
>> the moment (made some tests in the weekend)
>> With regards to Marshall's and Jörn's comments I think we should drop the
>> Eclipse-BuddyPolicy header, I remember the OSGi versions of annotators
>> worked without it in the past.
>>
>
> What is the addons-osgi-runtime doing?
>
>
The UIMA annotators are being used behind a REST service to extract tags from
external pages and then save the enriched page as an RDF graph.
Tommaso

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/25/11 11:35 AM, Tommaso Teofili wrote:
> The current addons-osgi-runtime addons are working with Felix in Clerezza at
> the moment (made some tests in the weekend)
> With regards to Marshall's and Jörn's comments I think we should drop the
> Eclipse-BuddyPolicy header, I remember the OSGi versions of annotators
> worked without it in the past.

What is the addons-osgi-runtime doing?

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

2011/7/25 Jörn Kottmann <ko...@gmail.com>

> On 7/22/11 8:37 PM, Marshall Schor wrote:
>
>> I found a few things, concerning Eclipse-buddyPolicy on the web.  [1] describes
>> it, [2] says it isn't implemented in Felix and won't be, because it causes
>> problems (with GC and other things), [3] is a general discussion of workarounds.
>>
>> We actually put one of these kinds of workarounds into UIMA for logging [4],
>> although not in an OSGi context.  However [5] notes that that workaround has
>> issues, and that the proposal for adding this to OSGi itself, "fell out of the
>> specification for OSGi R4 V4.2 core specification."
>>
>> I think this needs more careful thinking:-)  ...
>>
>
> So I guess the best option for us now, is to not release any
> OSGi annotators, right?
>
>
The current addons-osgi-runtime addons are working with Felix in Clerezza at
the moment (I ran some tests over the weekend).
With regard to Marshall's and Jörn's comments, I think we should drop the
Eclipse-BuddyPolicy header; I remember that the OSGi versions of the annotators
worked without it in the past.
I can spend some more time testing them with the Eclipse buddy-policy stuff
removed.
After the amount of work spent on them so far, I would be happy to put them
in the release (if there's consensus, obviously); then we can refine
things iteratively over future releases.
Regards,
Tommaso

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/22/11 8:37 PM, Marshall Schor wrote:
> I found a few things, concerning Eclipse-buddyPolicy on the web.  [1] describes
> it, [2] says it isn't implemented in Felix and won't be, because it causes
> problems (with GC and other things), [3] is a general discussion of workarounds.
>
> We actually put one of these kinds of workarounds into UIMA for logging [4],
> although not in an OSGi context.  However [5] notes that that workaround has
> issues, and that the proposal for adding this to OSGi itself, "fell out of the
> specification for OSGi R4 V4.2
> core specification."
>
> I think this needs more careful thinking:-)  ...

So I guess the best option for us now, is to not release any
OSGi annotators, right?

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

I found a few things on the web concerning Eclipse-BuddyPolicy.  [1] describes
it, [2] says it isn't implemented in Felix and won't be, because it causes
problems (with GC and other things), and [3] is a general discussion of
workarounds.

We actually put one of these kinds of workarounds into UIMA for logging [4],
although not in an OSGi context.  However, [5] notes that that workaround has
issues, and that the proposal for adding this to OSGi itself "fell out of the
specification for OSGi R4 V4.2 core specification."

I think this needs more careful thinking :-) ...

-Marshall

[1] http://wiki.eclipse.org/index.php/Context_Class_Loader_Enhancements
[2] http://markmail.org/message/2smf5ntjhzwalkmm#query:+page:1+mid:epvakx66wz5auvmu+state:results
[3] http://njbartlett.name/2010/08/30/osgi-readiness-loading-classes.html
[4] https://issues.apache.org/jira/browse/UIMA-1714
[5] http://www.mail-archive.com/equinox-dev@eclipse.org/msg03304.html

On 7/20/2011 4:15 PM, Marshall Schor wrote:
> Here's my summary of this thread, so far:
>
> 1) there's lots of interest in OSGi for UIMA components (annotators, shared type
> systems, shared resources)
>
> 2) the current addon-osgi packaging should be altered:
>   2a) to have individual dependent Jars which are not otherwise available as
> OSGi bundles, included as Jars, not unpacked, so they don't overlay one another
>   2b) to add OSGi *dependencies* for the base UIMA framework, instead of
> including it in every bundle.
>   2c) to remove unneeded dependencies from the OSGi packaging (for instance, the
> many apparently extra Jars in the ConfigurableFeatureExtractor OSGi packaging).
>
> 3) longer-term: need to come up with a proposal for better OSGi support, as the
> current one only works with Eclipse-buddy kinds of things (which is not official
> OSGi).  This new support should be along the lines of enabling UIMA framework
> itself, and annotators, to be OSGi components, to be run within any one of a
> number of OSGi containers.
> Installation should be as easy as dropping a new annotator into a special spot
> in the file system :-) .
>
> 4) I didn't really hear a consensus on actually *releasing* the OSGi bundles of
> the add-ons; is there any real use expected of this kind of packaging with the
> current state of things?  Would it even work if we removed the UIMA framework
> itself from being included in each OSGi bundle of an add-on annotator?  Is it
> only usable with the Eclipse-buddy style of embedding?
>
> -Marshall
>
> On 7/19/2011 6:51 PM, Marshall Schor wrote:
>> Moving this from the RC4 release discussion to a new thread ...
>>
>> I've now tried the following:
>>
>> Change the build instructions so
>>
>> a) the dependency goal doesn't unpack the jars
>> b) the OSGi build instruction doesn't say to "inline" the jars.
>>
>> The result - it builds, no error messages, and has a result which includes lots
>> of Jars at the top level, plus a META-INF directory, and nothing else.
>>
>> I tried this with the ConfigurableFeatureExtractor-osgi project.  The Jars that
>> get included are all the ones shown by giving the command:  mvn dependency:tree 
>> in the project directory.  There are many jars, including all of Ant, the
>> Ant-launcher Jar, a bunch of eclipse jars including things like
>> org.eclipse.core.jobs, the junit jar, and more.  Many of these are unnecessary,
>> I think, and including them in the distribution causes us to work to verify the
>> appropriate LICENSEs/NOTICEs are created.
>>
>> It seems very unlikely that the CFE needs all these to run, normally.   Running
>> the non-OSGi CFE maven dependency:tree does not show the ant dependency - I'll
>> have to track down why those are different.  Looking at the 2.3.0 release, the
>> CFE non-OSGi had many fewer included Jars.
>>
>> I think we can fix the build instructions to excluded the unneeded Jars.
>>
>> I don't have any setup for testing the OSGi packaged artifacts - if anyone else
>> does, let's figure out how to test these - either by collaborating or by helping
>> me learn how to setup something locally to test this packaging result.
>>
>> -Marshall
>>
>>
>>
>>
>>
>>
>>
>> On 7/19/2011 3:44 PM, Marshall Schor wrote:
>>> Thanks, Richard.
>>>
>>> I think you are right - some of the dependencies (for example, the
>>> AlchemyApiAnnotator depends on Apache commons-digester, etc.) don't have OSGi
>>> packagings.
>>>
>>> The build strategy for the OSGi modules currently gets all the dependencies and
>>> unpacks them into .../target/classes directory, where a later step "jars" them up.
>>>
>>> This approach overlays files being unzipped, with later versions.  Some examples
>>> where this might be an issue:
>>> There is at the top level a license directory, containing one "LICENSE" file.
>>> There is at the top level a "plugin.xml" file.
>>> There is at the top level a META-INF dir, with LICENSE and NOTICE files among
>>> other things.
>>>
>>> Perhaps it would be better to package the dependencies that are not OSGi in a
>>> way that doesn't need to unpack, and then potentially overlay, files.
>>>
>>> It seems that OSGi and the bundle plugin support this, via the Embed-Dependency
>>> instruction.  Is there a reason we're not using that, instead of the "unpacking"
>>> approach?
>>>
>>> -Marshall
>>>
>>> On 7/19/2011 11:17 AM, Richard Eckart de Castilho wrote:
>>>> I wanted to package the DKPro Core UIMA modules as OSGi bundle. These have lots of dependencies on various JARs that are not available as OSGi bundles and sometimes not even available in public Maven repositories - this is why we set up a public repository of our own for the moment. It may be less an issue for the UIMA sandbox, as the individual components may not depend on third-party libraries. 
>>>>
>>>> Looking the Add Ons repository, I would suspect that Tika, Solr, Rhino, BeanShell and maybe some of the Apache Commons JARs may not be OSGi bundles. 
>>>>
>>>> I guess you aim for a mixed setup where some dependencies (namely UIMA) are imported via package-imports and others (namely the above) are packaged inside the bundles?
>>>>
>>>> Cheers,
>>>>
>>>> Richard
>>>>
>>>> Am 19.07.2011 um 17:08 schrieb Marshall Schor:
>>>>
>>>>> I suspect that the Jars are now available as OSGi bundles; do you know of
>>>>> specific ones that are not?
>>>>>
>>>>> Thanks. -Marshall
>>>>>
>>>>> On 7/19/2011 10:24 AM, Richard Eckart de Castilho wrote:
>>>>>> Hi Marshall,
>>>>>>
>>>>>> I am very interested in this. Some time back I mostly gave up on packaging UIMA components as OSGi bundles because of this. If you do not bundle all jars (*jikes*) and use package imports instead, the questions is: where do the dependencies come from? Who prepares the bundles and who installs them? Many JARs are not available as OSGi bundles.
>>>>>>
>>>>>> -- Richard
>>>>>>
>>>>>> Am 19.07.2011 um 16:13 schrieb Marshall Schor:
>>>>>>
>>>>>>> I'll take a look at the OSGi build.
>>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>> On 7/17/2011 12:16 PM, Marshall Schor wrote:
>>>>>>>> Since this is (I think) the first time we're releasing the OSGi packaging of the
>>>>>>>> annotators, I think some work on their license/notice files might be needed,
>>>>>>>> because:
>>>>>>>>
>>>>>>>> - there are duplicate License files - one at the top level, and one in the
>>>>>>>> META_INF directory
>>>>>>>> - both of these are the plain vanilla license files.  For projects which are
>>>>>>>> incorporating other libraries which are under other than the Apache v 2.0
>>>>>>>> license, those licenses have to be included.
>>>>>>>> - the NOTICE file is present in the META_INF directory, but is the plain one,
>>>>>>>> rather than the project specific one.
>>>>>>>>
>>>>>>>> Finally, I wonder if the OSGi packaging strategy is correct - in that it
>>>>>>>> "bundles" every dependency into the OSGi file.  This certainly makes the file
>>>>>>>> easier to use, but if a user uses 2 OSGi components from UIMA, won't there be a
>>>>>>>> lot of unnecessary duplication (or does OSGi notice this and avoid it somehow)? 
>>>>>>>> I'm not sure of an alternative, but I do recall that OSGi allows for
>>>>>>>> dependencies on other packages; perhaps that could be useful?
>>>>>>>>
>>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> On 7/15/2011 8:29 AM, Tommaso Teofili wrote:
>>>>>>>>> Hi all,
>>>>>>>>> I've prepared the new RC (4) for UIMA Addons release.
>>>>>>>>>
>>>>>>>>> The following is a list of issues addressed in this release:
>>>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310570&version=12316093
>>>>>>>>>
>>>>>>>>> The source zip and binary files are available here:
>>>>>>>>> http://people.apache.org/~tommaso/uima-addons-2.3.1-rc4
>>>>>>>>>
>>>>>>>>> SVN Tag Checkout:
>>>>>>>>> svn co
>>>>>>>>> http://svn.apache.org/repos/asf/uima/addons/tags/uima-addons-2.3.1-rc4/
>>>>>>>>>
>>>>>>>>> Please cast your vote for UIMA Addons 2.3.1 release:
>>>>>>>>>
>>>>>>>>> [ ] +1 Approve the release
>>>>>>>>> [ ] -1 Veto the release (please provide specific comments)
>>>>>>>>> [ ] 0   Don't care
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Tommaso
>>>>>>>>>
>>>> Richard Eckart de Castilho
>>>>

Re: OSGi versions of Add-on Annotators

Posted by florent andré <fl...@4sengines.com>.


On 07/20/2011 10:15 PM, Marshall Schor wrote:
> Here's my summary of this thread, so far:
>
...
>
> 4) I didn't really hear a consensus on actually *releasing* the OSGi bundles of
> the add-ons; is there any real use expected of this kind of packaging with the
> current state of things?

I am (not so easily) trying something with them! :)

> Would it even work if we removed the UIMA framework
> itself from being included in each OSGi bundle of an add-on annotator?

I don't know about the current state of things, but it's the way to go IMO.

> Is it
> only usable with the Eclipse-buddy style of embedding?

I use these bundles in Felix, actually.

++

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

On 7/21/2011 3:55 AM, Tommaso Teofili wrote:
> ... 
>> Would it even work if we removed the UIMA framework
>> itself from being included in each OSGi bundle of an add-on annotator?
>
> I successfully tried it with Apache Felix starting first the
> uimaj-ep-runtime then starting the annotators' bundles and then starting
> (and using the services of) a third bundle which depends on them.
>
> Is it
>> only usable with the Eclipse-buddy style of embedding?
>>
> on Apache Felix it worked also without the current configuration using
> Eclipse-buddy style:
>
>
>
>  <Eclipse-ExtensibleAPI>true</Eclipse-ExtensibleAPI>
>
>  <Eclipse-BuddyPolicy>registered</Eclipse-BuddyPolicy>
>
>
> thus using the default OSGi one.

I think a good test of this would be to regenerate the osgi annotator versions
without the UIMA jars, but importing those, and then redo these tests - just to
be sure things work.  (I'm worried that without the buddy policy, the copy of
the UIMA framework in its own separate bundle will load classes that can't be
"seen" by the annotators it's loading them for.  I'm worried that the previous
testing didn't pick this up, because maybe the way these annotators were used,
involved running the copy of the UIMA framework that was packaged in the same
bundle as each annotator, and not running the copy which was in the separate
bundle).

I'll do a Jira for the work to (a) change the build to leave the dependencies as
their own jars, instead of unpacking them, and (b) to remove UIMA framework
things from the annotator bundles - substituting an import-package for these.
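
A rough sketch of what (a) and (b) could look like together in an annotator's
bundle-plugin instructions.  It assumes the framework classes come from the
uimaj-core artifact and that all framework packages start with
org.apache.uima; neither has been checked against the add-on poms.

```xml
<instructions>
  <!-- (b) Import the UIMA framework packages from a separately installed
       bundle, plus whatever else the annotator code references. -->
  <Import-Package>org.apache.uima.*,*</Import-Package>
  <!-- (a) Embed the remaining dependencies as whole Jars, leaving out the
       framework Jar (artifactId assumed to be uimaj-core). -->
  <Embed-Dependency>*;scope=compile|runtime;inline=false;artifactId=!uimaj-core</Embed-Dependency>
</instructions>
```

Declaring the UIMA dependency with provided scope in the annotator pom would
be another way to keep it out of the embedded set, since only compile and
runtime scopes are embedded here.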

Then, I'm hoping someone can build this and test in Felix and Stanbol :-) .

-Marshall

> In the end my opinion is that we should fix the current bundles as
> highlighted by Marshall, include them in the 2.3.1 addons release and then
> define a roadmap for a better OSGi support (we could use the wiki to collect
> proposals/pros/cons/etc. ).
> My 2 cents,
> Tommaso

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

Thanks Marshall for summarizing things :)

2011/7/20 Marshall Schor <ms...@schor.com>

> Here's my summary of this thread, so far:
>
> 1) there's lots of interest in OSGi for UIMA components (annotators, shared type
> systems, shared resources)
>

it seems so and I wasn't aware of that


>
> 2) the current addon-osgi packaging should be altered:
>   2a) to have individual dependent Jars which are not otherwise available as
> OSGi bundles, included as Jars, not unpacked, so they don't overlay one another
>   2b) to add OSGi *dependencies* for the base UIMA framework, instead of
> including it in every bundle.
>   2c) to remove unneeded dependencies from the OSGi packaging (for instance, the
> many apparently extra Jars in the ConfigurableFeatureExtractor OSGi packaging).
>
>
+1


> 3) longer-term: need to come up with a proposal for better OSGi support, as the
> current one only works with Eclipse-buddy kinds of things (which is not official
> OSGi).  This new support should be along the lines of enabling UIMA framework
> itself, and annotators, to be OSGi components, to be run within any one of a
> number of OSGi containers.
> Installation should be as easy as dropping a new annotator into a special spot
> in the file system :-) .
>

cool :) we should work together to define an accurate proposal to implement
in a major version of UIMA


>
> 4) I didn't really hear a consensus on actually *releasing* the OSGi bundles of
> the add-ons; is there any real use expected of this kind of packaging with the
> current state of things?


As highlighted by Florent, they are used (as SNAPSHOT dependencies) in
Clerezza and Stanbol and, as far as I can see, they seem to be working.
Once we go through the steps defined in point 2), I'd add them to the current
release.


> Would it even work if we removed the UIMA framework
> itself from being included in each OSGi bundle of an add-on annotator?


I successfully tried it with Apache Felix starting first the
uimaj-ep-runtime then starting the annotators' bundles and then starting
(and using the services of) a third bundle which depends on them.

> Is it
> only usable with the Eclipse-buddy style of embedding?
>

On Apache Felix it also worked without the current configuration that uses the
Eclipse-buddy style:



 <Eclipse-ExtensibleAPI>true</Eclipse-ExtensibleAPI>

 <Eclipse-BuddyPolicy>registered</Eclipse-BuddyPolicy>


thus using the default OSGi one.

In the end, my opinion is that we should fix the current bundles as
highlighted by Marshall, include them in the 2.3.1 addons release, and then
define a roadmap for better OSGi support (we could use the wiki to collect
proposals/pros/cons/etc.).
My 2 cents,
Tommaso




Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 10:15 PM, Marshall Schor wrote:
> 3) longer-term: need to come up with a proposal for better OSGi support, as the
> current one only works with Eclipse-buddy kinds of things (which is not official
> OSGi).  This new support should be along the lines of enabling UIMA framework
> itself, and annotators, to be OSGi components, to be run within any one of a
> number of OSGi containers.
> Installation should be as easy as dropping a new annotator into a special spot
> in the file system:-)  .

IBM has published a paper about UIMA and OSGi. In case not everyone is
aware of it, here is the link:
http://domino.research.ibm.com/library/cyberdig.nsf/0/c77885d8151bed8d8525731c0067d980?OpenDocument

And our wiki also has a page about this topic:
https://cwiki.apache.org/UIMA/uima-osgi-enablement.html

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

Here's my summary of this thread, so far:

1) there's lots of interest in OSGi for UIMA components (annotators, shared type
systems, shared resources)

2) the current addon-osgi packaging should be altered:
  2a) to have individual dependent Jars which are not otherwise available as
OSGi bundles, included as Jars, not unpacked, so they don't overlay one another
  2b) to add OSGi *dependencies* for the base UIMA framework, instead of
including it in every bundle.
  2c) to remove unneeded dependencies from the OSGi packaging (for instance, the
many apparently extra Jars in the ConfigurableFeatureExtractor OSGi packaging).

3) longer-term: need to come up with a proposal for better OSGi support, as the
current one only works with Eclipse-buddy kinds of things (which is not official
OSGi).  This new support should be along the lines of enabling UIMA framework
itself, and annotators, to be OSGi components, to be run within any one of a
number of OSGi containers.
Installation should be as easy as dropping a new annotator into a special spot
in the file system :-) .

4) I didn't really hear a consensus on actually *releasing* the OSGi bundles of
the add-ons; is there any real use expected of this kind of packaging with the
current state of things?  Would it even work if we removed the UIMA framework
itself from being included in each OSGi bundle of an add-on annotator?  Is it
only usable with the Eclipse-buddy style of embedding?
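
As a rough illustration of point 2, the three changes might be sketched with
the bundle plugin's Embed-Dependency instruction as below. This is not the
actual add-ons configuration: the artifactId filter and the org.apache.uima.*
import pattern are assumptions chosen for the example.

```xml
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <configuration>
    <instructions>
      <!-- 2a: embed non-OSGi dependencies as whole Jars (inline=false),
           so their contents are not unpacked over one another. -->
      <!-- 2c: scope=compile|runtime leaves out test-scoped Jars. -->
      <!-- 2b: the artifactId filter keeps the UIMA framework Jars out;
           the names listed are illustrative, not a checked list. -->
      <Embed-Dependency>*;scope=compile|runtime;inline=false;artifactId=!uimaj-core|uimaj-ep-runtime</Embed-Dependency>
      <!-- 2b: resolve the framework from another bundle instead. -->
      <Import-Package>org.apache.uima.*,*</Import-Package>
    </instructions>
  </configuration>
</plugin>
```

By default only direct dependencies are embedded, so transitive Jars such as
the Ant and Eclipse ones would have to be requested explicitly rather than
arriving by accident.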

-Marshall


Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 11:22 AM, Richard Eckart de Castilho wrote:
> I always thought it would be nice if UIMA would define an extension point for such things.
>

Maybe something we should do for UIMA 3.0, because we might want to
break backward compatibility for it.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Eddie Epstein <ea...@gmail.com>.

On Wed, Jul 20, 2011 at 6:56 AM, Jörn Kottmann <ko...@gmail.com> wrote:
> On 7/20/11 12:15 PM, Richard Eckart de Castilho wrote:
>>
>> I don't think UIMA should mandate the use of OSGi.
>
> No, there shouldn't be a need for that, but we currently have certain
> limitations, which make it difficult to reuse UIMA components. And when
> things are translated to OSGi these parts don't fit in nicely.
>
> For example:
> A user wants to build an AAE out of three different AEs, made by three
> different
> vendors. Then he might run into issues with the type system, especially when
> one
> used JCas, he needs eventually to duplicate certain pieces of configuration,
> etc.
>
> I believe these things should be easier, and then it is also easier to
> properly support OSGi.
>
> Jörn
>

+1 to simplifying handling of type systems. Annotator specific JCas classes
make OSGi bundling particularly difficult.

Given that the type system is intended to communicate data between
components, it makes no sense to me that types can be defined in a specific
analysis engine descriptor. They should only be defined in shared objects.

If Type systems were first class objects, like AEs, then an OSGi bundle
for an AE would simply depend on one or more OSGi type system bundles.

Eddie

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 1:04 PM, Richard Eckart de Castilho wrote:
> I don't see how type system issues are related to class loading and component discovery issues.

They are related, because all components must work with the specified type
system. When you use JCas you easily get class loading issues, because UIMA
requires that JCas classes are loaded only once per analysis pipeline; you
therefore need to take care of this when doing OSGi (or using any other
component framework).

You can either work out a solution to make this work in OSGi, or decide to
redesign the type system in UIMA so that it works better with different
components. If you do the latter, you don't need to create an OSGi
integration anymore, or it becomes much simpler.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de>.

> For example:
> A user wants to build an AAE out of three different AEs, made by three different
> vendors. Then he might run into issues with the type system, especially when one
> used JCas, he needs eventually to duplicate certain pieces of configuration, etc.
> 
> I believe these things should be easier, and then it is also easier to properly support OSGi.

I don't see how type system issues are related to class loading and component discovery issues. The former are related to the lack of standards on the level of annotations/metadata, the latter to the component framework architecture.

Being spoiled by uimaFIT, I don't have much of a problem moving duplicate stuff into a factory function or variable and injecting it where I need it. But there is currently no way to plug in class loaders, change factories or discover component implementations. uimaFIT at least provides a way to discover type systems (very convenient!). Discovering components would be nice too though -- preferably without having to deal with UIMA XML descriptors. 

Cheers,

Richard

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
eckartde@tk.informatik.tu-darmstadt.de 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
-------------------------------------------------------------------

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 12:15 PM, Richard Eckart de Castilho wrote:
> I don't think UIMA should mandate the use of OSGi.
No, there shouldn't be a need for that, but we currently have certain
limitations which make it difficult to reuse UIMA components. And when
things are translated to OSGi, these parts don't fit in nicely.

For example:
A user wants to build an AAE out of three different AEs, made by three
different vendors. He might then run into issues with the type system,
especially when JCas is used, and he eventually needs to duplicate certain
pieces of configuration, etc.

I believe these things should be easier, and then it would also be easier
to properly support OSGi.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de>.

>> This is quite "custom" and I don't like it very much to pass ClassLoaders
>> between bundles however it could be good to implement a
>> OSGiUIMAFrameworkImpl class which hides such details to the developer who
>> wants to run UIMA inside OSGi environments.
> 
> As far as my understanding goes, we can either add some "fixes" to
> make UIMA as it is now work in an OSGi environment or we can
> redesign a few things to migrate to OSGi.
> The later option might only be possible with breaking backward 
> compatibility.

I don't think UIMA should mandate the use of OSGi. It would be nice, though, if the UIMAFramework class were opened up for customization. While UIMA is implemented in a highly modular fashion, the current implementation of the UIMAFramework defies any attempt at actually customizing it, e.g. providing custom factories. For the uimafit-spring experiment, I had to do some ugly reflection hacks to replace the default component factories with instances that run UIMA component instances through dependency injection. I could imagine that OSGi support could also come as a custom option that can be added to the core UIMA framework if desired.

-- Richard


Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 11:53 AM, Tommaso Teofili wrote:
> This is quite "custom" and I don't like it very much to pass ClassLoaders
> between bundles however it could be good to implement a
> OSGiUIMAFrameworkImpl class which hides such details to the developer who
> wants to run UIMA inside OSGi environments.

As far as my understanding goes, we can either add some "fixes" to
make UIMA as it is now work in an OSGi environment, or we can
redesign a few things to migrate to OSGi.
The latter option might only be possible by breaking backward
compatibility.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

2011/7/20 Richard Eckart de Castilho <eckartde@tk.informatik.tu-darmstadt.de>

> I always thought it would be nice if UIMA would define an extension point
> for such things.
>
> -- Richard
>

I think one of the problems with UIMA and OSGi derives from the different
usage of ClassLoaders in OSGi, which makes it hard to let an annotator whose
descriptor is defined in one bundle be used from another bundle (i.e. the
one containing the UIMAFrameworkImpl).
This is the reason why in Clerezza I created a factory which makes it
possible to pass the ClassLoader as an additional parameter for
instantiating an AE [1].
This is quite "custom", and I don't much like passing ClassLoaders
between bundles. However, it could be good to implement an
OSGiUIMAFrameworkImpl class which hides such details from the developer who
wants to run UIMA inside OSGi environments.

2011/7/20 Jörn Kottmann <ko...@gmail.com>

>
> Maybe something we should do for UIMA 3.0, because we might want to
> break backward compatibility for it.
>
>
>
+1
Tommaso


[1] :
http://svn.apache.org/repos/asf/incubator/clerezza/trunk/parent/uima/uima.utils/src/main/java/org/apache/clerezza/uima/utils/UIMAExecutorFactory.java

Re: OSGi versions of Add-on Annotators

Posted by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de>.

I always thought it would be nice if UIMA would define an extension point for such things.

-- Richard

Am 20.07.2011 um 11:16 schrieb Jörn Kottmann:

> On 7/20/11 11:04 AM, Tommaso Teofili wrote:
>> I'd be happy to help set up a testing environment (this makes me think it'd
>> be useful to setup some integration tests) .
> 
> Quite some time back I tested using UIMA in eclipse and noticed that
> the loading of AEs via descriptors doesn't really work,
> because the way UIMA loads AEs is not the OSGi way of doing it.
> 
> In the eclipse OSGi runtime is a workaround which adds other bundles
> to the classpath, that can be defined via Eclipse-RegisterBuddy.
> 
> Do we use this mechanism or is there now something
> similar in the OSGi standard?
> 
> Jörn


Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 11:04 AM, Tommaso Teofili wrote:
> I'd be happy to help set up a testing environment (this makes me think it'd
> be useful to setup some integration tests) .

Quite some time back I tested using UIMA in Eclipse and noticed that
loading AEs via descriptors doesn't really work,
because the way UIMA loads AEs is not the OSGi way of doing it.

The Eclipse OSGi runtime has a workaround which adds other bundles
to the classpath; these can be declared via Eclipse-RegisterBuddy.
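
For readers unfamiliar with the buddy mechanism, the two sides might be
declared roughly as in the two fragments below, written here as bundle plugin
instructions. The symbolic name org.apache.uima.runtime is only a placeholder
assumption for the bundle that holds the UIMA framework classes.

```xml
<!-- Fragment 1: in the bundle that needs to see other bundles' classes
     (here assumed to be the UIMA runtime bundle). -->
<instructions>
  <Eclipse-BuddyPolicy>registered</Eclipse-BuddyPolicy>
</instructions>

<!-- Fragment 2: in each annotator bundle, registering itself as a buddy
     of the bundle above; the symbolic name is a placeholder. -->
<instructions>
  <Eclipse-RegisterBuddy>org.apache.uima.runtime</Eclipse-RegisterBuddy>
</instructions>
```

Both headers are Equinox-specific, which is why this approach is not portable
to other OSGi containers.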

Do we use this mechanism or is there now something
similar in the OSGi standard?

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

One more clarification, below

On 7/20/2011 7:55 AM, Marshall Schor wrote:
> On 7/20/2011 5:04 AM, Tommaso Teofili wrote:
>> Hello all,
>> I think a bit of history is helpful here to track what has been done so far
>> and what/where we can improve on OSGi components.
> Good idea!
>
>> the first thing I used to drive me through the setup of UIMA Addons OSGi
>> bundles was the dependency:tree on the non-OSGi version of each component
>> which I used to define the initial set of dependencies.
>> At first I used the maven-bundle-plugin to produce the OSGi jars (bundles)
>> and I deployed them inside an Apache Felix instance.
> I think I'm missing the big picture, here.  UIMA is usually used as a container
> in which Annotators are loaded.
> What does it mean to "deploy" inside of an Apache Felix instance?  For instance,
> can you put together a several annotators and a flow controller (a UIMA
> Aggregate) and "run" it inside Apache Felix?  I don't see how that is done, when
> every Annotator is its own OSGi bundle.  Maybe it's because it's early morning
> here, and I'm not quite awake, and I've missed the basics ...
>
> This may be all wrong-headed - but I wonder if the basic use case is to do
> something like the following: Take a bunch of annotators (and maybe flow
> controllers) together with a top-level aggregate XML specifying parameter
> overrides, etc., and "wrap" them so they become a single OSGi bundle, that can
> then be embedded in an OSGi container?  If so, then perhaps instead of having a
> "set" of individually OSGi-i-fied annotators, like we do now, maybe we should
> have instead a tool that does this for a set of annotators, etc.
>
>
>> The bundle plugin had one issue with the OSGi version as 2.3.1-SNAPSHOT is
>> not a valid OSGi version while it has to be converted to 2.3.1.SNAPSHOT.
> We have this problem elsewhere, and use the ${parsedVersion.osgiVersion} in the
> poms, to get the version with the period instead of the dash.
>
>> This problem was resolved setting the artifact packaging to jar thus
>> creating the bundle using also other plugins (i.e.:
>> maven-dependency-plugin).

The Felix bundle plugin can operate in 2 modes:
1) using the "bundle" artifact type
2) using the "jar" artifact type

When using the "bundle" artifact type, the plugin builds the Jar
automatically.
When using the "jar" artifact type, the bundle plugin only creates the manifest;
the user has to create the Jar using the Maven Jar plugin, after copying
anything they need into the place the Jar zips up.

We use the 2nd approach (the "jar" artifact type), because then we can take
advantage of UIMA's standard way of including the right License/Notice files
in the Jars.
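
A minimal sketch of the 2nd mode, assuming the usual arrangement where the
bundle plugin's manifest goal writes the manifest into the classes directory
and the Jar plugin is told to pick it up. The execution id is arbitrary, and
this is not copied from the add-ons poms.

```xml
<packaging>jar</packaging>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.felix</groupId>
      <artifactId>maven-bundle-plugin</artifactId>
      <executions>
        <execution>
          <id>bundle-manifest</id>
          <!-- Generate only the OSGi manifest, not the Jar itself. -->
          <phase>process-classes</phase>
          <goals>
            <goal>manifest</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <!-- Use the generated manifest when the Jar is assembled. -->
          <manifestFile>${project.build.outputDirectory}/META-INF/MANIFEST.MF</manifestFile>
        </archive>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Anything else that has to end up in the Jar (embedded Jars, License/Notice
files) would be copied into the classes directory by other build steps before
the Jar plugin runs.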

-Marshall
>> On the other hand when deploying those artifacts on Apache Felix I couldn't
>> find "a lot" of dependencies not listed by the mvn dependency:analyze , for
>> some of them I could avoid adding an additional dependency by setting an
>> "optional import" (i.e. see some sun.* import set to optional) to the bundle
>> configuration but this was not possible for every "unexpected" dependency.
>> The above issues drove addons-osgi-runtime configurations as they are now.
> The ConfigurableFeatureExtractor non-OSGi pom doesn't have an "ant" dependency,
> while the OSGi version of that same annotator does.  Any idea why this was added
> to the OSGi one?
>
> The full list of jars pulled into that annotator's OSGi packaging is quite large:
> ant-1.7.1.jar
> ant-launcher-1.7.1.jar
> common-2.3.0-v200706262000.jar
> common-3.3.0-v20070426.jar
> commons-beanutils-1.8.2.jar
> commons-collections-3.2.1.jar
> commons-jxpath-1.3.jar
> commons-logging-1.1.1.jar
> ConfigurableFeatureExtractor-2.3.2-SNAPSHOT.jar
> contenttype-3.2.100-v20070319.jar
> ecore-2.3.0-v200706262000.jar
> geronimo-stax-api_1.0_spec-1.0.1.jar
> jdom-1.0.jar
> jobs-3.3.0-v20070423.jar
> jVinci-2.3.1.jar
> org.eclipse.core.jobs-3.5.0.v20100515.jar
> org.eclipse.equinox.common-3.6.0.v20100503.jar
> org.eclipse.equinox.preferences-3.2.1.jar
> org.eclipse.equinox.registry-3.5.0.v20100503.jar
> org.eclipse.osgi-3.2.1.jar
> osgi-3.3.0-v20070530.jar
> preferences-3.2.100-v20070522.jar
> registry-3.3.0-v20070522.jar
> runtime-3.2.0-v20060603.jar
> uimaj-adapter-vinci-2.3.1.jar
> uimaj-core-2.3.1.jar
> uimaj-cpe-2.3.1.jar
> uimaj-document-annotation-2.3.1.jar
> uimaj-ep-runtime-2.3.1.jar
> uimaj-tools-2.3.1.jar
> xercesImpl-2.8.1.jar
> xmi-2.3.0-v200706262000.jar
> xml-apis-1.3.03.jar
> xml-resolver-1.2.jar
> xmlbeans-2.4.0.jar
>
> -Marshall
>
>> However I do agree with one thing, we should leave out all the dependencies
>> which are OSGi compliant or have an OSGi package version.
>> If we could revert back to use only the maven-bundle-plugin may help us keep
>> things cleaner.
>> I'd be happy to help set up a testing environment (this makes me think it'd
>> be useful to setup some integration tests) .
>> Tommaso
>>
>> 2011/7/20 Marshall Schor <ms...@schor.com>
>>
>>> Moving this from the RC4 release discussion to a new thread ...
>>>
>>> I've now tried the following:
>>>
>>> Change the build instructions so
>>>
>>> a) the dependency goal doesn't unpack the jars
>>> b) the OSGi build instruction doesn't say to "inline" the jars.
>>>
>>> The result - it builds, no error messages, and has a result which includes
>>> lots
>>> of Jars at the top level, plus a META-INF directory, and nothing else.
>>>
>>> I tried this with the ConfigurableFeatureExtractor-osgi project.  The Jars
>>> that
>>> get included are all the ones shown by giving the command:  mvn
>>> dependency:tree
>>> in the project directory.  There are many jars, including all of Ant, the
>>> Ant-launcher Jar, a bunch of eclipse jars including things like
>>> org.eclipse.core.jobs, the junit jar, and more.  Many of these are
>>> unnecessary,
>>> I think, and including them in the distribution causes us to work to verify
>>> the
>>> appropriate LICENSEs/NOTICEs are created.
>>>
>>> It seems very unlikely that the CFE needs all these to run, normally.
>>> Running
>>> the non-OSGi CFE maven dependency:tree does not show the ant dependency -
>>> I'll
>>> have to track down why those are different.  Looking at the 2.3.0 release,
>>> the
>>> CFE non-OSGi had many fewer included Jars.
>>>
>>> I think we can fix the build instructions to excluded the unneeded Jars.
>>>
>>> I don't have any setup for testing the OSGi packaged artifacts - if anyone
>>> else
>>> does, let's figure out how to test these - either by collaborating or by
>>> helping
>>> me learn how to setup something locally to test this packaging result.
>>>
>>> -Marshall
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 7/19/2011 3:44 PM, Marshall Schor wrote:
>>>> Thanks, Richard.
>>>>
>>>> I think you are right - some of the dependencies (for example, the
>>>> AlchemyApiAnnotator depends on Apache commons-digester, etc.) don't have
>>> OSGi
>>>> packagings.
>>>>
>>>> The build strategy for the OSGi modules currently gets all the
>>> dependencies and
>>>> unpacks them into .../target/classes directory, where a later step "jars"
>>> them up.
>>>> This approach overlays files being unzipped, with later versions.  Some
>>> examples
>>>> where this might be an issue:
>>>> There is at the top level a license directory, containing one "LICENSE"
>>> file.
>>>> There is at the top level a "plugin.xml" file.
>>>> There is at the top level a META-INF dir, with LICENSE and NOTICE files
>>> among
>>>> other things.
>>>>
>>>> Perhaps it would be better to package the dependencies that are not OSGi
>>> in a
>>>> way that doesn't need to unpack, and then potentially overlay, files.
>>>>
>>>> It seems that OSGi and the bundle plugin support this, via the
>>> Embed-Dependency
>>>> instruction.  Is there a reason we're not using that, instead of the
>>> "unpacking"
>>>> approach?
>>>>
>>>> -Marshall
>>>>
>>>> On 7/19/2011 11:17 AM, Richard Eckart de Castilho wrote:
>>>>> I wanted to package the DKPro Core UIMA modules as OSGi bundle. These
>>> have lots of dependencies on various JARs that are not available as OSGi
>>> bundles and sometimes not even available in public Maven repositories - this
>>> is why we set up a public repository of our own for the moment. It may be
>>> less an issue for the UIMA sandbox, as the individual components may not
>>> depend on third-party libraries.
>>>>> Looking the Add Ons repository, I would suspect that Tika, Solr, Rhino,
>>> BeanShell and maybe some of the Apache Commons JARs may not be OSGi bundles.
>>>>> I guess you aim for a mixed setup where some dependencies (namely UIMA)
>>> are imported via package-imports and others (namely the above) are packaged
>>> inside the bundles?
>>>>> Cheers,
>>>>>
>>>>> Richard
>>>>>
>>>>> Am 19.07.2011 um 17:08 schrieb Marshall Schor:
>>>>>
>>>>>> I suspect that the Jars are now available as OSGi bundles; do you know
>>> of
>>>>>> specific ones that are not?
>>>>>>
>>>>>> Thanks. -Marshall
>>>>>>
>>>>>> On 7/19/2011 10:24 AM, Richard Eckart de Castilho wrote:
>>>>>>> Hi Marshall,
>>>>>>>
>>>>>>> I am very interested in this. Some time back I mostly gave up on
>>> packaging UIMA components as OSGi bundles because of this. If you do not
>>> bundle all jars (*jikes*) and use package imports instead, the questions is:
>>> where do the dependencies come from? Who prepares the bundles and who
>>> installs them? Many JARs are not available as OSGi bundles.
>>>>>>> -- Richard
>>>>>>>
>>>>>>> Am 19.07.2011 um 16:13 schrieb Marshall Schor:
>>>>>>>
>>>>>>>> I'll take a look at the OSGi build.
>>>>>>>>
>>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> On 7/17/2011 12:16 PM, Marshall Schor wrote:
>>>>>>>>> Since this is (I think) the first time we're releasing the OSGi
>>> packaging of the
>>>>>>>>> annotators, I think some work on their license/notice files might be
>>> needed,
>>>>>>>>> because:
>>>>>>>>>
>>>>>>>>> - there are duplicate License files - one at the top level, and one
>>> in the
>>>>>>>>> META_INF directory
>>>>>>>>> - both of these are the plain vanilla license files.  For projects
>>> which are
>>>>>>>>> incorporating other libraries which are under other than the Apache
>>> v 2.0
>>>>>>>>> license, those licenses have to be included.
>>>>>>>>> - the NOTICE file is present in the META_INF directory, but is the
>>> plain one,
>>>>>>>>> rather than the project specific one.
>>>>>>>>>
>>>>>>>>> Finally, I wonder if the OSGi packaging strategy is correct - in
>>> that it
>>>>>>>>> "bundles" every dependency into the OSGi file.  This certainly makes
>>> the file
>>>>>>>>> easier to use, but if a user uses 2 OSGi components from UIMA, won't
>>> there be a
>>>>>>>>> lot of unnecessary duplication (or does OSGi notice this and avoid
>>> it somehow)?
>>>>>>>>> I'm not sure of an alternative, but I do recall that OSGi allows for
>>>>>>>>> dependencies on other packages; perhaps that could be useful?
>>>>>>>>>
>>>>>>>>> -Marshall
>>>>>>>>>
>>>>>>>>> On 7/15/2011 8:29 AM, Tommaso Teofili wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>> I've prepared the new RC (4) for UIMA Addons release.
>>>>>>>>>>
>>>>>>>>>> The following is a list of issues addressed in this release:
>>>>>>>>>>
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310570&version=12316093
>>>>>>>>>> The source zip and binary files are available here:
>>>>>>>>>> http://people.apache.org/~tommaso/uima-addons-2.3.1-rc4
>>>>>>>>>>
>>>>>>>>>> SVN Tag Checkout:
>>>>>>>>>> svn co
>>>>>>>>>>
>>> http://svn.apache.org/repos/asf/uima/addons/tags/uima-addons-2.3.1-rc4/
>>>>>>>>>> Please cast your vote for UIMA Addons 2.3.1 release:
>>>>>>>>>>
>>>>>>>>>> [ ] +1 Approve the release
>>>>>>>>>> [ ] -1 Veto the release (please provide specific comments)
>>>>>>>>>> [ ] 0   Don't care
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Tommaso
>>>>>>>>>>
>>>>> Richard Eckart de Castilho
>>>>>

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.


On 7/20/2011 12:23 PM, Richard Eckart de Castilho wrote:
> Am 20.07.2011 um 17:18 schrieb Marshall Schor:
>
>> This suggests having a tool to make this "easy"; but also suggests that having
>> individual addon annotators packaged up as a "complete UIMA pipeline" may not be
>> very interesting to anyone.
> I think it would be nice if UIMA provided OSGi bundles as a standard alternative to the proprietary PEAR format - and every annotator or family of related annotators in one bundle, yes. 

But only the annotator, right?  Not also the UIMA framework itself, right?
-Marshall

>
> I would not bundle up models with the annotators, but it would be nice if models could be resolvable/installable as separate bundles and/or maven artifacts. For example. in DKPro we wrapped the Stanford Parser and TreeTagger as an UIMA component without the models and made provisions for deploying the models to a Maven repository from where they can be added as dependencies. That's very convenient for our internal users. For licensing reasons, we currently do not publish these artifacts on a public Maven repository.
>
> And as mentioned before, as part of supporting OSGi, it would be nice if UIMA defined extensions points for publishing type-systems and components. Such a thing would eventually allow to install components as Eclipse plugins and have end users click pipelines together in a potential pipeline builder. 
>
> I would not be interested very much in monster-bundles that contain a whole pipeline that is to be used as a service.
>
> -- Richard
>

Re: OSGi versions of Add-on Annotators

Posted by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de>.

Am 20.07.2011 um 20:38 schrieb Marshall Schor:

> I can think of 2 scenarios (maybe because I have a limited imagination :-) ):
> 
> 1) UIMA is augmented to incorporate within itself whatever is needed to run OSGi
> - packaged annotators (and perhaps, separately OSGi-packaged type systems and
> shared resources).  In this case, UIMA itself becomes yet another OSGi "container".

I think UIMA should not embed OSGi, which again would mean that it mandates the use of OSGi. But UIMA should be usable in OSGi scenarios. EclipseLink, for example, does not use OSGi as a plugin system, but it allows plugging in classloaders, which makes it more convenient to use within an OSGi runtime environment.
If you say "OSGi container", I think of something like Equinox or Felix. I am certain we do not want to implement another OSGi container.
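
A minimal sketch of what "plugging in a classloader" could look like, assuming a framework that takes the loader from its caller instead of using its own. The class PluggableLoaderSketch and its newInstance method are made up for illustration; they are not part of UIMA or EclipseLink.

```java
import java.util.List;

/** Hypothetical helper, not a UIMA or EclipseLink class. */
final class PluggableLoaderSketch {

    private final ClassLoader loader;

    /** The caller decides which class loader is used to find component classes. */
    PluggableLoaderSketch(ClassLoader loader) {
        this.loader = loader;
    }

    /** Loads className through the supplied loader and calls its no-arg constructor. */
    <T> T newInstance(String className, Class<T> expectedType) {
        try {
            Class<?> raw = Class.forName(className, true, loader);
            return expectedType.cast(raw.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // In an OSGi runtime the loader would belong to the bundle that owns the class;
        // here the system class loader stands in for it.
        PluggableLoaderSketch sketch =
                new PluggableLoaderSketch(ClassLoader.getSystemClassLoader());
        List<?> list = sketch.newInstance("java.util.ArrayList", List.class);
        System.out.println(list.isEmpty());
    }
}
```

The point of the design is that the framework code itself never imports anything from OSGi; only the code that chooses the loader needs to know about bundles.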

> 2) The UIMA Framework is made into one or more OSGi components, in such a manner
> as to enable some very simple "driver" application to take the UIMA framework
> and one or more OSGi-packaged parts, as above, and run them in some 3rd party
> OSGi container (like Felix).

That seems to me the reasonable scenario. However, it should be done in such a manner that the UIMA framework is still usable outside an OSGi container. That means the JARs can contain bundle metadata, but the framework itself should not have dependencies on any OSGi library. It would make sense to provide an OSGi support package for UIMA which then contains OSGi-specific code, such as an extension point registry, factories, etc. That should be a layer between the UIMA core framework and an OSGi-based application such as the CAS Editor, Eclipse etc.

> Here's a variation on the above:  allow somehow combining other packaging of
> annotators (e.g., "normal", and "PEAR") together with OSGi - packaged ones.

I don't like PEARs very much because they need to be installed and don't "just run" from the classpath. I can't just add a PEAR as a dependency to my Maven project and use the component. With OSGi-packaged components, just dropping them into the OSGi container should "just work" - no need to install them. Maybe it would be possible to add OSGi metadata to PEARs so they could operate in both environments - the OSGi metadata might just be able to provide the OSGi container with the right information to do that. Unfortunately, a PEAR would (correct me if I am wrong) never be usable when just added to the classpath via a Maven dependency.

> Are you thinking of all these variations, or does only one of them make sense?

I think only the second variation makes sense.

-- Richard 

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
eckartde@tk.informatik.tu-darmstadt.de 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
-------------------------------------------------------------------

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.


On 7/20/2011 12:23 PM, Richard Eckart de Castilho wrote:
> Am 20.07.2011 um 17:18 schrieb Marshall Schor:
>
>> This suggests having a tool to make this "easy"; but also suggests that having
>> individual addon annotators packaged up as a "complete UIMA pipeline" may not be
>> very interesting to anyone.
> I think it would be nice if UIMA provided OSGi bundles as a standard alternative to the proprietary PEAR format - and every annotator or family of related annotators in one bundle, yes. 

I can think of 2 scenarios (maybe because I have a limited imagination :-) ):

1) UIMA is augmented to incorporate within itself whatever is needed to run OSGi
- packaged annotators (and perhaps, separately OSGi-packaged type systems and
shared resources).  In this case, UIMA itself becomes yet another OSGi "container".

2) The UIMA Framework is made into one or more OSGi components, in such a manner
as to enable some very simple "driver" application to take the UIMA framework
and one or more OSGi-packaged parts, as above, and run them in some 3rd party
OSGi container (like Felix).

Here's a variation on the above:  allow somehow combining other packaging of
annotators (e.g., "normal", and "PEAR") together with OSGi - packaged ones.

Are you thinking of all these variations, or does only one of them make sense?

-Marshall
>
> I would not bundle up models with the annotators, but it would be nice if models could be resolvable/installable as separate bundles and/or maven artifacts. For example. in DKPro we wrapped the Stanford Parser and TreeTagger as an UIMA component without the models and made provisions for deploying the models to a Maven repository from where they can be added as dependencies. That's very convenient for our internal users. For licensing reasons, we currently do not publish these artifacts on a public Maven repository.
>
> And as mentioned before, as part of supporting OSGi, it would be nice if UIMA defined extensions points for publishing type-systems and components. Such a thing would eventually allow to install components as Eclipse plugins and have end users click pipelines together in a potential pipeline builder. 
>
> I would not be interested very much in monster-bundles that contain a whole pipeline that is to be used as a service.
>
> -- Richard
>

Re: OSGi versions of Add-on Annotators

Posted by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de>.

Am 20.07.2011 um 17:18 schrieb Marshall Schor:

> This suggests having a tool to make this "easy"; but also suggests that having
> individual addon annotators packaged up as a "complete UIMA pipeline" may not be
> very interesting to anyone.

I think it would be nice if UIMA provided OSGi bundles as a standard alternative to the proprietary PEAR format - and every annotator or family of related annotators in one bundle, yes. 

I would not bundle up models with the annotators, but it would be nice if models could be resolvable/installable as separate bundles and/or Maven artifacts. For example, in DKPro we wrapped the Stanford Parser and TreeTagger as UIMA components without the models and made provisions for deploying the models to a Maven repository from where they can be added as dependencies. That's very convenient for our internal users. For licensing reasons, we currently do not publish these artifacts on a public Maven repository.

And as mentioned before, as part of supporting OSGi, it would be nice if UIMA defined extension points for publishing type systems and components. Such a thing would eventually allow installing components as Eclipse plugins and having end users click pipelines together in a potential pipeline builder.

I would not be interested very much in monster-bundles that contain a whole pipeline that is to be used as a service.

-- Richard


Re: OSGi versions of Add-on Annotators

Posted by florent andré <fl...@4sengines.com>.

Hi,

I am currently developing an Apache Stanbol engine based on UIMA.

With the help of Clerezza uima.utils I have built the following:

- Bundle A:
* contains the UIMA "core" bundles
* does the transformation from UIMA to Stanbol
* provides a service interface for UIMA aggregate AEs

- Bundles B, C, ...
* implement the service
These bundles contain only:
- the aggregate AE XML definition, the annotators' XML definitions, and the
required resources
- OSGi dependencies on the needed annotators.

This kind of structure is really cool IMO, as we just have to define AE
pipelines and bundle them.
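
The bundle layout above could be sketched roughly as follows. This is only an illustration under assumptions: the interface AnalysisService, the class WhitespaceTokenService and the descriptor name are invented here, the real Stanbol/Clerezza interfaces are not shown, and the OSGi service lookup is replaced by a plain constructor call to keep the sketch free of OSGi imports.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical service contract that "Bundle A" would export. */
interface AnalysisService {
    /** Name of the aggregate AE descriptor this service wraps. */
    String descriptorName();

    /** Runs the pipeline on the text and returns what it found, as plain strings. */
    List<String> analyze(String text);
}

/** Stand-in for a "Bundle B": it names a descriptor and supplies the pipeline behaviour. */
final class WhitespaceTokenService implements AnalysisService {
    @Override
    public String descriptorName() {
        return "WhitespaceTokenizerAggregate.xml"; // hypothetical descriptor name
    }

    @Override
    public List<String> analyze(String text) {
        // A trivial whitespace split stands in for running a real aggregate AE.
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}

final class ServiceLayoutSketch {
    public static void main(String[] args) {
        // In OSGi the implementation would be obtained from the service registry.
        AnalysisService service = new WhitespaceTokenService();
        System.out.println(service.descriptorName() + " -> " + service.analyze("OSGi UIMA demo"));
    }
}
```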

As I'm new to UIMA and not familiar with your specific terms, I'm not sure I
understand your point well, Marshall.
I use the 2.3.1-SNAPSHOT version of the OSGi addons and it works.

The "bad point" I see with this OSGi version is that I have had really big
difficulties setting up the import/export packages correctly, which leads to
an ugly config...
But I'm really not sure that the OSGi modules are the cause.
The real root cause could be my lack of OSGi experience.

Note that I "embed" the OSGi versions of UIMA and the modules inside my bundle,
because each time I tried to free them into Felix I had problems... (the
root cause could still be the same :) )

I just wanted to share this use case with you.
OSGi UIMA will rock! :)

++

On 07/20/2011 05:18 PM, Marshall Schor wrote:
>
>
> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>> What does it mean to "deploy" inside of an Apache Felix instance?
>>
>> I did that once, and simply embedded everything in one bundle, even UIMA
>> itself. This way I could use UIMA plus some AEs to do analysis as a service
>> for other OSGi bundles inside Felix.
>
> This suggests having a tool to make this "easy"; but also suggests that having
> individual addon annotators packaged up as a "complete UIMA pipeline" may not be
> very interesting to anyone.
>
> Is this right?  If so, perhaps we should not release this osgi versions in the
> addons at this time.  That also would reduce the size of the distribution
> considerably (about 100 MB of 150 MB is for the OSGi versions).  In computing
> this, I also noticed that the tagger osgi packaging was missing the 19.5 mb of
> statistical models...
>
> -Marshall
>>
>> Jörn
>>

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 6:17 PM, Tommaso Teofili wrote:
> I still think having individual OSGi versions of each annotator would be
> better.

How can UIMA then load the AE classes?

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.


On 7/25/2011 6:40 AM, Jörn Kottmann wrote:
> On 7/22/11 4:33 PM, Marshall Schor wrote:
>> A couple of thoughts:
>>
>> In real world instances that I've seen, there are complexities that make simple
>> renaming of types / features not sufficient to make many completely
>> independently developed annotators inter-operate.   This can be for many
>> reasons, including things like a confidence expressed in one system as a "float"
>> between 0 and 1, and in another as an "integer" between 100 and 0 (yes, also a
>> reversed scale), or in a third as a string set "low", "medium", "high", etc.
>>
>> So to combine independently developed annotators often takes writing some little
>> "glue" annotators inbetween, that do quite arbitrary things.  I've heard stories
>> about people using the BSFannotator to write little scripts to do this.
> Yes, you are right here, integration can be a bit more than just mapping
> something. Could writing glue code be easier when we offer special support for
> it?

Maybe, but I don't have a good grasp of the frequent use-cases here.   In my
limited experience, writing the glue code is very easy for the "easy" cases, and
for the other cases, I'm not sure what special support we could come up with to
make that easier.


>
> Currently the merged type system can get very complex and is somehow needed
> if the CASes are serialized and later maybe opened in the Cas Editor or used
> in some other way. In my opinion that makes it difficult to handle CASes.
>
> If we have one user type system and then one per annotator it would be easier
> to understand it,
> and only types from the user type system would occur in the CAS. In other words
> the annotator type system stays encapsulated and is not carried on to later uses
> of the CAS or cause compatibility issues at a later point in time. In my
> observation that
> is the way many try to use UIMA.

I'm not sure what you mean by a type system encapsulated with an annotator.  It
seems to me that if you have a primitive annotator, and have types defined for
it, which are *not* used by other annotators, then you are putting things into
the CAS which are not used by others - so why put them in the CAS?  Other Java
(or C++, etc.) techniques are probably better for local storage while an
annotator is running on a CAS.   But perhaps I misunderstand what you're getting at?

Perhaps a more concrete example would help me comprehend :-)

-Marshall
>
> Using JCas could then be also more attractive to AE implementers if it could
> be used by some type system mappings, instead of always requiring glue code,
> even for simple cases.
>
>> Another thought: if it turned out there was a substantial use case for very
>> simple renaming of types/ features, that could be very efficiently supported by
>> the framework if we added support for aliases type specifications- this would be
>> a special kind of type definition that "mapped" to another one.  However, as
>> I've suggested above, I don't think that this kind of thing would cover enough
>> of the real use cases to be worth the additional complexity.
>>
>
> I think for quite some AEs that could be useful. There are many simple AEs
> which could
> be integrated by this approach.
>
> I also do not see the advantage of the current type system merging approach.
>
> Jörn
>
>

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/22/11 4:33 PM, Marshall Schor wrote:
> A couple of thoughts:
>
> In real world instances that I've seen, there are complexities that make simple
> renaming of types / features not sufficient to make many completely
> independently developed annotators inter-operate.   This can be for many
> reasons, including things like a confidence expressed in one system as a "float"
> between 0 and 1, and in another as an "integer" between 100 and 0 (yes, also a
> reversed scale), or in a third as a string set "low", "medium", "high", etc.
>
> So to combine independently developed annotators often takes writing some little
> "glue" annotators inbetween, that do quite arbitrary things.  I've heard stories
> about people using the BSFannotator to write little scripts to do this.
Yes, you are right here, integration can be a bit more than just mapping
something. Could writing glue code be easier when we offer special 
support for it?

Currently the merged type system can get very complex and is somehow needed
if the CASes are serialized and later maybe opened in the Cas Editor or used
in some other way. In my opinion that makes it difficult to handle CASes.

If we have one user type system and then one per annotator, it would be
easier to understand, and only types from the user type system would occur
in the CAS. In other words, the annotator type system stays encapsulated: it
is not carried on to later uses of the CAS and does not cause compatibility
issues at a later point in time. In my observation that is the way many try
to use UIMA.

Using JCas could then be also more attractive to AE implementers if it could
be used by some type system mappings, instead of always requiring glue code,
even for simple cases.

> Another thought: if it turned out there was a substantial use case for very
> simple renaming of types/ features, that could be very efficiently supported by
> the framework if we added support for aliases type specifications- this would be
> a special kind of type definition that "mapped" to another one.  However, as
> I've suggested above, I don't think that this kind of thing would cover enough
> of the real use cases to be worth the additional complexity.
>

I think that could be useful for quite a few AEs. There are many simple AEs
which could be integrated by this approach.

I also do not see the advantage of the current type system merging approach.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

On 7/22/2011 4:46 AM, Jörn Kottmann wrote:
> Hi all,
>
> what do you think about these per AE type system mappings?
> Is it something which would improve the current situation ?
> Any concerns?
>

A couple of thoughts: 

In real world instances that I've seen, there are complexities that make simple
renaming of types / features not sufficient to make many completely
independently developed annotators inter-operate.   This can be for many
reasons, including things like a confidence expressed in one system as a "float"
between 0 and 1, and in another as an "integer" between 100 and 0 (yes, also a
reversed scale), or in a third as a string set "low", "medium", "high", etc.

So to combine independently developed annotators often takes writing some little
"glue" annotators in between, that do quite arbitrary things.  I've heard stories
about people using the BSFannotator to write little scripts to do this.
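
A rough sketch of the kind of glue logic meant here, using the three confidence encodings from the example. Everything in it is an assumption made for illustration: the class and method names are invented, the integer scale is taken to mean "0 is most confident, 100 is least", and the numbers chosen for "low", "medium" and "high" are arbitrary.

```java
import java.util.Locale;

/** Hypothetical glue code: brings three confidence encodings onto one [0, 1] scale. */
final class ConfidenceGlueSketch {

    /** Encoding 1: already a float in [0, 1], higher means more confident. */
    static double fromUnitFloat(float value) {
        return clamp(value);
    }

    /** Encoding 2 (assumed meaning): integer where 0 is most and 100 is least confident. */
    static double fromReversedPercent(int value) {
        return clamp((100 - value) / 100.0);
    }

    /** Encoding 3: one of "low", "medium", "high"; the numeric values are arbitrary choices. */
    static double fromLabel(String label) {
        switch (label.toLowerCase(Locale.ROOT)) {
            case "low":
                return 0.25;
            case "medium":
                return 0.5;
            case "high":
                return 0.75;
            default:
                throw new IllegalArgumentException("Unknown confidence label: " + label);
        }
    }

    /** Keeps out-of-range inputs inside [0, 1]. */
    private static double clamp(double value) {
        return Math.max(0.0, Math.min(1.0, value));
    }

    public static void main(String[] args) {
        System.out.println(fromUnitFloat(0.5f));
        System.out.println(fromReversedPercent(25));
        System.out.println(fromLabel("high"));
    }
}
```

None of these conversions is a pure rename, which is why a name-only mapping would not cover this case.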

Another thought: if it turned out there was a substantial use case for very
simple renaming of types / features, that could be very efficiently supported by
the framework if we added support for alias type specifications - this would be
a special kind of type definition that "mapped" to another one.  However, as
I've suggested above, I don't think that this kind of thing would cover enough
of the real use cases to be worth the additional complexity.
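
To make the alias idea a little more concrete, here is a minimal sketch of name resolution through an alias table. It is not an existing UIMA feature; the class TypeAliasSketch and the type names used in it are purely illustrative.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical alias table: an alias is a type name that stands for another type. */
final class TypeAliasSketch {

    private final Map<String, String> aliases = new HashMap<>();

    /** Declares aliasName as another name for targetName. */
    void alias(String aliasName, String targetName) {
        aliases.put(aliasName, targetName);
    }

    /** Follows alias chains to the final type name; names without an alias resolve to themselves. */
    String resolve(String typeName) {
        Set<String> seen = new HashSet<>();
        String current = typeName;
        while (aliases.containsKey(current)) {
            if (!seen.add(current)) {
                throw new IllegalStateException("Alias cycle at " + current);
            }
            current = aliases.get(current);
        }
        return current;
    }

    public static void main(String[] args) {
        TypeAliasSketch table = new TypeAliasSketch();
        table.alias("c.d.Token", "a.b.Token"); // illustrative type names
        System.out.println(table.resolve("c.d.Token"));
    }
}
```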

-Marshall
> Jörn
>
> On 7/20/11 9:20 PM, Jörn Kottmann wrote:
>> My point is that a user defines his own type system, and a mapping which
>> translates parts
>> of this type system to the annotator type system.
>>
>> So in the sample above a user defines this type system:
>>
>> Type: com.foo.Token
>> Feature: double tokenConfidence
>> Feature: String posTag
>> Feature: double posConfidence
>>
>> The tokenizer also defined its type system:
>> Type: opennlp.Token
>> Feature: float confidence
>>
>> And one more type system for the pos tagger:
>> Type: opennlp.POSToken
>> Feature: float confidence
>> Feature: String tag
>>
>> The user defined AAE only knows the user type system and needs to
>> define "rules" which tell it how to transform opennlp.Token annotations
>> to com.foo.Token annotations, and then it needs a rule to transform
>> a com.foo.Token into an opennlp.POSToken, and back.
>>
>> Sure this is also already possible today, by writing these type mapping AEs,
>> as you would need to do for JCas. But I think having better framework support
>> for this would make it easier. 
>
>

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

Hi all,

what do you think about these per AE type system mappings?
Is it something which would improve the current situation ?
Any concerns?

Jörn

On 7/20/11 9:20 PM, Jörn Kottmann wrote:
> My point is that a user defines his own type system, and a mapping 
> which translates parts
> of this type system to the annotator type system.
>
> So in the sample above a user defines this type system:
>
> Type: com.foo.Token
> Feature: double tokenConfidence
> Feature: String posTag
> Feature: double posConfidence
>
> The tokenizer also defined its type system:
> Type: opennlp.Token
> Feature: float confidence
>
> And one more type system for the pos tagger:
> Type: opennlp.POSToken
> Feature: float confidence
> Feature: String tag
>
> The user defined AAE only knows the user type system and needs to
> define "rules" which tell it how to transform opennlp.Token annotations
> to com.foo.Token annotations, and then it needs a rule to transform
> a com.foo.Token into an opennlp.POSToken, and back.
>
> Sure this is also already possible today, by writing these type 
> mapping AEs,
> as you would need to do for JCas. But I think having better framework 
> support
> for this would make it easier.

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 9:44 PM, Richard Eckart de Castilho wrote:
> In DKPro we have a Token type and a POS from which several types inherit (V, NP, ADJ, etc.) The Token type has a feature of type POS on which we set an instance e.g. V or NP.
>
> Token t = new Token(jcas);
> t.setPos(new N(jcas));
>
> We find this quite convenient because it allows us to easily select particular type from the CAS, e.g.
>
> for (N noun : select(jcas, N.class)) {
>     ... do something with nouns ...
> }
>
> Similarly it's convenient to write rules over POS tags in TextMarker with our type system.
>
> With such type systems or with type systems using lists, arrays etc, a simple rule-based mapping won't work I think. JCas is a nice convenience API, but I don't think its more. I'm not sure if the effort of implementing a mapping rule framework is worth the outcome.

Well, in most cases you probably just need to map type names and
features. The only issue I see here is that many annotators need to access
the covered text, but that could be a new (and differently named) "virtual"
feature of an annotation.

I only used types to define what kind of information I need to
store/exchange, and did not abuse the type names themselves to encode
information. I also don't think that works well when you start generating a
lot of types. Sure, with growing type system complexity the issue of
integrating different components gets worse.

I actually kind of like our solution for the flow controllers: there we
define two standard cases, and if a user needs something more complex he can
put it in his own implementation.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de>.

Am 20.07.2011 um 21:20 schrieb Jörn Kottmann:

> So in the sample above a user defines this type system:
> 
> Type: com.foo.Token
> Feature: double tokenConfidence
> Feature: String posTag
> Feature: double posConfidence
> 
> The tokenizer also defined its type system:
> Type: opennlp.Token
> Feature: float confidence
> 
> And one more type system for the pos tagger:
> Type: opennlp.POSToken
> Feature: float confidence
> Feature: String tag
> 
> The user defined AAE only knows the user type system and needs to
> define "rules" which tell it how to transform opennlp.Token annotations
> to com.foo.Token annotations, and then it needs a rule to transform
> a com.foo.Token into an opennlp.POSToken, and back.
> 
> Sure this is also already possible today, by writing these type mapping AEs,
> as you would need to do for JCas. But I think having better framework 
> support for this would make it easier.

In DKPro we have a Token type and a POS from which several types inherit (V, NP, ADJ, etc.) The Token type has a feature of type POS on which we set an instance e.g. V or NP.

Token t = new Token(jcas);
t.setPos(new N(jcas));

We find this quite convenient because it allows us to easily select a particular type from the CAS, e.g.

for (N noun : select(jcas, N.class)) {
    // ... do something with nouns ...
}

Similarly it's convenient to write rules over POS tags in TextMarker with our type system.

With such type systems, or with type systems using lists, arrays etc., a simple rule-based mapping won't work, I think. JCas is a nice convenience API, but I don't think it's more than that. I'm not sure if the effort of implementing a mapping rule framework is worth the outcome.
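
For readers without the DKPro type system at hand, the idea of selecting by a POS subtype can be imitated in plain Java. This is only a stand-in under assumptions: Pos, N, V, PosToken and selectByPos are invented here and are not the DKPro or JCas classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins for a POS type hierarchy. */
class Pos { }
class N extends Pos { }
class V extends Pos { }

/** Hypothetical token carrying a POS instance as a feature value. */
final class PosToken {
    final String text;
    final Pos pos;

    PosToken(String text, Pos pos) {
        this.text = text;
        this.pos = pos;
    }
}

final class SelectByTypeSketch {

    /** Returns the tokens whose POS value is an instance of posType (subclasses included). */
    static List<PosToken> selectByPos(List<PosToken> tokens, Class<? extends Pos> posType) {
        List<PosToken> result = new ArrayList<>();
        for (PosToken token : tokens) {
            if (posType.isInstance(token.pos)) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<PosToken> tokens = Arrays.asList(
                new PosToken("dog", new N()),
                new PosToken("runs", new V()),
                new PosToken("cat", new N()));
        for (PosToken noun : selectByPos(tokens, N.class)) {
            System.out.println(noun.text);
        }
    }
}
```

Because the information lives in the class of the feature value rather than in a string, a mapping that only renames types and features would need one rule per subtype.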

Cheers,

Richard


Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 8:57 PM, Marshall Schor wrote:
>> >
>> >
>> >  Now both AEs are made by different vendors, and both decide to declare their
>> >  own token type. Then this type system merging doesn't work.
>> >
>> >  As far as I know the only common used work around for this issue is, not to use
>> >  JCas and to define type system mappings, where the types the AE needs are mapped
>> >  based on some configuration.
> I don't understand why JCas cannot be used -- that seems to me to be independent
> of the need for having type system mappings.  I'm thinking that one annotator
> produces a.b.Token, and a down-stream annotator needs c.d.Token with some
> different kinds of meanings assigned to features - in this case you introduce a
> custom mapping annotator, that iterates over the a.b.Token(s), and makes the
> corresponding c.d.Token feature structures.  JCas can be used for both of these,
> as desired.

Ok, that is possible, but this way you start writing code for something
the framework could do. And maintaining all kinds of type system mapping
AEs isn't really fun either.
>> >
>> >
>> >  I think a solution to this problem is, to stop doing this type system merging,
>> >  and always
>> >  map one common type system to every Annotators private type system.
> The hard part is getting a community to agree to "one common type system", I
> think.   But we have seen in large projects, that this often can be done, within
> one project.
>
> Other times, groups working collaboratively, have gotten together and defined a
> common type system for their work.

My point is that a user defines his own type system, and a mapping which
translates parts of this type system to the annotator type system.

So in the sample above a user defines this type system:

Type: com.foo.Token
Feature: double tokenConfidence
Feature: String posTag
Feature: double posConfidence

The tokenizer also defines its own type system:
Type: opennlp.Token
Feature: float confidence

And one more type system for the pos tagger:
Type: opennlp.POSToken
Feature: float confidence
Feature: String tag

The user defined AAE only knows the user type system and needs to
define "rules" which tell it how to transform opennlp.Token annotations
to com.foo.Token annotations, and then it needs a rule to transform
a com.foo.Token into an opennlp.POSToken, and back.

Sure, this is already possible today by writing these type mapping AEs,
as you would need to do for JCas. But I think having better framework
support for this would make it easier.
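
A minimal sketch of what such a declarative mapping "rule" could look like, written as plain Java with no UIMA dependency. Everything here is an assumption for illustration: the class MappingRuleSketch, the nested Rule class, and the representation of an annotation as a Map with a "type" key are hypothetical and are not part of the UIMA API. The sketch only renames features; converting range types (for example float to double) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MappingRuleSketch {

    /** Hypothetical rule: copy an annotation to another type, renaming features. */
    static final class Rule {
        final String sourceType;
        final String targetType;
        final Map<String, String> featureRenames; // source feature -> target feature

        Rule(String sourceType, String targetType, Map<String, String> featureRenames) {
            this.sourceType = sourceType;
            this.targetType = targetType;
            this.featureRenames = featureRenames;
        }

        /** Builds the target annotation; features without a rename entry are dropped. */
        Map<String, Object> apply(Map<String, Object> source) {
            if (!sourceType.equals(source.get("type"))) {
                throw new IllegalArgumentException("Rule expects type " + sourceType);
            }
            Map<String, Object> target = new LinkedHashMap<>();
            target.put("type", targetType);
            for (Map.Entry<String, String> e : featureRenames.entrySet()) {
                if (source.containsKey(e.getKey())) {
                    target.put(e.getValue(), source.get(e.getKey()));
                }
            }
            return target;
        }
    }

    public static void main(String[] args) {
        // Rule for the sample above: opennlp.Token.confidence -> com.foo.Token.tokenConfidence
        Map<String, String> renames = new LinkedHashMap<>();
        renames.put("confidence", "tokenConfidence");
        Rule tokenRule = new Rule("opennlp.Token", "com.foo.Token", renames);

        Map<String, Object> tokenizerOutput = new LinkedHashMap<>();
        tokenizerOutput.put("type", "opennlp.Token");
        tokenizerOutput.put("confidence", 0.93f);

        System.out.println(tokenRule.apply(tokenizerOutput));
    }
}
```

A second rule with the reverse direction (com.foo.Token to opennlp.POSToken and back) would follow the same pattern. A real framework feature would presumably read such rules from configuration rather than from code.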

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.


On 7/20/2011 2:35 PM, Jörn Kottmann wrote:
> On 7/20/11 7:56 PM, Marshall Schor wrote:
>> The "normal" way of having annotators together is something that UIMA supports,
>> as a pipeline.  Part of this is setting up the pipeline at initialization time
>> by taking all the type systems declared by the annotators in the pipeline, and
>> merging them into one common type system.
>>
>> A CAS is generated using this one common type system, and then sent through the
>> pipeline.
>
> Yes, this of course works, but it is often problematic, because the merged
> type system
> needs to be suitable for all components.
>
> Lets say we have a tokenizer and a pos tagger, the pos tagger needs the output
> of the tokenizer as input. Therefore in UIMA you would declare a token type,
> and both AEs must use exactly the same token type.

Sort of true :-)

The Tokenizer could define a type: com.foo.Token, a subtype of
uima.tcas.Annotation, with features "stem", "begin" and "end".

Then the POS tagger could augment the com.foo.Token type with some additional
features, such as "POS", etc.

But of course you are right that there would need to be some cooperation,
because the type name itself would have to match (including the package name)
and the features must not conflict - that is, have different range types (POS
declared an integer in one, and a String in another).  But it is OK to add
features to existing types.

So, if the POS Tagger was written after the Tokenizer, and knew what the type
was in the Tokenizer, they could intentionally "augment" it.
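
The merging rule described above can be sketched in plain Java. This is a simplified model under stated assumptions, not UIMA's actual merge code: the class TypeMergeSketch and the method mergeFeatures are hypothetical, and a type is modelled as a map from feature name to range type name. Two declarations of the same type name merge when their features are disjoint or agree on the range type, and fail otherwise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypeMergeSketch {

    /**
     * Merges two feature declarations (feature name -> range type) for the same type name.
     * Adding new features is allowed; the same feature with a different range is a conflict.
     */
    static Map<String, String> mergeFeatures(Map<String, String> a, Map<String, String> b) {
        Map<String, String> merged = new LinkedHashMap<>(a);
        for (Map.Entry<String, String> e : b.entrySet()) {
            String existing = merged.putIfAbsent(e.getKey(), e.getValue());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IllegalStateException("Feature '" + e.getKey()
                        + "' declared with range " + existing + " and " + e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // com.foo.Token as declared by the tokenizer
        Map<String, String> tokenizer = new LinkedHashMap<>();
        tokenizer.put("stem", "String");

        // com.foo.Token as augmented by the POS tagger
        Map<String, String> tagger = new LinkedHashMap<>();
        tagger.put("POS", "String");

        System.out.println(mergeFeatures(tokenizer, tagger));
    }
}
```

If a third component declared "POS" with an Integer range, mergeFeatures would throw, which corresponds to the conflict case mentioned above.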
>
>
> Now both AEs are made by different vendors, and both decide to declare their
> own token type. Then this type system merging doesn't work.
>
> As far as I know the only common used work around for this issue is, not to use
> JCas and to define type system mappings, where the types the AE needs are mapped
> based on some configuration.

I don't understand why JCas cannot be used -- that seems to me to be independent
of the need for having type system mappings.  I'm thinking that one annotator
produces a.b.Token, and a down-stream annotator needs c.d.Token with some
different kinds of meanings assigned to features - in this case you introduce a
custom mapping annotator, that iterates over the a.b.Token(s), and makes the
corresponding c.d.Token feature structures.  JCas can be used for both of these,
as desired.
>
>
> I think a solution to this problem is, to stop doing this type system merging,
> and always
> map one common type system to every Annotators private type system. 

The hard part is getting a community to agree to "one common type system", I
think.   But we have seen in large projects that this can often be done within
one project.

Other times, groups working collaboratively have gotten together and defined a
common type system for their work.

-Marshall

> This mapping
> could give the AEs more flexibility and might even be able to perform simple
> type transformations.
> That would also make using JCas attractive again.
>
> This issue is even amplified by the fact that our users like to define their
> own type system,
> and then they only work properly if the AE implementers do type system mapping
> or program
> against this type system. The later case only work if the user and implementer
> is the same
> person/organization.
>
>> -----------
>>
>> In the case where each annotator is "bundled" as a OSGi bundle, that bundle
>> contains its own private copy of all the UIMA classes, including all of the UIMA
>> SDK, and any type system, etc.  Any JCAS generated classes are also private to
>> that bundle.
>>
>> This might make sense for running one Annotator by itself.
>
> Exactly.
>>   But for running
>> multiple annotators together, as separate OSGi components, I don't see how it
>> would "work" if each annotator were its own bundle.  How would the type systems
>> be combined at initialization time?  How would you share the JCAS generated
>> classes?  (I'll admit that this is not *required*, but is sometimes useful.)
>>
>> Does one of the Clerezza scenarios involve running multiple annotators, each
>> having its own bundle?  If so, how does that work?   (I'm guessing that there is
>> some "driver" code that uses UIMA Application APIs to separately initialize each
>> annotator,  and then maybe does something like getting a type system from all of
>> them, and merging them, and then creating a CAS from that, etc.  This is just
>> duplicating what the UIMA framework is doing - if it were "in charge" of the
>> pipeline and its management.)
>>
>> Thanks for the clarifications.
>>
>
> These are all points which don't really work out
> in the end (with our current release).
>
> Jörn
>
>

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 7:56 PM, Marshall Schor wrote:
> The "normal" way of having annotators together is something that UIMA supports,
> as a pipeline.  Part of this is setting up the pipeline at initialization time
> by taking all the type systems declared by the annotators in the pipeline, and
> merging them into one common type system.
>
> A CAS is generated using this one common type system, and then sent through the
> pipeline.

Yes, this of course works, but it is often problematic, because the merged
type system needs to be suitable for all components.

Let's say we have a tokenizer and a POS tagger; the POS tagger needs the
output of the tokenizer as input. Therefore in UIMA you would declare a token
type, and both AEs must use exactly the same token type.

Now both AEs are made by different vendors, and both decide to declare their
own token type. Then this type system merging doesn't work.

As far as I know, the only commonly used workaround for this issue is not to
use JCas and to define type system mappings, where the types the AE needs are
mapped based on some configuration.

I think a solution to this problem is to stop doing this type system merging,
and always map one common type system to every annotator's private type
system. This mapping could give the AEs more flexibility and might even be
able to perform simple type transformations.
That would also make using JCas attractive again.

This issue is even amplified by the fact that our users like to define their
own type system, and then the AEs only work properly if the AE implementers do
type system mapping or program against this type system. The latter case only
works if the user and implementer are the same person/organization.

> -----------
>
> In the case where each annotator is "bundled" as a OSGi bundle, that bundle
> contains its own private copy of all the UIMA classes, including all of the UIMA
> SDK, and any type system, etc.  Any JCAS generated classes are also private to
> that bundle.
>
> This might make sense for running one Annotator by itself.

Exactly.
>   But for running
> multiple annotators together, as separate OSGi components, I don't see how it
> would "work" if each annotator were its own bundle.  How would the type systems
> be combined at initialization time?  How would you share the JCAS generated
> classes?  (I'll admit that this is not *required*, but is sometimes useful.)
>
> Does one of the Clerezza scenarios involve running multiple annotators, each
> having its own bundle?  If so, how does that work?   (I'm guessing that there is
> some "driver" code that uses UIMA Application APIs to separately initialize each
> annotator,  and then maybe does something like getting a type system from all of
> them, and merging them, and then creating a CAS from that, etc.  This is just
> duplicating what the UIMA framework is doing - if it were "in charge" of the
> pipeline and its management.)
>
> Thanks for the clarifications.
>

These are all points which don't really work out
in the end (with our current release).

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

The "normal" way of having annotators together is something that UIMA supports,
as a pipeline.  Part of this is setting up the pipeline at initialization time
by taking all the type systems declared by the annotators in the pipeline, and
merging them into one common type system. 

A CAS is generated using this one common type system, and then sent through the
pipeline.

-----------

In the case where each annotator is "bundled" as an OSGi bundle, that bundle
contains its own private copy of all the UIMA classes, including all of the UIMA
SDK, and any type system, etc.  Any JCAS generated classes are also private to
that bundle.

This might make sense for running one Annotator by itself.  But for running
multiple annotators together, as separate OSGi components, I don't see how it
would "work" if each annotator were its own bundle.  How would the type systems
be combined at initialization time?  How would you share the JCAS generated
classes?  (I'll admit that this is not *required*, but is sometimes useful.)

Does one of the Clerezza scenarios involve running multiple annotators, each
having its own bundle?  If so, how does that work?   (I'm guessing that there is
some "driver" code that uses UIMA Application APIs to separately initialize each
annotator,  and then maybe does something like getting a type system from all of
them, and merging them, and then creating a CAS from that, etc.  This is just
duplicating what the UIMA framework is doing - if it were "in charge" of the
pipeline and its management.)

Thanks for the clarifications.

-Marshall 

On 7/20/2011 12:17 PM, Tommaso Teofili wrote:
> 2011/7/20 Marshall Schor <ms...@schor.com>
>
>> This may be all wrong-headed - but I wonder if the basic use case is to do
>> something like the following: Take a bunch of annotators (and maybe flow
>> controllers) together with a top-level aggregate XML specifying parameter
>> overrides, etc., and "wrap" them so they become a single OSGi bundle, that
>> can
>> then be embedded in an OSGi container?  If so, then perhaps instead of
>> having a
>> "set" of individually OSGi-i-fied annotators, like we do now, maybe we
>> should
>> have instead a tool that does this for a set of annotators, etc.
>>
> the use case in Clerezza is slightly different as it allows both the
> scenario where one executes an existing pipeline (using OpenCalaisAnnotator
> and AlchemyAPIAnnotator) and the scenario when one runs a custom pipeline,
> eventually using other existing UIMA components, defined in another bundle.
> I still think having individual OSGi versions of each annotator would be
> better.
>
>
> 2011/7/20 Marshall Schor <ms...@schor.com>
>
>>
>> On 7/20/2011 11:18 AM, Marshall Schor wrote:
>>> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>>>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>>>> What does it mean to "deploy" inside of an Apache Felix instance?
>>>> I did that once, and simply embedded everything in one bundle, even UIMA
>>>> itself. This way I could use UIMA plus some AEs to do analysis as a
>> service
>>>> for other OSGi bundles inside Felix.
>>> This suggests having a tool to make this "easy"; but also suggests that
>> having
>>> individual addon annotators packaged up as a "complete UIMA pipeline" may
>> not be
>>> very interesting to anyone.
>>>
>>> Is this right?  If so, perhaps we should not release this osgi versions
>> in the
>>> addons at this time.
> Do you mean not in the binary package or not release them at all (i.e. not
> deploying them on Maven central too)?
>
> Tommaso
>
>  That also would reduce the size of the distribution
>>> considerably (about 100 MB of 150 MB is for the OSGi versions).
>> oops, I was wrong - delete the following...
>>>  In computing
>>> this, I also noticed that the tagger osgi packaging was missing the 19.5
>> mb of
>>> statistical models...
>>>
>>> -Marshall
>>>> Jörn
>>>>

Re: OSGi versions of Add-on Annotators

Posted by florent andré <fl...@4sengines.com>.


On 07/20/2011 08:05 PM, Marshall Schor wrote:
>
>
> On 7/20/2011 12:36 PM, florent andré wrote:
>>
>>
>> On 07/20/2011 06:17 PM, Tommaso Teofili wrote:
>>> 2011/7/20 Marshall Schor<ms...@schor.com>
>>>
>>>>
>>>> This may be all wrong-headed - but I wonder if the basic use case is to do
>>>> something like the following: Take a bunch of annotators (and maybe flow
>>>> controllers) together with a top-level aggregate XML specifying parameter
>>>> overrides, etc., and "wrap" them so they become a single OSGi bundle, that
>>>> can
>>>> then be embedded in an OSGi container?  If so, then perhaps instead of
>>>> having a
>>>> "set" of individually OSGi-i-fied annotators, like we do now, maybe we
>>>> should
>>>> have instead a tool that does this for a set of annotators, etc.
>>>>
>>>
>>> the use case in Clerezza is slightly different as it allows both the
>>> scenario where one executes an existing pipeline (using OpenCalaisAnnotator
>>> and AlchemyAPIAnnotator) and the scenario when one runs a custom pipeline,
>>> eventually using other existing UIMA components, defined in another bundle.
>>> I still think having individual OSGi versions of each annotator would be
>>> better.
>>
>> +1
>> Independent annotator allow to play with easily, and only load required ones.
>
> Maybe this could make more sense, if the bundle had only the annotator code, and
> didn't also contain a copy of the entire UIMA framework in every bundle?

Yep, I think so.

From my candy eyes I imagine something like this:
- 1 UIMA framework bundle
- A, B, C, ... UIMA annotator bundles
- 1..N user-defined processing chain bundle(s)

These last processing chain bundles contain only the minimum needed to get
things working (aggregateAE.xml + AE declaration)

++


>
> -Marshall
>
>>
>>>
>>>
>>> 2011/7/20 Marshall Schor<ms...@schor.com>
>>>
>>>>
>>>>
>>>> On 7/20/2011 11:18 AM, Marshall Schor wrote:
>>>>>
>>>>> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>>>>>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>>>>>> What does it mean to "deploy" inside of an Apache Felix instance?
>>>>>> I did that once, and simply embedded everything in one bundle, even UIMA
>>>>>> itself. This way I could use UIMA plus some AEs to do analysis as a
>>>> service
>>>>>> for other OSGi bundles inside Felix.
>>>>> This suggests having a tool to make this "easy"; but also suggests that
>>>> having
>>>>> individual addon annotators packaged up as a "complete UIMA pipeline" may
>>>> not be
>>>>> very interesting to anyone.
>>>>>
>>>>> Is this right?  If so, perhaps we should not release this osgi versions
>>>> in the
>>>>> addons at this time.
>>>>
>>>
>>> Do you mean not in the binary package or not release them at all (i.e. not
>>> deploying them on Maven central too)?
>>>
>>> Tommaso
>>>
>>>    That also would reduce the size of the distribution
>>>>> considerably (about 100 MB of 150 MB is for the OSGi versions).
>>>> oops, I was wrong - delete the following...
>>>>>    In computing
>>>>> this, I also noticed that the tagger osgi packaging was missing the 19.5
>>>> mb of
>>>>> statistical models...
>>>>>
>>>>> -Marshall
>>>>>> Jörn
>>>>>>
>>>>
>>>
>>

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.


On 7/20/2011 12:36 PM, florent andré wrote:
>
>
> On 07/20/2011 06:17 PM, Tommaso Teofili wrote:
>> 2011/7/20 Marshall Schor<ms...@schor.com>
>>
>>>
>>> This may be all wrong-headed - but I wonder if the basic use case is to do
>>> something like the following: Take a bunch of annotators (and maybe flow
>>> controllers) together with a top-level aggregate XML specifying parameter
>>> overrides, etc., and "wrap" them so they become a single OSGi bundle, that
>>> can
>>> then be embedded in an OSGi container?  If so, then perhaps instead of
>>> having a
>>> "set" of individually OSGi-i-fied annotators, like we do now, maybe we
>>> should
>>> have instead a tool that does this for a set of annotators, etc.
>>>
>>
>> the use case in Clerezza is slightly different as it allows both the
>> scenario where one executes an existing pipeline (using OpenCalaisAnnotator
>> and AlchemyAPIAnnotator) and the scenario when one runs a custom pipeline,
>> eventually using other existing UIMA components, defined in another bundle.
>> I still think having individual OSGi versions of each annotator would be
>> better.
>
> +1
> Independent annotator allow to play with easily, and only load required ones.

Maybe this could make more sense, if the bundle had only the annotator code, and
didn't also contain a copy of the entire UIMA framework in every bundle?

-Marshall

>
>>
>>
>> 2011/7/20 Marshall Schor<ms...@schor.com>
>>
>>>
>>>
>>> On 7/20/2011 11:18 AM, Marshall Schor wrote:
>>>>
>>>> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>>>>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>>>>> What does it mean to "deploy" inside of an Apache Felix instance?
>>>>> I did that once, and simply embedded everything in one bundle, even UIMA
>>>>> itself. This way I could use UIMA plus some AEs to do analysis as a
>>> service
>>>>> for other OSGi bundles inside Felix.
>>>> This suggests having a tool to make this "easy"; but also suggests that
>>> having
>>>> individual addon annotators packaged up as a "complete UIMA pipeline" may
>>> not be
>>>> very interesting to anyone.
>>>>
>>>> Is this right?  If so, perhaps we should not release this osgi versions
>>> in the
>>>> addons at this time.
>>>
>>
>> Do you mean not in the binary package or not release them at all (i.e. not
>> deploying them on Maven central too)?
>>
>> Tommaso
>>
>>   That also would reduce the size of the distribution
>>>> considerably (about 100 MB of 150 MB is for the OSGi versions).
>>> oops, I was wrong - delete the following...
>>>>   In computing
>>>> this, I also noticed that the tagger osgi packaging was missing the 19.5
>>> mb of
>>>> statistical models...
>>>>
>>>> -Marshall
>>>>> Jörn
>>>>>
>>>
>>
>

Re: OSGi versions of Add-on Annotators

Posted by florent andré <fl...@4sengines.com>.


On 07/20/2011 06:17 PM, Tommaso Teofili wrote:
> 2011/7/20 Marshall Schor<ms...@schor.com>
>
>>
>> This may be all wrong-headed - but I wonder if the basic use case is to do
>> something like the following: Take a bunch of annotators (and maybe flow
>> controllers) together with a top-level aggregate XML specifying parameter
>> overrides, etc., and "wrap" them so they become a single OSGi bundle, that
>> can
>> then be embedded in an OSGi container?  If so, then perhaps instead of
>> having a
>> "set" of individually OSGi-i-fied annotators, like we do now, maybe we
>> should
>> have instead a tool that does this for a set of annotators, etc.
>>
>
> the use case in Clerezza is slightly different as it allows both the
> scenario where one executes an existing pipeline (using OpenCalaisAnnotator
> and AlchemyAPIAnnotator) and the scenario when one runs a custom pipeline,
> eventually using other existing UIMA components, defined in another bundle.
> I still think having individual OSGi versions of each annotator would be
> better.

+1
Independent annotators are easy to play with, and allow loading only the
required ones.

>
>
> 2011/7/20 Marshall Schor<ms...@schor.com>
>
>>
>>
>> On 7/20/2011 11:18 AM, Marshall Schor wrote:
>>>
>>> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>>>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>>>> What does it mean to "deploy" inside of an Apache Felix instance?
>>>> I did that once, and simply embedded everything in one bundle, even UIMA
>>>> itself. This way I could use UIMA plus some AEs to do analysis as a
>> service
>>>> for other OSGi bundles inside Felix.
>>> This suggests having a tool to make this "easy"; but also suggests that
>> having
>>> individual addon annotators packaged up as a "complete UIMA pipeline" may
>> not be
>>> very interesting to anyone.
>>>
>>> Is this right?  If so, perhaps we should not release this osgi versions
>> in the
>>> addons at this time.
>>
>
> Do you mean not in the binary package or not release them at all (i.e. not
> deploying them on Maven central too)?
>
> Tommaso
>
>   That also would reduce the size of the distribution
>>> considerably (about 100 MB of 150 MB is for the OSGi versions).
>> oops, I was wrong - delete the following...
>>>   In computing
>>> this, I also noticed that the tagger osgi packaging was missing the 19.5
>> mb of
>>> statistical models...
>>>
>>> -Marshall
>>>> Jörn
>>>>
>>
>

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

2011/7/20 Marshall Schor <ms...@schor.com>

>
> This may be all wrong-headed - but I wonder if the basic use case is to do
> something like the following: Take a bunch of annotators (and maybe flow
> controllers) together with a top-level aggregate XML specifying parameter
> overrides, etc., and "wrap" them so they become a single OSGi bundle, that
> can
> then be embedded in an OSGi container?  If so, then perhaps instead of
> having a
> "set" of individually OSGi-i-fied annotators, like we do now, maybe we
> should
> have instead a tool that does this for a set of annotators, etc.
>

The use case in Clerezza is slightly different, as it allows both the
scenario where one executes an existing pipeline (using OpenCalaisAnnotator
and AlchemyAPIAnnotator) and the scenario where one runs a custom pipeline,
possibly using other existing UIMA components, defined in another bundle.
I still think having individual OSGi versions of each annotator would be
better.


2011/7/20 Marshall Schor <ms...@schor.com>

>
>
> On 7/20/2011 11:18 AM, Marshall Schor wrote:
> >
> > On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
> >> On 7/20/11 1:55 PM, Marshall Schor wrote:
> >>> What does it mean to "deploy" inside of an Apache Felix instance?
> >> I did that once, and simply embedded everything in one bundle, even UIMA
> >> itself. This way I could use UIMA plus some AEs to do analysis as a
> service
> >> for other OSGi bundles inside Felix.
> > This suggests having a tool to make this "easy"; but also suggests that
> having
> > individual addon annotators packaged up as a "complete UIMA pipeline" may
> not be
> > very interesting to anyone.
> >
> > Is this right?  If so, perhaps we should not release this osgi versions
> in the
> > addons at this time.
>

Do you mean not in the binary package or not release them at all (i.e. not
deploying them on Maven central too)?

Tommaso

> >  That also would reduce the size of the distribution
> > considerably (about 100 MB of 150 MB is for the OSGi versions).
> oops, I was wrong - delete the following...
> >  In computing
> > this, I also noticed that the tagger osgi packaging was missing the 19.5
> mb of
> > statistical models...
> >
> > -Marshall
> >> Jörn
> >>
>

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.


On 7/20/2011 11:18 AM, Marshall Schor wrote:
>
> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>> What does it mean to "deploy" inside of an Apache Felix instance?
>> I did that once, and simply embedded everything in one bundle, even UIMA
>> itself. This way I could use UIMA plus some AEs to do analysis as a service
>> for other OSGi bundles inside Felix.
> This suggests having a tool to make this "easy"; but also suggests that having
> individual addon annotators packaged up as a "complete UIMA pipeline" may not be
> very interesting to anyone.
>
> Is this right?  If so, perhaps we should not release this osgi versions in the
> addons at this time.  That also would reduce the size of the distribution
> considerably (about 100 MB of 150 MB is for the OSGi versions). 
oops, I was wrong - delete the following... 
>  In computing
> this, I also noticed that the tagger osgi packaging was missing the 19.5 mb of
> statistical models...
>
> -Marshall
>> Jörn
>>

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 5:18 PM, Marshall Schor wrote:
> This suggests having a tool to make this "easy"; but also suggests that having
> individual addon annotators packaged up as a "complete UIMA pipeline" may not be
> very interesting to anyone.

To make this easy you can use Maven. To fix the problem, UIMA would need to
change the way it loads classes. A proper OSGi integration should give a user
all the features UIMA currently offers.

And I think we should even take an additional step and redesign UIMA a bit, to
ease the integration of AEs. In my eyes, a simple use case where a user needs
to run AEs made by different vendors in an AAE should be really easy. In the
end, the integration part is one of the reasons why someone decides to use
UIMA at all.

> Is this right?  If so, perhaps we should not release this osgi versions in the
> addons at this time.  That also would reduce the size of the distribution
> considerably (about 100 MB of 150 MB is for the OSGi versions).  In computing
> this, I also noticed that the tagger osgi packaging was missing the 19.5 mb of
> statistical models...

In my opinion we should not release all these OSGi bundles, because they
can only be loaded with this Eclipse register buddy hack, and that is not
even part of the OSGi standard.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
> On 7/20/11 1:55 PM, Marshall Schor wrote:
>> What does it mean to "deploy" inside of an Apache Felix instance?
>
> I did that once, and simply embedded everything in one bundle, even UIMA
> itself. This way I could use UIMA plus some AEs to do analysis as a service
> for other OSGi bundles inside Felix.

This suggests having a tool to make this "easy"; but also suggests that having
individual addon annotators packaged up as a "complete UIMA pipeline" may not be
very interesting to anyone.

Is this right?  If so, perhaps we should not release these OSGi versions in the
addons at this time.  That also would reduce the size of the distribution
considerably (about 100 MB of 150 MB is for the OSGi versions).  In computing
this, I also noticed that the tagger OSGi packaging was missing the 19.5 MB of
statistical models...

-Marshall
>
> Jörn
>

Re: OSGi versions of Add-on Annotators

Posted by Jörn Kottmann <ko...@gmail.com>.

On 7/20/11 1:55 PM, Marshall Schor wrote:
> What does it mean to "deploy" inside of an Apache Felix instance?

I did that once, and simply embedded everything in one bundle, even UIMA
itself. This way I could use UIMA plus some AEs to do analysis as a service
for other OSGi bundles inside Felix.

Jörn

Re: OSGi versions of Add-on Annotators

Posted by Marshall Schor <ms...@schor.com>.

On 7/20/2011 5:04 AM, Tommaso Teofili wrote:
> Hello all,
> I think a bit of history is helpful here to track what has been done so far
> and what/where we can improve on OSGi components.

Good idea!

> the first thing I used to drive me through the setup of UIMA Addons OSGi
> bundles was the dependency:tree on the non-OSGi version of each component
> which I used to define the initial set of dependencies.
> At first I used the maven-bundle-plugin to produce the OSGi jars (bundles)
> and I deployed them inside an Apache Felix instance.

I think I'm missing the big picture, here.  UIMA is usually used as a container
in which Annotators are loaded.
What does it mean to "deploy" inside of an Apache Felix instance?  For instance,
can you put together several annotators and a flow controller (a UIMA
Aggregate) and "run" it inside Apache Felix?  I don't see how that is done, when
every Annotator is its own OSGi bundle.  Maybe it's because it's early morning
here, and I'm not quite awake, and I've missed the basics ...

This may be all wrong-headed - but I wonder if the basic use case is to do
something like the following: Take a bunch of annotators (and maybe flow
controllers) together with a top-level aggregate XML specifying parameter
overrides, etc., and "wrap" them so they become a single OSGi bundle, that can
then be embedded in an OSGi container?  If so, then perhaps instead of having a
"set" of individually OSGi-i-fied annotators, like we do now, maybe we should
have instead a tool that does this for a set of annotators, etc.


> The bundle plugin had one issue with the OSGi version as 2.3.1-SNAPSHOT is
> not a valid OSGi version while it has to be converted to 2.3.1.SNAPSHOT.

We have this problem elsewhere, and use the ${parsedVersion.osgiVersion} in the
poms, to get the version with the period instead of the dash.
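
The conversion itself (dash replaced by a period, so 2.3.1-SNAPSHOT becomes 2.3.1.SNAPSHOT) can be illustrated with a few lines of Java. This is only a sketch of the string transformation discussed above, not the code Maven uses to compute ${parsedVersion.osgiVersion}; the class OsgiVersionSketch and the method toOsgiVersion are hypothetical names, and the sketch assumes the input already has three numeric segments before the qualifier.

```java
public class OsgiVersionSketch {

    /**
     * Turns a Maven-style version such as "2.3.1-SNAPSHOT" into the OSGi form
     * "2.3.1.SNAPSHOT" by replacing the first dash with a period.
     * Versions without a dash are returned unchanged.
     */
    static String toOsgiVersion(String mavenVersion) {
        int dash = mavenVersion.indexOf('-');
        if (dash < 0) {
            return mavenVersion;
        }
        return mavenVersion.substring(0, dash) + "." + mavenVersion.substring(dash + 1);
    }

    public static void main(String[] args) {
        System.out.println(toOsgiVersion("2.3.1-SNAPSHOT"));
        System.out.println(toOsgiVersion("2.3.1"));
    }
}
```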

> This problem was resolved setting the artifact packaging to jar thus
> creating the bundle using also other plugins (i.e.:
> maven-dependency-plugin).
> On the other hand when deploying those artifacts on Apache Felix I couldn't
> find "a lot" of dependencies not listed by the mvn depdendency:analyze , for
> some of them I could avoid adding an additional dependency by setting an
> "optional import" (i.e. see some sun.* import set to optional) to the bundle
> configuration but this was not possible for every "unexpected" dependency.
> The above issues drove addons-osgi-runtime configurations as they are now.

The ConfigurableFeatureExtractor non-OSGi pom doesn't have an "ant" dependency,
while the OSGi version of that same annotator does.  Any idea why this was added
to the OSGi one?

The full list of jars pulled into that annotator's OSGi packaging is quite large:
ant-1.7.1.jar
ant-launcher-1.7.1.jar
common-2.3.0-v200706262000.jar
common-3.3.0-v20070426.jar
commons-beanutils-1.8.2.jar
commons-collections-3.2.1.jar
commons-jxpath-1.3.jar
commons-logging-1.1.1.jar
ConfigurableFeatureExtractor-2.3.2-SNAPSHOT.jar
contenttype-3.2.100-v20070319.jar
ecore-2.3.0-v200706262000.jar
geronimo-stax-api_1.0_spec-1.0.1.jar
jdom-1.0.jar
jobs-3.3.0-v20070423.jar
jVinci-2.3.1.jar
org.eclipse.core.jobs-3.5.0.v20100515.jar
org.eclipse.equinox.common-3.6.0.v20100503.jar
org.eclipse.equinox.preferences-3.2.1.jar
org.eclipse.equinox.registry-3.5.0.v20100503.jar
org.eclipse.osgi-3.2.1.jar
osgi-3.3.0-v20070530.jar
preferences-3.2.100-v20070522.jar
registry-3.3.0-v20070522.jar
runtime-3.2.0-v20060603.jar
uimaj-adapter-vinci-2.3.1.jar
uimaj-core-2.3.1.jar
uimaj-cpe-2.3.1.jar
uimaj-document-annotation-2.3.1.jar
uimaj-ep-runtime-2.3.1.jar
uimaj-tools-2.3.1.jar
xercesImpl-2.8.1.jar
xmi-2.3.0-v200706262000.jar
xml-apis-1.3.03.jar
xml-resolver-1.2.jar
xmlbeans-2.4.0.jar
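
One way to see which of the jars listed above would not need to be embedded
is to check whether a jar's manifest already declares Bundle-SymbolicName,
the header that identifies an OSGi bundle. The Java sketch below does that
with the standard java.util.jar classes. The class and method names are
hypothetical, and a real check would probably also look at Export-Package
and the version headers.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
class BundleCheckSketch {
    /** Returns true if the jar's manifest declares Bundle-SymbolicName. */
    static boolean looksLikeOsgiBundle(File jar) throws IOException {
        try (JarFile jf = new JarFile(jar)) {
            Manifest mf = jf.getManifest(); // null when the jar has no manifest
            return mf != null
                && mf.getMainAttributes().getValue("Bundle-SymbolicName") != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass jar paths on the command line, e.g. the files from the list above.
        for (String path : args) {
            File jar = new File(path);
            System.out.println(jar.getName() + " -> "
                + (looksLikeOsgiBundle(jar) ? "bundle" : "plain jar"));
        }
    }
}
```

Jars reported as "plain jar" are the candidates for embedding; the others
could in principle be left out and imported via package imports.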

-Marshall

> However, I do agree with one thing: we should leave out all the dependencies
> which are OSGi compliant or have an OSGi packaged version.
> Reverting to using only the maven-bundle-plugin may help us keep
> things cleaner.
> I'd be happy to help set up a testing environment (this makes me think it'd
> be useful to set up some integration tests).
> Tommaso
>
> 2011/7/20 Marshall Schor <ms...@schor.com>
>
>> Moving this from the RC4 release discussion to a new thread ...
>>
>> I've now tried the following:
>>
>> Change the build instructions so
>>
>> a) the dependency goal doesn't unpack the jars
>> b) the OSGi build instruction doesn't say to "inline" the jars.
>>
>> The result - it builds, no error messages, and has a result which includes lots
>> of Jars at the top level, plus a META-INF directory, and nothing else.
>>
>> I tried this with the ConfigurableFeatureExtractor-osgi project.  The Jars that
>> get included are all the ones shown by giving the command:  mvn dependency:tree
>> in the project directory.  There are many jars, including all of Ant, the
>> Ant-launcher Jar, a bunch of eclipse jars including things like
>> org.eclipse.core.jobs, the junit jar, and more.  Many of these are unnecessary,
>> I think, and including them in the distribution causes us to work to verify the
>> appropriate LICENSEs/NOTICEs are created.
>>
>> It seems very unlikely that the CFE needs all these to run, normally.   Running
>> the non-OSGi CFE maven dependency:tree does not show the ant dependency - I'll
>> have to track down why those are different.  Looking at the 2.3.0 release, the
>> CFE non-OSGi had many fewer included Jars.
>>
>> I think we can fix the build instructions to exclude the unneeded Jars.
>>
>> I don't have any setup for testing the OSGi packaged artifacts - if anyone else
>> does, let's figure out how to test these - either by collaborating or by helping
>> me learn how to set up something locally to test this packaging result.
>>
>> -Marshall
>>
>>
>>
>>
>>
>>
>>
>> On 7/19/2011 3:44 PM, Marshall Schor wrote:
>>> Thanks, Richard.
>>>
>>> I think you are right - some of the dependencies (for example, the
>>> AlchemyApiAnnotator depends on Apache commons-digester, etc.) don't have OSGi
>>> packagings.
>>>
>>> The build strategy for the OSGi modules currently gets all the dependencies and
>>> unpacks them into .../target/classes directory, where a later step "jars" them up.
>>>
>>> This approach overlays files being unzipped, with later versions.  Some examples
>>> where this might be an issue:
>>> There is at the top level a license directory, containing one "LICENSE" file.
>>> There is at the top level a "plugin.xml" file.
>>> There is at the top level a META-INF dir, with LICENSE and NOTICE files among
>>> other things.
>>>
>>> Perhaps it would be better to package the dependencies that are not OSGi in a
>>> way that doesn't need to unpack, and then potentially overlay, files.
>>>
>>> It seems that OSGi and the bundle plugin support this, via the Embed-Dependency
>>> instruction.  Is there a reason we're not using that, instead of the "unpacking"
>>> approach?
>>>
>>> -Marshall
>>>
>>> On 7/19/2011 11:17 AM, Richard Eckart de Castilho wrote:
>>>> I wanted to package the DKPro Core UIMA modules as OSGi bundles. These have
>>>> lots of dependencies on various JARs that are not available as OSGi bundles
>>>> and sometimes not even available in public Maven repositories - this is why
>>>> we set up a public repository of our own for the moment. It may be less of an
>>>> issue for the UIMA sandbox, as the individual components may not depend on
>>>> third-party libraries.
>>>>
>>>> Looking at the Add Ons repository, I would suspect that Tika, Solr, Rhino,
>>>> BeanShell and maybe some of the Apache Commons JARs may not be OSGi bundles.
>>>>
>>>> I guess you aim for a mixed setup where some dependencies (namely UIMA) are
>>>> imported via package-imports and others (namely the above) are packaged
>>>> inside the bundles?
>>>>
>>>> Cheers,
>>>>
>>>> Richard
>>>>
>>>> On 19.07.2011 at 17:08, Marshall Schor wrote:
>>>>
>>>>> I suspect that the Jars are now available as OSGi bundles; do you know of
>>>>> specific ones that are not?
>>>>>
>>>>> Thanks. -Marshall
>>>>>
>>>>> On 7/19/2011 10:24 AM, Richard Eckart de Castilho wrote:
>>>>>> Hi Marshall,
>>>>>>
>>>>>> I am very interested in this. Some time back I mostly gave up on packaging
>>>>>> UIMA components as OSGi bundles because of this. If you do not bundle all
>>>>>> jars (*jikes*) and use package imports instead, the question is: where do
>>>>>> the dependencies come from? Who prepares the bundles and who installs them?
>>>>>> Many JARs are not available as OSGi bundles.
>>>>>>
>>>>>> -- Richard
>>>>>>
>>>>>> On 19.07.2011 at 16:13, Marshall Schor wrote:
>>>>>>
>>>>>>> I'll take a look at the OSGi build.
>>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>> On 7/17/2011 12:16 PM, Marshall Schor wrote:
>>>>>>>> Since this is (I think) the first time we're releasing the OSGi packaging of the
>>>>>>>> annotators, I think some work on their license/notice files might be needed,
>>>>>>>> because:
>>>>>>>>
>>>>>>>> - there are duplicate License files - one at the top level, and one in the
>>>>>>>> META-INF directory
>>>>>>>> - both of these are the plain vanilla license files.  For projects which are
>>>>>>>> incorporating other libraries which are under other than the Apache v 2.0
>>>>>>>> license, those licenses have to be included.
>>>>>>>> - the NOTICE file is present in the META-INF directory, but is the plain one,
>>>>>>>> rather than the project specific one.
>>>>>>>>
>>>>>>>> Finally, I wonder if the OSGi packaging strategy is correct - in that it
>>>>>>>> "bundles" every dependency into the OSGi file.  This certainly makes the file
>>>>>>>> easier to use, but if a user uses 2 OSGi components from UIMA, won't there be a
>>>>>>>> lot of unnecessary duplication (or does OSGi notice this and avoid it somehow)?
>>>>>>>> I'm not sure of an alternative, but I do recall that OSGi allows for
>>>>>>>> dependencies on other packages; perhaps that could be useful?
>>>>>>>>
>>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> On 7/15/2011 8:29 AM, Tommaso Teofili wrote:
>>>>>>>>> Hi all,
>>>>>>>>> I've prepared the new RC (4) for UIMA Addons release.
>>>>>>>>>
>>>>>>>>> The following is a list of issues addressed in this release:
>>>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310570&version=12316093
>>>>>>>>>
>>>>>>>>> The source zip and binary files are available here:
>>>>>>>>> http://people.apache.org/~tommaso/uima-addons-2.3.1-rc4
>>>>>>>>>
>>>>>>>>> SVN Tag Checkout:
>>>>>>>>> svn co http://svn.apache.org/repos/asf/uima/addons/tags/uima-addons-2.3.1-rc4/
>>>>>>>>>
>>>>>>>>> Please cast your vote for UIMA Addons 2.3.1 release:
>>>>>>>>>
>>>>>>>>> [ ] +1 Approve the release
>>>>>>>>> [ ] -1 Veto the release (please provide specific comments)
>>>>>>>>> [ ] 0   Don't care
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Tommaso
>>>>>>>>>
>>>> Richard Eckart de Castilho
>>>>

Re: OSGi versions of Add-on Annotators

Posted by Tommaso Teofili <to...@gmail.com>.

Hello all,
I think a bit of history is helpful here to track what has been done so far
and what/where we can improve on OSGi components.
The first thing I used to guide me through the setup of the UIMA Addons OSGi
bundles was the dependency:tree output for the non-OSGi version of each component,
which I used to define the initial set of dependencies.
At first I used the maven-bundle-plugin to produce the OSGi jars (bundles)
and I deployed them inside an Apache Felix instance.
The bundle plugin had one issue with the OSGi version, as 2.3.1-SNAPSHOT is
not a valid OSGi version and has to be converted to 2.3.1.SNAPSHOT.
This problem was resolved by setting the artifact packaging to jar and
creating the bundle with the help of other plugins (e.g.
maven-dependency-plugin).
On the other hand, when deploying those artifacts on Apache Felix, "a lot" of
dependencies turned out to be missing that were not listed by mvn
dependency:analyze. For some of them I could avoid adding an additional
dependency by setting an "optional import" (e.g. some sun.* imports are set to
optional) in the bundle configuration, but this was not possible for every
"unexpected" dependency.
The above issues drove the addons-osgi-runtime configurations to be as they are now.
However, I do agree with one thing: we should leave out all the dependencies
which are OSGi compliant or have an OSGi packaged version.
Reverting to using only the maven-bundle-plugin may help us keep
things cleaner.
I'd be happy to help set up a testing environment (this makes me think it'd
be useful to set up some integration tests).
Tommaso
