Posted to dev@uima.apache.org by Michael Baessler <mb...@michael-baessler.de> on 2007/10/15 16:25:13 UTC

UIMA Sandbox releases

Hi,

I think it is now time to discuss how we should proceed with the Sandbox
components and how we will release them.
The Sandbox is a place to develop components or tooling and to play with
new technologies around UIMA. Components that are stable and tested
should be included in a Sandbox release. A Sandbox release has the
advantage that users know which components are reliable and which are
still under development. I would like to release the selected Sandbox
components as a bundle with each Apache UIMA release, under the name
"Apache UIMA Add-ons".
The version number would then be the same for both releases, and the
effort is minimized.

Advantages:
- release effort is minimized, since only one release has to be done for
all components
- you only need to download one package, which contains all the
components and tools

Disadvantages:
- the release number of a Sandbox component changes with each Apache
UIMA release, whether or not the component itself has changed
- the Sandbox components cannot be downloaded separately, which may be
an issue (??) for some companies

To build the release, I'm going to create a new Sandbox project
called "SandboxDistr" that contains all the information needed to build
and package the components.
For analysis engine components I have created a PEAR packager Maven
plugin (the plugin will also be added to the Sandbox) that will create a
PEAR package automatically during the build. My plan is to ship the PEAR
package additionally in the binary distribution of each component.

What do others think about this release approach?
Suggestions or comments?

-- Michael

Re: UIMA Sandbox releases

Posted by Adam Lally <al...@alum.rpi.edu>.
On 10/15/07, Michael Baessler <mb...@michael-baessler.de> wrote:
> Hi,
>
> I think it is now time to discuss how we should proceed with the Sandbox
> components and how we will release them.
> The Sandbox is a place to develop components or tooling and to play with
> new technologies around UIMA. Components that are stable and tested
> should be included in a Sandbox release. A Sandbox release has the
> advantage that users know which components are reliable and which are
> still under development. I would like to release the selected Sandbox
> components as a bundle with each Apache UIMA release, under the name
> "Apache UIMA Add-ons".
> The version number would then be the same for both releases, and the
> effort is minimized.
>
> Advantages:
> - release effort is minimized, since only one release has to be done for
> all components
> - you only need to download one package, which contains all the
> components and tools
>
> Disadvantages:
> - the release number of a Sandbox component changes with each Apache
> UIMA release, whether or not the component itself has changed
> - the Sandbox components cannot be downloaded separately, which may be
> an issue (??) for some companies
>

Maybe it would be better to offer two kinds of Apache UIMA downloads -
one that bundles the Sandbox components and one that doesn't.  That
way people who don't want the less mature add-ons don't have to get
them.  Of course the downside of that is that the downloads page
becomes more confusing since there are more options.

> To build the release, I'm going to create a new Sandbox project
> called "SandboxDistr" that contains all the information needed to build
> and package the components.
> For analysis engine components I have created a PEAR packager Maven
> plugin (the plugin will also be added to the Sandbox) that will create a
> PEAR package automatically during the build. My plan is to ship the PEAR
> package additionally in the binary distribution of each component.
>

Neat!  As I recall there was a request for exactly this on the uima user list.

  -Adam

Re: UIMA Sandbox releases

Posted by Michael Baessler <mb...@michael-baessler.de>.
Sounds good to me... especially the suggestion that a component's
version does not change if the component itself has not changed!

The other thing I would like to mention again is that, from my
perspective, the Sandbox is a place for playing around with new ideas and
technologies. But once one of the Sandbox components is ready to use in
"production", I think it should be moved to the core framework. That way
users can see which components are still in a development/test phase and
which are "ready to use in production". This approach also means we would
no longer have to provide an "empty framework" without any analysis
components.

So I'm fine with releasing the CasEditor for now as a single Sandbox
component.
We could also consider releasing some of the annotators in a Sandbox
annotator bundle.

-- Michael



Re: UIMA Sandbox releases

Posted by Marshall Schor <ms...@schor.com>.
Re: releasing the Cas Editor - with or without some pre-packaged annotators.

I suspect that Joern would be willing to be the "release manager" for 
this :-).  He may even be willing to bundle some of the more "stable" 
sandbox components with it, but certainly not uima-as (uima-ee), which 
is not ready.

The pragmatic, least-work approach would be to pick those sandbox
projects that are ready now and do one "release" packaging that
includes the Cas Editor.  However, I don't think that's the "clearest"
approach for our users.  I think they might like to see bundles arranged
by topic, and so might like a bundle of annotators and, separately,
the Cas Editor.

So - my preference for now would be to keep the Cas Editor as a 
separately packaged thing coming from the project.  If we get additional 
tools, over time, which we consider "add-ons" and not fundamentally 
needed as part of the core, then perhaps we can have a tools-bundle.

To do this effectively using the Maven "way" - we might want to have 
each tool (in one "project") produce one jar (maven way:  each project 
<=> one jar), at a particular version level.  These would be available 
in the maven jar repository, and maven tooling could be used to fetch 
them.   Maven "assemblies" could then be used to package multiples of 
these into bigger packages of things.  A basic idea here would be that 
the version of the assembly would be on a different "schedule" than the 
components.  So someone downloading an assembled "bundle" would get 
parts, each of which had their own version number.  This is similar to 
what you get with other big projects that include jars from other 
sources.  The parts which are stable and not changing would not have 
their version numbers incremented in the assembled bundle.
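
As a rough illustration of this versioning idea (not the project's actual
build setup), the relationship between a bundle and its parts could be
modelled as in the sketch below. The bundle name, component names, version
strings, and the next_bundle helper are all hypothetical and chosen only
for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Component:
    """One independently versioned part (one project <=> one jar)."""
    name: str
    version: str


@dataclass(frozen=True)
class Bundle:
    """An assembled package with its own version and a set of parts."""
    name: str
    version: str
    components: Tuple[Component, ...]


def next_bundle(previous: Bundle, new_version: str,
                updated: Dict[str, str]) -> Bundle:
    """Build the next bundle release (hypothetical helper).

    Components named in `updated` get the given new version; every other
    component keeps the version it had in the previous bundle.
    """
    parts = tuple(
        Component(c.name, updated.get(c.name, c.version))
        for c in previous.components
    )
    return Bundle(previous.name, new_version, parts)


# Hypothetical example data: only the tokenizer changed between bundles.
first = Bundle(
    "uima-addons", "2.2.0",
    (Component("cas-editor", "1.0.0"), Component("tokenizer", "1.0.0")),
)
second = next_bundle(first, "2.3.0", {"tokenizer": "1.1.0"})
for part in second.components:
    print(second.version, part.name, part.version)
```

The bundle version moves on its own schedule, while a part that did not
change keeps its old version number inside the new bundle.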

-Marshall




Re: UIMA Sandbox releases

Posted by Michael Baessler <mb...@michael-baessler.de>.
Marshall Schor wrote:
> Thilo Goetz wrote:
>   
>> Hi Marshall,
>>
>> as usual, my view is pretty much the exact opposite ;-)
>>
>> First of all, I don't see the sense in creating yet another
>> category.  To my mind, there's nothing wrong with having
>> mature components in the sandbox.  The only thing I would
>> consider is to move some sandbox components that are really
>> important to people into the core.
>>   
>>     
> I think people might feel that the sandbox isn't a place to get
> production-quality things, and I was hoping that some of these
> components were production-quality :-) 
>   
I think Thilo raised a good point here. We still have an "empty
framework" that does not provide any linguistic functionality out of the
box. So maybe we should think about moving sandbox components that are
ready to use and important for most UIMA users into the core.
We could then also provide some more out-of-the-box analytics by
combining the components.

For all the other Sandbox components that are ready to use but not
relevant for most UIMA users, we can consider doing a separate
release for each component. I guess the release cycles are longer for
those components, so we would not have so many Sandbox component releases.

Opinions?

-- Michael


Re: UIMA Sandbox releases

Posted by Marshall Schor <ms...@schor.com>.
Thilo Goetz wrote:
> Hi Marshall,
>
> as usual, my view is pretty much the exact opposite ;-)
>
> First of all, I don't see the sense in creating yet another
> category.  To my mind, there's nothing wrong with having
> mature components in the sandbox.  The only thing I would
> consider is to move some sandbox components that are really
> important to people into the core.
>   
I think people might feel that the sandbox isn't a place to get
production-quality things, and I was hoping that some of these
components were production-quality :-) 
> Also, at least while we're still in the incubator, I really
> don't think individual releases for sandbox components are
> manageable (unless I hear people volunteer to be release
> managers for these things).
>   
You have a point - I think this could also be an issue even outside the
incubator.
> Also from a user perspective, I prefer a single download
> with everything in it.  When I look at a new piece of software and
> go through the tutorial, I hate it when I have to get
> components A, B, and C to do the first chapter, and then
> for the next one, I need to get D and E, and so on.  A
> couple of megabytes of download is insignificant these
> days, and then you have everything on disc and can pick
> and choose what you need.
>   
I agree if it's small, it's not worth making a fuss.  If the components
start to have more of a life of their own (consider, for instance, the
CLI component of the Commons project - it's used in many things) -
people will want to clearly see for each component what the license /
notification issues are, so they can re-package them.  To me, this tilts
a bit toward wanting individual downloads. 

But I'm not against bundles for those that want them. 

And I agree - it's extra work to do all this fancy packaging - so I was
thinking idealistically, hoping that we could use computers to do all
the heavy lifting of supporting these packagings ;-)

-Marshall
> Just my 2 cents.  If people are willing to do the work and
> think that our users are better served by individual
> downloads, that's fine with me, too.  Personally, I'll
> defer to whatever the release manager for the sandbox,
> whoever that may be, has to say on this subject :-)
>
>   

> --Thilo


Re: UIMA Sandbox releases

Posted by Thilo Goetz <tw...@gmx.de>.
Hi Marshall,

as usual, my view is pretty much the exact opposite ;-)

First of all, I don't see the sense in creating yet another
category.  To my mind, there's nothing wrong with having
mature components in the sandbox.  The only thing I would
consider is to move some sandbox components that are really
important to people into the core.

Also, at least while we're still in the incubator, I really
don't think individual releases for sandbox components are
manageable (unless I hear people volunteer to be release
managers for these things).

Also from a user perspective, I prefer a single download
with everything in it.  When I look at a new piece of software and
go through the tutorial, I hate it when I have to get
components A, B, and C to do the first chapter, and then
for the next one, I need to get D and E, and so on.  A
couple of megabytes of download is insignificant these
days, and then you have everything on disc and can pick
and choose what you need.

Just my 2 cents.  If people are willing to do the work and
think that our users are better served by individual
downloads, that's fine with me, too.  Personally, I'll
defer to whatever the release manager for the sandbox,
whoever that may be, has to say on this subject :-)

--Thilo



Re: UIMA Sandbox releases

Posted by Marshall Schor <ms...@schor.com>.
How do other projects do this kind of thing?

I'm thinking of the Commons project (formerly Jakarta Commons), which has
lots of downloadable subparts; each is separate.

I don't have a strong preference, but I would lean toward the following
kinds of things:

1) having a new category of things:  Apache UIMA Components & Tools

2) moving things we think are ready for release from the sandbox to this
new category.

If things are really not ready for off-the-shelf kinds of users to use,
I guess I would favor keeping those things in the Sandbox.

On the question of "bundling" components versus making them available
separately: I think that, in our case, the components we have would be
better consumed as separately available things.

At some point in the future, it may make sense to offer the whole set as
a "package" - but I guess that seems less useful than having them
separately available.

It also seems to me that these components would be on independent
release cycles. 

-Marshall
