Posted to dev@taverna.apache.org by Andy Seaborne <an...@apache.org> on 2014/11/10 11:58:04 UTC

Codebase ingestion

One of the next things to do is to ingest the existing codebase.
That means git repos - I recall you wanted several.

I think I saw a discussion that the maven release plugin recently had 
improvements added to handle releases covering multiple repos and also 
partial repo releases (one release, not everything in the repo).  That 
might make things easier and different. [I can't find where I saw that :-(]

Does someone want to come up with the plan for the layout?

(svn for the website - the machinery for publishing websites works with svn)

Normally, for a new podling, the details of how to release don't arise 
until the first release approaches but if it is going to impact the 
repository layout, then we need to have a general picture first.

An Apache release is source code - the fact there are maven jars, 
combined zips/tgz  etc is subsidiary.

Example: for Jena that means the files in:

http://www.apache.org/dist/jena/source/

Jena has two current releases at the moment - normally, we have one and 
there would only be files like "jena-VERSION-source-release".

The one true item for a release, what the vote is formally about, is 
that source-release file, and it is something anyone can download and use 
to build Jena. It's all made by the maven release plugin, having the 
Apache POM on the parent chain, and using the supplied profile 
-Papache-release.

Full details for Jena:

https://cwiki.apache.org/confluence/display/JENA/Release+Process

Of course, most people use one of our binaries (maven or the 
all-libraries-in-one zip old style in binaries/) but the authentic 
release is the source-release.

Every project is different and the only required part is to have 
source-release and the VOTE.  In fact, it's not too hard to find other 
projects that do it in ways that various people would argue with, 
typically because they are long-standing projects that had their way of 
doing things before general conventions arose.

It's educational to look at what other projects do and blend it all 
together with your existing process.  (mentors - where are good examples 
that you know of?).

	Andy

Re: Codebase ingestion

Posted by Andy Seaborne <an...@apache.org>.
The project probably ought to remain on monthly incubator reports until 
the paperwork is delivered.  It's not a big deal.

	Season's greetings
	Andy

On 24/12/14 13:52, Marlon Pierce wrote:
> Great, I have been worried that the delays were causing your transition
> to Apache to stagnate.  Hopefully it will be cleared up in January.
>
> Marlon
>
> On 12/24/14, 12:55 AM, Stian Soiland-Reyes wrote:
>> Thanks for following this up.
>>
>>
>> We had to hand over the paper copies of the existing CLAs for external
>> contributors, so Shoaib has chased up who had the original papers and
>> asked that person to forward it to the legal department dealing with
>> signing the IP document.
>>
>> In principle I believe everything is OK to sign (it's been approved by
>> the head of school), they just waited for those papers before they
>> could sign off.
>>
>> Perhaps Shoaib could send an update to this list?
>>
>>
>> The University has now closed for Christmas, so further progress won't
>> be until second week of January 2015.
>>
>>
>>
>> On 23 December 2014 at 16:51, Marlon Pierce <ma...@iu.edu> wrote:
>>> Hi folks--
>>>
>>> How is the code donation process going?  Are you still having problems
>>> clearing legal hurdles at your university?
>>>
>>> Marlon


Re: Codebase ingestion

Posted by Marlon Pierce <ma...@iu.edu>.
Great, I have been worried that the delays were causing your transition 
to Apache to stagnate.  Hopefully it will be cleared up in January.

Marlon

On 12/24/14, 12:55 AM, Stian Soiland-Reyes wrote:
> Thanks for following this up.
>
>
> We had to hand over the paper copies of the existing CLAs for external
> contributors, so Shoaib has chased up who had the original papers and
> asked that person to forward it to the legal department dealing with
> signing the IP document.
>
> In principle I believe everything is OK to sign (it's been approved by
> the head of school), they just waited for those papers before they
> could sign off.
>
> Perhaps Shoaib could send an update to this list?
>
>
> The University has now closed for Christmas, so further progress won't
> be until second week of January 2015.
>
>
>
> On 23 December 2014 at 16:51, Marlon Pierce <ma...@iu.edu> wrote:
>> Hi folks--
>>
>> How is the code donation process going?  Are you still having problems
>> clearing legal hurdles at your university?
>>
>> Marlon


Re: Codebase ingestion

Posted by Stian Soiland-Reyes <so...@cs.manchester.ac.uk>.
Thanks for following this up.


We had to hand over the paper copies of the existing CLAs for external
contributors, so Shoaib has chased up who had the original papers and
asked that person to forward it to the legal department dealing with
signing the IP document.

In principle I believe everything is OK to sign (it's been approved by
the head of school), they just waited for those papers before they
could sign off.

Perhaps Shoaib could send an update to this list?


The University has now closed for Christmas, so there won't be further
progress until the second week of January 2015.



On 23 December 2014 at 16:51, Marlon Pierce <ma...@iu.edu> wrote:
> Hi folks--
>
> How is the code donation process going?  Are you still having problems
> clearing legal hurdles at your university?
>
> Marlon



-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718

Re: Codebase ingestion

Posted by Marlon Pierce <ma...@iu.edu>.
Hi folks--

How is the code donation process going?  Are you still having problems 
clearing legal hurdles at your university?

Marlon

On 11/10/14, 6:37 AM, Stian Soiland-Reyes wrote:
> Thanks for keeping the ball rolling :)
>
>
> I've started sketching out a potential repository tree layout in
>
> https://github.com/taverna-incubator
>
>
> NOTE: This is not yet ready for ingestion - it does not currently
> contain the git history and is just a draft  on how to structure it
> using raw force. Once it's settled I suggest we recreate the structure
> proper using git merge, git mv and copying across the modified POMs -
> done with a shell script so we know it's not inconsistent.
>
>
>
> I have modified
>
> https://github.com/taverna-incubator/taverna-maven-parent/blob/master/pom.xml
>
> to inherit the general Apache parent.
>
>
> And then modified all the POM files of
>
>   https://github.com/taverna-incubator/taverna-language
>
>
> to e.g.:
>
> <parent>
>      <groupId>org.apache.taverna.language</groupId>
>      <artifactId>taverna-language</artifactId>
>      <version>0.15.0-incubating-SNAPSHOT</version>
> </parent>
> <artifactId>taverna-scufl2-api</artifactId>
>
>
> You might notice that I have merged several repositories to one. E.g.
>
> https://github.com/taverna-incubator/taverna-engine-impl
>
> contains code that currently lives across:
>
> https://github.com/taverna/taverna-engine-core
> https://github.com/taverna/taverna-engine-credential-manager
> https://github.com/taverna/taverna3-platform
> https://github.com/taverna/taverna-dataflow-activity
>
>
>
> The intention is for these repositories to build totally independently
> of any net.sf.taverna.* dependencies and myGrid repositories.  There are
> some issues in that it depends on older libraries not in Central -
> many of which we should get rid of (Sesame 2.3.2) - but that should
> mainly be sorted post code migration (but pre release).
>
>
> I have not yet done such pom modifications in any of the other
> repositories - but generally have renamed the submodules to the naming
> "taverna-*" or "taverna-*".   I've not had a look at further merging
> those packages and submodules - as that can be done post migration.
>
>
>
> So that is how I intend to test out the suggested layout -- we will ensure that:
>
> a) we are not forgetting to migrate something that is needed deep
> inside (it would depend on something with groupId net.sf.taverna or
> fetched from mygrid.org.uk/repository )
> b) there are no unintended spaghetti dependencies that make it
> difficult to do a staged series of normal mvn release:prepare type of
> releases.
>
>
> Then we can try to do a fake release to a separate/temporary Maven
> repository - this will ensure the correct release order and show if
> further merges should be done.
>
>
> I am very worried about the "partial release" mechanism as, with multiple
> people having the capability to do a release (which I would hope we
> manage), we can easily end up with a "Second release of version 1.1, now
> with some unintended changes" - especially as I believe Apache's Maven
> server is not a very strict one that would reject such an attempt.
>
>
> In this suggested layout each repository will basically have its own
> versioning/tags - and as we're starting with fresh groupIds we can
> make them uniform within each repository - e.g. taverna-databundle
> will now get the same version as taverna-scufl2-api as they both live
> in the taverna-language repository. There's nothing saying the
> versions can't diverge later individually (e.g. databundle goes to
> 0.12.3 while scufl2-api to 0.14.0) - but we've seen in the past
> developers get confused by this, resorting to a massive <properties>
> listing in taverna-maven-parent in order to keep track of all of those
> (which then leads to constant releases of taverna-maven-parent, which
> again leads everything to appear to need to change..).
>
> Keeping track of just these suggested repositories should be enough -
> and probably even that a challenge - but luckily in OSGi it only needs
> to be 'actually latest' at the product bundling stage.
>
>
> That's why I have the split taverna-engine-api and taverna-engine-impl
> for instance.  One could argue that the API and IMPL should be
> versioned together, as a new API necessitates a new IMPL - but given
> the size/complexity of the engine-impl this means that we would have
> to only version the impl in patch numbers as there is nothing new in
> the API (but what if the impl depends on something new?)  -- or we
> would release a 'fake new API version' even though nothing changed
> (Basically leading to the Taverna 2 situation where you get the
> impression that you need to update your usage of the API).
>
>
> I would assume a proposed Apache release would still depend on
> multiple of these git repositories. Even a release of the "Taverna
> Language API" or the "Taverna Platform Product" will rely on multiple
> of the other repositories. But does that mean we have to release a
> single src-zip per product (with massive overlaps), or can we have, say,
> taverna-language-src.zip and taverna-engine-api-src.zip and just vote
> on a couple of them at the same time?


Re: Codebase ingestion

Posted by Stian Soiland-Reyes <so...@cs.manchester.ac.uk>.
Thanks for keeping the ball rolling :)


I've started sketching out a potential repository tree layout in

https://github.com/taverna-incubator


NOTE: This is not yet ready for ingestion - it does not currently
contain the git history and is just a draft  on how to structure it
using raw force. Once it's settled I suggest we recreate the structure
proper using git merge, git mv and copying across the modified POMs -
done with a shell script so we know it's not inconsistent.



I have modified

https://github.com/taverna-incubator/taverna-maven-parent/blob/master/pom.xml

to inherit the general Apache parent.
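
As a rough illustration (not copied from the actual POM), inheriting the
Apache parent in taverna-maven-parent would look something like the
fragment below. The org.apache:apache coordinates are those of the ASF
parent POM; the version number is only a placeholder assumption and
should be checked against the current release.

```xml
<!-- Sketch of the <parent> section of taverna-maven-parent/pom.xml -->
<parent>
    <groupId>org.apache</groupId>
    <artifactId>apache</artifactId>
    <!-- assumed version, for illustration only -->
    <version>16</version>
</parent>
```

Having that POM on the parent chain is what makes the apache-release
profile mentioned in Andy's mail available to every module below it.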


And then modified all the POM files of

 https://github.com/taverna-incubator/taverna-language


to e.g.:

<parent>
    <groupId>org.apache.taverna.language</groupId>
    <artifactId>taverna-language</artifactId>
    <version>0.15.0-incubating-SNAPSHOT</version>
</parent>
<artifactId>taverna-scufl2-api</artifactId>


You might notice that I have merged several repositories to one. E.g.

https://github.com/taverna-incubator/taverna-engine-impl

contains code that currently lives across:

https://github.com/taverna/taverna-engine-core
https://github.com/taverna/taverna-engine-credential-manager
https://github.com/taverna/taverna3-platform
https://github.com/taverna/taverna-dataflow-activity



The intention is for these repositories to build totally independently
of any net.sf.taverna.* dependencies and myGrid repositories.  There are
some issues in that it depends on older libraries not in Central -
many of which we should get rid of (Sesame 2.3.2) - but that should
mainly be sorted post code migration (but pre release).


I have not yet done such pom modifications in any of the other
repositories - but generally have renamed the submodules to the naming
"taverna-*" or "taverna-*".   I've not had a look at further merging
those packages and submodules - as that can be done post migration.



So that is how I intend to test out the suggested layout -- we will ensure that:

a) we are not forgetting to migrate something that is needed deep
inside (it would depend on something with groupId net.sf.taverna or
fetched from mygrid.org.uk/repository )
b) there are no unintended spaghetti dependencies that make it
difficult to do a staged series of normal mvn release:prepare type of
releases.
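
One possible way to have the build itself catch point a) is a
maven-enforcer-plugin bannedDependencies rule. This is only a sketch: the
execution id is made up, the plugin version is assumed to be managed by
a parent POM, and the wildcard pattern is an assumption that should be
checked against the enforcer version in use.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <executions>
        <execution>
            <!-- hypothetical execution id -->
            <id>ban-legacy-taverna</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <bannedDependencies>
                        <excludes>
                            <!-- assumed pattern: any groupId starting with net.sf.taverna -->
                            <exclude>net.sf.taverna*</exclude>
                        </excludes>
                    </bannedDependencies>
                </rules>
            </configuration>
        </execution>
    </executions>
</plugin>
```

With something like this in taverna-maven-parent, a module that still
depends on a net.sf.taverna artifact should fail the build rather than
quietly resolve it from the myGrid repository.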


Then we can try to do a fake release to a separate/temporary Maven
repository - this will ensure the correct release order and show if
further merges should be done.
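
For the fake release, one option (an assumption, not something already
set up) is a throwaway profile that points deployment at a local
directory. The profile id, repository id and path below are all made up.

```xml
<profile>
    <!-- hypothetical profile, activated only for the trial run -->
    <id>fake-release</id>
    <distributionManagement>
        <repository>
            <id>taverna-fake-staging</id>
            <url>file:///tmp/taverna-fake-staging</url>
        </repository>
    </distributionManagement>
</profile>
```

Deploying each repository there in turn, and deleting the directory
afterwards, would keep the experiment away from the real Apache
repositories.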


I am very worried about the "partial release" mechanism as, with multiple
people having the capability to do a release (which I would hope we
manage), we can easily end up with a "Second release of version 1.1, now
with some unintended changes" - especially as I believe Apache's Maven
server is not a very strict one that would reject such an attempt.


In this suggested layout each repository will basically have its own
versioning/tags - and as we're starting with fresh groupIds we can
make them uniform within each repository - e.g. taverna-databundle
will now get the same version as taverna-scufl2-api as they both live
in the taverna-language repository. There's nothing saying the
versions can't diverge later individually (e.g. databundle goes to
0.12.3 while scufl2-api to 0.14.0) - but we've seen in the past
developers get confused by this, resorting to a massive <properties>
listing in taverna-maven-parent in order to keep track of all of those
(which then leads to constant releases of taverna-maven-parent, which
again leads everything to appear to need to change..).

Keeping track of just these suggested repositories should be enough -
and even that will probably be a challenge - but luckily in OSGi it only
needs to be 'actually latest' at the product bundling stage.


That's why I have the taverna-engine-api / taverna-engine-impl split,
for instance.  One could argue that the API and IMPL should be
versioned together, as a new API necessitates a new IMPL - but given
the size/complexity of the engine-impl this means that we would either
have to version the impl only in patch numbers when there is nothing new
in the API (but what if the impl depends on something new?)  -- or
release a 'fake new API version' even though nothing changed
(basically leading to the Taverna 2 situation where you get the
impression that you need to update your usage of the API).


I would assume a proposed Apache release would still depend on
several of these git repositories. Even a release of the "Taverna
Language API" or the "Taverna Platform Product" will rely on several
of the other repositories. But does that mean we have to release a
single src-zip per product (with massive overlaps), or can we have, say,
taverna-language-src.zip and taverna-engine-api-src.zip and just vote
on a couple of them at the same time?



On 10 November 2014 10:58, Andy Seaborne <an...@apache.org> wrote:
> One of the next things to do is to ingest the existing codebase.
> That means git repos - I recall you wanted several.
>
> I think I saw a discussion that the maven release plugin recently had
> improvements added to handle releases covering multiple repos and also
> partial repo releases (one release, not everything in the repo).  That might
> make things easier and different. [I can't find where I saw that :-(]
>
> Does someone want to come up with the plan for the layout?
>
> (svn for the website - the machinery for publishing websites works with svn)
>
> Normally, for a new podling, the details of how to release don't arise until
> the first release approaches but if it is going to impact the repository
> layout, then we need to have a general picture first.
>
> An Apache release is source code - the fact there are maven jars, combined
> zips/tgz  etc is subsidiary.
>
> Example: for Jena that means the files in:
>
> http://www.apache.org/dist/jena/source/
>
> Jena has two current releases at the moment - normally, we have one and and
> there would only be files like "jena-VERSION-source-release".
>
> The one true item for a release, what the vote is formally about, is that
> source-release file and it is something any can download and build Jena.
> It's all made by the maven release plugin, having the Apache POM on the
> parent chain, and using the supplied profile -Papache-release.
>
> Full details for Jena:
>
> https://cwiki.apache.org/confluence/display/JENA/Release+Process
>
> Of course, most people use one of our binaries (maven or the
> all-libraries-in-one zip old style in binaries/) but the authentic release
> is the source-release.
>
> Every project is different and the only required part is to have
> source-release and the VOTE.  In fact, it's not too hard to find other
> projects that do it in ways that various people would argue with, typically,
> because they are long standing projects and have their way of doing things
> before general conventions arose.
>
> It's educational to look at other what projects do and blend it all together
> with your existing process.  (mentors - where are good examples that you
> know of?).
>
>         Andy



-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718