Posted to dev@jena.apache.org by Rob Vesse <rv...@dotnetrdf.org> on 2014/07/23 12:02:37 UTC

Module Structuring and Release Cycles

Hey All

Following up on the recent discussion points Andy raised around module
structuring: as the project expands its scope and modules, do we want to
think about reorganising our repository and how we do releases?

For discussion, let's consider splitting the current modules up into several
sub-projects like so:

* commons - jena-iri and Claude's proposed jena-commons module
* core - jena-core, jena-arq, jena-tdb, jena-security
* search - jena-text, jena-spatial
* fuseki - jena-fuseki
* jdbc - All JDBC modules
* sdb - jena-sdb
* client - Stephen's experimental jena-client module
* hadoop - The jena-hadoop-rdf modules
(This is just an illustration; what goes in which sub-project can be decided
later. It is intended to stimulate discussion rather than to be a concrete
proposal.)

So rather than having a flat trunk structure with each module in the root we
would have a structure where each sub-project has a directory in trunk with
its modules located under it.  In this structure jena-parent then lives in
the top level directory.
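
As a rough sketch of that layout (Python is used purely for illustration; the
grouping is the illustrative one listed above, the "trunk" root name and the
function name are assumptions, and the jdbc and hadoop sub-projects are left
out because their individual module names are not listed here):

```python
from pathlib import PurePosixPath

# Illustrative grouping from the list above; jdbc and hadoop are omitted
# because their individual module names are not spelled out in this message.
SUBPROJECTS = {
    "commons": ["jena-iri", "jena-commons"],
    "core": ["jena-core", "jena-arq", "jena-tdb", "jena-security"],
    "search": ["jena-text", "jena-spatial"],
    "fuseki": ["jena-fuseki"],
    "sdb": ["jena-sdb"],
    "client": ["jena-client"],
}


def proposed_layout(trunk="trunk"):
    """Return module paths for the nested layout, jena-parent at the top level."""
    root = PurePosixPath(trunk)
    paths = [str(root / "jena-parent")]
    for subproject, modules in SUBPROJECTS.items():
        for module in modules:
            # Each module sits one level down, under its sub-project directory.
            paths.append(str(root / subproject / module))
    return paths


if __name__ == "__main__":
    for path in proposed_layout():
        print(path)
```

Running it lists one path per module, for example trunk/core/jena-arq, which
makes it easy to compare against the current flat trunk.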

With such a structure we could then consider having different release
cadences for different sub-projects.  So when we do a new core release we
would not necessarily have to release everything else at the same time
(though in most cases I suspect we would want to).  However, if a sub-project
wants to make an interim release in the meantime there would be nothing to
stop this happening, e.g. to make a critical bug fix or to track a
sub-project specific dependency release.
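
To make the idea of separate cadences concrete, here is a minimal sketch in
Python. The version numbers and the function names are hypothetical and only
illustrate the bookkeeping; they are not actual Jena versions or an agreed
scheme.

```python
def bump(version, part="patch"):
    """Return a new (major, minor, patch) tuple with one part incremented."""
    major, minor, patch = version
    if part == "major":
        return (major + 1, 0, 0)
    if part == "minor":
        return (major, minor + 1, 0)
    if part == "patch":
        return (major, minor, patch + 1)
    raise ValueError("unknown version part: %r" % part)


def release(versions, subprojects=None, part="minor"):
    """Release the named sub-projects; None means release everything."""
    targets = set(versions) if subprojects is None else set(subprojects)
    unknown = targets - set(versions)
    if unknown:
        raise KeyError("unknown sub-projects: %s" % sorted(unknown))
    # Sub-projects that are not targeted keep their current version.
    return {
        name: bump(version, part) if name in targets else version
        for name, version in versions.items()
    }


if __name__ == "__main__":
    # Hypothetical starting versions, for illustration only.
    current = {"core": (2, 12, 0), "fuseki": (1, 1, 0), "search": (1, 1, 0)}
    print(release(current))                      # everything moves together
    print(release(current, ["fuseki"], "patch"))  # interim fuseki-only fix
```

The first call models releasing everything at once; the second models an
interim bug-fix release of a single sub-project while the others stay put.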

Obviously doing this kind of restructuring would be really painful with SVN
so might I suggest that any such change should happen in conjunction with a
move to Git?

The other alternative to such a trunk structure is to have each sub-project
live in its own Git repository which does appear to be something that Infra
supports - the list of repos at https://git-wip-us.apache.org/repos/asf
shows multiple repos for several projects (Accumulo, ActiveMQ, Ant to name
just three near the top of the list) - so we could certainly go down that
route if we wished?  In that scenario jena-parent would need to live in
its own repository as well (probably the main jena repo would contain just
jena-parent and pointers to the other repos).  However, this approach would
complicate releases somewhat, since you would likely need multiple release
votes and release artifacts: cutting a release might mean releasing
from multiple repos, and each would need reproducible sources.

Thoughts?

Rob



Re: Module Structuring and Release Cycles

Posted by Andy Seaborne <an...@apache.org>.
On 30/07/14 14:33, Sergio Fernández wrote:
> Hi,
>
> my two cents from outside the project's development:
>
> On 23/07/14 12:02, Rob Vesse wrote:
>> Obviously doing this kind of restructuring would be really painful
>> with SVN
>> so might I suggest that any such change should happen in conjunction
>> with a
>> move to Git?
>
> +1 for switching to git. For those who have no experience it could be
> extra work, but it is worth the effort, believe me. It'll bring so many
> new useful workflows that would benefit the project development, both
> internal and external.

I think we're heading for git (except for the website).

CVS -> SVN -> git -> ??

!!

>
>> The other alternative to such a trunk structure is to have each
>> sub-project
>> live in its own Git repository which does appear to be something that
>> Infra
>> supports - the list of repos at https://git-wip-us.apache.org/repos/asf
>> shows multiple repos for several projects (Accumulo, ActiveMQ, Ant to
>> name
>> just three near the top of the list) - so we could certainly go down that
>> route if we wished?  In that scenario then jena-parent would need to
>> live in
>> its own repository as well (probably the main jena repo would contain
>> just
>> jena-parent and pointers to the other repos).  However this approach
>> would
>> complicate releases somewhat since you likely need to have multiple
>> release
>> votes and release artifacts since cutting a release might mean releasing
>> from multiple repos and each would need reproducible sources.
>
> Even if Infra allows you to have different repos, I would not recommend
> using submodules in that way, because it would make the release process
> painful.

It does seem, at best, to hard-code assumptions about what and how to
release, whereas one repo is neutral.  As the repo is a long-term choice,
it's nice to have it independent of release requirements.

	Andy

>
> Cheers,
>


Re: Module Structuring and Release Cycles

Posted by Sergio Fernández <se...@salzburgresearch.at>.
Hi,

my two cents from outside the project's development:

On 23/07/14 12:02, Rob Vesse wrote:
> Obviously doing this kind of restructuring would be really painful with SVN
> so might I suggest that any such change should happen in conjunction with a
> move to Git?

+1 for switching to git. For those who have no experience it could be
extra work, but it is worth the effort, believe me. It'll bring so many
new useful workflows that would benefit the project development, both
internal and external.

> The other alternative to such a trunk structure is to have each sub-project
> live in its own Git repository which does appear to be something that Infra
> supports - the list of repos at https://git-wip-us.apache.org/repos/asf
> shows multiple repos for several projects (Accumulo, ActiveMQ, Ant to name
> just three near the top of the list) - so we could certainly go down that
> route if we wished?  In that scenario then jena-parent would need to live in
> its own repository as well (probably the main jena repo would contain just
> jena-parent and pointers to the other repos).  However this approach would
> complicate releases somewhat since you likely need to have multiple release
> votes and release artifacts since cutting a release might mean releasing
> from multiple repos and each would need reproducible sources.

Even if Infra allows you to have different repos, I would not recommend
using submodules in that way, because it would make the release process
painful.

Cheers,

-- 
Sergio Fernández
Senior Researcher
Knowledge and Media Technologies
Salzburg Research Forschungsgesellschaft mbH
Jakob-Haringer-Straße 5/3 | 5020 Salzburg, Austria
T: +43 662 2288 318 | M: +43 660 2747 925
sergio.fernandez@salzburgresearch.at
http://www.salzburgresearch.at

Re: Module Structuring and Release Cycles

Posted by Claude Warren <cl...@xenei.com>.
I am a fan of JRE as it ensures that everything works together at that point
in time.

That being said, perhaps we should define what is "Jena Core".  As opposed
to jena-core, "Jena Core" are the components that provide the basic
functionality that we want to promote for Jena.  At this time it is
probably everything.  However, if we proceed with something like
jena-commons as a collection of code that provides functionality that
might be useful for developers using the Jena Core but is not required, then
it could be outside the JRE envelope.

If we provide different configurations for fuseki (just thinking outside
the box here) then those might be outside the JRE envelope.

If we end up with a bunch of graph or model implementations (e.g. JDBC
based, Hadoop based, etc) then they might be outside of the JRE envelope.

I think we determine what goes inside the envelope by the following
checklist:

1) does it implement a W3C RDF-based recommendation (e.g. SPARQL)?
2) is it required by any component that provides #1?
3) is it required to provide a reasonable out-of-the-box working
implementation (e.g. TDB)?
4) (optional) is it directly tied to and significantly affected by versions
of the above (e.g. jena-security)?

If it ticks a box in the above then it is in.  If not then it is out.  If
security ends up outside the box, I would like a way to match security
versions with release versions and get the download pages updated, etc.
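
The checklist can be sketched as a small predicate. This is only an
illustration in Python: the class, field and function names are made up for
the example, and the two components shown use just the criteria given as
examples in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    implements_w3c_rec: bool = False     # criterion 1
    required_by_rec_impl: bool = False   # criterion 2
    needed_out_of_box: bool = False      # criterion 3
    tied_to_core_versions: bool = False  # criterion 4 (optional)


def in_envelope(component, use_optional=True):
    """True if the component ticks at least one box in the checklist."""
    ticks = [
        component.implements_w3c_rec,
        component.required_by_rec_impl,
        component.needed_out_of_box,
    ]
    if use_optional:
        # Criterion 4 is marked optional, so it can be switched off.
        ticks.append(component.tied_to_core_versions)
    return any(ticks)


if __name__ == "__main__":
    tdb = Component("jena-tdb", needed_out_of_box=True)
    security = Component("jena-security", tied_to_core_versions=True)
    for component in (tdb, security):
        print(component.name, in_envelope(component),
              in_envelope(component, use_optional=False))
```

With the optional criterion switched off, jena-security no longer ticks a box
in this sketch, which is exactly the case where version matching and download
page updates would need handling separately.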

Just my thoughts,
Claude




On Wed, Jul 30, 2014 at 10:46 AM, Andy Seaborne <an...@apache.org> wrote:

> In trying to reply to this, I keep coming back to the release structure
> and whether we want to JRE. (JRE = Just Release Everything)
>
> This affects the repo structure (not so much the modules).  It would be
> nice for people to check out just a part of the project.  It's possible in
> git but it's convoluted [1][2].
>
> If there are multiple git repos, I don't see how JRE would work except by
> doing each repo in turn, producing multiple source artifacts.  There isn't a
> single source artifact produced this way. (We could write our own build
> process that glues things together.)
>
> I think we want to have JRE without excess plumbing and maintenance of a
> build process.  The partial fetch of Jena is a nice-to-have feature but
> not at the cost of making other things harder.
>
> So it seems to me that we should have one (git) repo.
>
> Within the one repo, we split up the modules into sub-projects.  We
> document/link to ways to pull part of the repo.
>
> The only module that might benefit from being separate is jena-iri which
> is very stable.
>
> Does that make sense?
>
>         Andy
>
> [1] http://jasonkarns.com/blog/subdirectory-checkouts-with-git-sparse-checkout/
> [2] http://briancoyner.github.io/blog/2013/06/05/git-sparse-checkout/
>


-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren

Re: Module Structuring and Release Cycles

Posted by Andy Seaborne <an...@apache.org>.
In trying to reply to this, I keep coming back to the release structure 
and whether we want to JRE. (JRE = Just Release Everything)

This affects the repo structure (not so much the modules).  It would be 
nice for people to check out just a part of the project.  It's possible 
in git but it's convoluted [1][2].

If there are multiple git repos, I don't see how JRE would work except 
by doing each repo in turn, producing multiple source artifacts.  There 
isn't a single source artifact produced this way. (We could write our 
own build process that glues things together.)

I think we want to have JRE without excess plumbing and maintenance of a 
build process.  The partial fetch of Jena is a nice-to-have feature but 
not at the cost of making other things harder.

So it seems to me that we should have one (git) repo.

Within the one repo, we split up the modules into sub-projects.  We 
document/link to ways to pull part of the repo.
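
For the "pull part of the repo" option, a minimal sketch of what such
documentation might describe, assuming the nested sub-project layout Rob
illustrated and the core.sparseCheckout mechanism covered in [1] and [2].
It is written in Python and only builds the steps rather than running them;
the repository URL, the branch name, the jena-parent directory and the helper
name are all hypothetical, and nothing here is an agreed Jena procedure.

```python
def sparse_checkout_plan(repo_url, target_dir, subprojects, branch="master"):
    """Describe (without running) a sparse checkout of selected sub-projects.

    Returns the git commands to run first, the pattern file to write, and
    the command to run once the pattern file is in place.
    """
    if not subprojects:
        raise ValueError("at least one sub-project is required")
    # A leading slash anchors each pattern at the repository root.
    patterns = ["/jena-parent/"] + ["/%s/" % name for name in subprojects]
    return {
        "before": [
            ["git", "clone", "--no-checkout", repo_url, target_dir],
            ["git", "-C", target_dir, "config", "core.sparseCheckout", "true"],
        ],
        "pattern_file": target_dir + "/.git/info/sparse-checkout",
        "patterns": "\n".join(patterns) + "\n",
        "after": [["git", "-C", target_dir, "checkout", branch]],
    }


if __name__ == "__main__":
    # Placeholder URL; substitute the real repository location.
    plan = sparse_checkout_plan(
        "https://example.org/jena.git", "jena", ["core", "fuseki"]
    )
    for command in plan["before"]:
        print(" ".join(command))
    print("write to %s:" % plan["pattern_file"])
    print(plan["patterns"], end="")
    for command in plan["after"]:
        print(" ".join(command))
```

Keeping jena-parent in the pattern list reflects the assumption that the
sub-project builds still need the parent POM from the top level.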

The only module that might benefit from being separate is jena-iri which 
is very stable.

Does that make sense?

	Andy

[1] http://jasonkarns.com/blog/subdirectory-checkouts-with-git-sparse-checkout/
[2] http://briancoyner.github.io/blog/2013/06/05/git-sparse-checkout/


On 23/07/14 13:48, Andy Seaborne wrote:
> Rob,
>
> Excellent
>
> I hadn't realised that a TLP could have several git repos and for some
> reason I was thinking it was discouraged.  (No idea where that belief came
> from as it's clearly wrong.)
>
> It solves the big problem I was noodling on - how to be able to work on
> some of Jena without having to have everything cloned.  It would (might)
> be tolerable for us committers but opaque for people coming to Jena
> initially.
>
>      Andy


Re: Module Structuring and Release Cycles

Posted by Andy Seaborne <an...@apache.org>.
Rob,

Excellent

I hadn't realised that a TLP could have several git repos and for some 
reason I was thinking it was discouraged.  (No idea where that belief came 
from as it's clearly wrong.)

It solves the big problem I was noodling on - how to be able to work on 
some of Jena without having to have everything cloned.  It would (might) 
be tolerable for us committers but opaque for people coming to Jena 
initially.

	Andy
