Posted to dev@clerezza.apache.org by Reto Gmür <re...@apache.org> on 2014/09/03 18:29:32 UTC

Many small git-repos

Dear Infra-Team

On the Clerezza dev-list we have been discussing our release process, and we
would now like to discuss to what extent the infrastructure we use can be
adapted to the envisaged process.

Clerezza currently consists of around 200 individual Maven artifacts
(modules). They are all versioned individually to avoid releasing modules
that have had effectively no change since the last release.

Now it turns out that releasing a subset of the modules or even just a
single module is quite troublesome and the tools (i.e. the maven
release-plugin) do not work as expected.

The conclusion of the discussion on our mailing list was that the best option
for our project would be a separate repository for every module. We
think that this would also be good for the growth of the community, as
people can focus on the modules they know best.

So the question is whether it would be possible to have infrastructure such
as a gitlab/github/bitbucket instance that allows us to create many small git
repos for our projects.


Cheers,
Reto

Re: Many small git-repos

Posted by Bertrand Delacretaz <bd...@apache.org>.
Salut Reto,

On Wed, Sep 24, 2014 at 8:55 PM, Reto Gmür <re...@apache.org> wrote:
> ...Still, a self-service multi-git solution would be nice to have, but for me
> it has definitely lost some of its urgency...

I've been using http://gitslave.sourceforge.net/ to do git operations
on multiple related repositories and it's useful - and non-intrusive
so you don't force people to use it.

-Bertrand

Re: Many small git-repos

Posted by Reto Gmür <re...@apache.org>.
With the maven-release-plugin version that has just been released, releasing
individual modules in a large git repository works nicely. Thanks, Benson,
for pointing me to this!

Still, a self-service multi-git solution would be nice to have, but for me
it has definitely lost some of its urgency.

Cheers,
Reto

On Mon, Sep 22, 2014 at 8:36 AM, Paul Davis <pa...@gmail.com>
wrote:

> Forgot to reply to this earlier. From the CouchDB side of things, even
> though we have many repositories, the combination of all those
> repositories will constitute a single source release. Our repository
> breakdown is mostly driven by how Erlang-the-language mandates a flat
> namespace coupled with Erlang-the-ecosystem which has developed
> tooling that expects individual repositories.
>
> Paul

Re: Many small git-repos

Posted by Paul Davis <pa...@gmail.com>.
Forgot to reply to this earlier. From the CouchDB side of things, even
though we have many repositories, the combination of all those
repositories will constitute a single source release. Our repository
breakdown is mostly driven by how Erlang-the-language mandates a flat
namespace coupled with Erlang-the-ecosystem which has developed
tooling that expects individual repositories.

Paul

On Sun, Sep 21, 2014 at 10:33 PM, David Nalley <da...@gnsa.us> wrote:
> I don't disagree that having a self-service system would be
> convenient; it's come up in conversation several times in the past few
> months. However, this isn't currently a priority for infra. For the
> moment, projects have to keep their source code in an infra-maintained
> repository.
>
> --David

Re: Many small git-repos

Posted by David Nalley <da...@gnsa.us>.
On Sun, Sep 21, 2014 at 12:26 PM, Reto Gmür <re...@apache.org> wrote:
> Even if Infra is *very* responsive, for 200 repos a self-service
> system seems more convenient. Allura looks good; gitlab might be a bit
> easier as it is focused on git.
>

I don't disagree that having a self-service system would be
convenient; it's come up in conversation several times in the past few
months. However, this isn't currently a priority for infra. For the
moment, projects have to keep their source code in an infra-maintained
repository.

--David

Re: Many small git-repos

Posted by Reto Gmür <re...@apache.org>.
Hi Brian, David, all,

Thanks for your feedback - and sorry for my late reply.

I did some more playing around with the mvn:release plugins and
unfortunately found no way to use them for individually versioned projects
sharing a single git repo. The maven git flow plugin also assumes one
version per repository.

Regarding the management of the repos: We do not plan to mandate the use of
any tool. One of the advantages of one repo per module is that people only
need to check out and get into the subprojects they are actually interested
in. Reactor projects will use git submodules to integrate the modules, so
one can get all the modules by recursively updating the submodules of the
root reactor.
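
To make the submodule idea concrete, here is a minimal sketch in Python of
the git commands a root reactor could be assembled with. The base URL and
the module names are hypothetical placeholders, not actual Clerezza
repositories, and the script only builds and prints the commands rather
than running git.

```python
import shlex

# Hypothetical values for illustration only; not real Clerezza repositories.
BASE_URL = "https://git.example.org/clerezza"
MODULES = ["rdf.core", "rdf.utils", "platform.content"]


def submodule_commands(base_url, modules):
    """Return the git commands that register each module repo as a submodule."""
    return [
        ["git", "submodule", "add", f"{base_url}/{name}.git", name]
        for name in modules
    ]


# To be run inside the root-reactor checkout: one submodule per module.
commands = submodule_commands(BASE_URL, MODULES)

# In a fresh clone of the reactor, this fetches all modules recursively.
update_command = ["git", "submodule", "update", "--init", "--recursive"]

for command in commands + [update_command]:
    print(shlex.join(command))
```

Someone interested in a single module would skip the reactor and clone only
that module's repository.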

In case we want to draft a change that spans multiple modules, we
would have to branch those projects individually and either branch the
reactor or provide a temporary reactor that contains only the branched
projects. The temporary reactor approach is possible because the modules
depend on the released versions of other modules by default, so a module
would depend on a SNAPSHOT version only where the dependency is also branched.

As for the release process: A goal of the new approach is to make releasing
much easier and thus much more frequent. So we will often be releasing just
a single module. When releasing multiple modules, we will generate several
artifacts and call for a common vote. A single vote is necessary if the
modules are interdependent, as otherwise the votes would have to be held
sequentially following the dependency chain.
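
The point about sequential votes can be illustrated with a small Python
sketch that groups interdependent modules into the rounds that separate
votes would need. The module names and their dependencies are made up for
the example, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def release_stages(depends_on):
    """Group modules into stages; each stage depends only on earlier stages."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        # Modules whose dependencies have all been released already.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical modules mapped to the modules they depend on.
depends_on = {
    "rdf.core": set(),
    "rdf.utils": {"rdf.core"},
    "rdf.ontologies": {"rdf.core"},
    "platform.content": {"rdf.utils"},
}

stages = release_stages(depends_on)
for number, modules in enumerate(stages, start=1):
    print(f"vote round {number}: {', '.join(modules)}")
```

With separate votes, this example would need three rounds held one after
another; a common vote covers all four modules at once.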

Even if Infra is *very* responsive, for 200 repos a self-service
system seems more convenient. Allura looks good; gitlab might be a
bit easier as it is focused on git.

Cheers,
Reto


On Thu, Sep 4, 2014 at 6:50 PM, Brian LeRoux <b...@brian.io> wrote:

> Cordova tends to prefer the many small repos for a host of reasons. Bugs
> are easier to track. Code becomes more modular and decouples naturally. It
> is easier to reason about (and thus contribute to) isolated pieces (esp
> when those pieces may be made of different languages).
>
> Drawbacks include the Apache vote/release thing, which adds time and
> complexity. I guess final composition of the repos requires integration
> testing, but I do not really view testing as a drawback! Ideally we could
> upgrade the release process at Apache to facilitate workflows like this.
> Har har.
>
> To quickly comment on earlier thoughts:
>
> - Branching isn't affected (it's just faster and easier with Git).
> - Tagging needs some convention or coordination, but it's not really a
> problem per se.
> - We've found infra to be *very* responsive, so I wouldn't worry about the
> creation of said repos.
>
> Anyhow, if you have more queries about the Cordova approach, I'm happy to
> help.

Re: Many small git-repos

Posted by Brian LeRoux <b...@brian.io>.
Cordova tends to prefer the many small repos for a host of reasons. Bugs
are easier to track. Code becomes more modular and decouples naturally. It
is easier to reason about (and thus contribute to) isolated pieces (esp
when those pieces may be made of different languages).

Drawbacks include the Apache vote/release thing, which adds time and
complexity. I guess final composition of the repos requires integration
testing, but I do not really view testing as a drawback! Ideally we could
upgrade the release process at Apache to facilitate workflows like this.
Har har.

To quickly comment on earlier thoughts:

- Branching isn't affected (it's just faster and easier with Git).
- Tagging needs some convention or coordination, but it's not really a
problem per se.
- We've found infra to be *very* responsive, so I wouldn't worry about the
creation of said repos.

Anyhow, if you have more queries about the Cordova approach, I'm happy to
help.


On Thu, Sep 4, 2014 at 6:22 AM, David Nalley <da...@gnsa.us> wrote:

> I'd really encourage you to think through this for a while.
> 200 separate artifacts - which might mean 200 separate release
> [VOTE]s, but certainly will increase your release overhead.
>
> Today we have two projects that have scores of git repos (CouchDB and
> Cordova), but they are still an order of magnitude smaller than the
> 200 you are talking about. Do you have a solution ready to manage all
> of those repos? E.g. what's your plan for branching or tagging? Doing
> that individually or using a tool like git-repo or mr?
> I'd heavily encourage you to talk to them about their challenges and
> strategies.
>
> In short, today we have nothing that will support the project creating
> its own repositories.
> While there has been talk of github writable repos, even if we were to
> take that bold step, your project wouldn't be able to create repos
> on-demand.
> That isn't to say that we wouldn't create a gitreq self-service module
> for whimsy, or something similar.
>
> Dave mentions Allura - and that is a platform that would allow such
> things to happen; but Allura's self-hosted instance is not backed up
> or maintained by Infra, and even the Allura codebase lives on
> git-wip-us.a.o.
>
> --David
>

Re: Many small git-repos

Posted by David Nalley <da...@gnsa.us>.
On Wed, Sep 3, 2014 at 12:29 PM, Reto Gmür <re...@apache.org> wrote:
> Dear Infra-Team
>
> On the Clerezza dev-list we have been discussing our release process, and we
> would now like to discuss to what extent the infrastructure we use can be
> adapted to the envisaged process.
>
> Clerezza currently consists of around 200 individual Maven artifacts
> (modules). They are all versioned individually to avoid releasing modules
> that have had effectively no change since the last release.
>

I'd really encourage you to think through this for a while.
200 separate artifacts - which might mean 200 separate release
[VOTE]s, but certainly will increase your release overhead.


> Now it turns out that releasing a subset of the modules or even just a
> single module is quite troublesome and the tools (i.e. the maven
> release-plugin) do not work as expected.
>
> The conclusion of the discussion on our mailing list was that the best option
> for our project would be a separate repository for every module. We
> think that this would also be good for the growth of the community, as
> people can focus on the modules they know best.
>

Today we have two projects that have scores of git repos (CouchDB and
Cordova), but they are still an order of magnitude smaller than the
200 you are talking about. Do you have a solution ready to manage all
of those repos? E.g. what's your plan for branching or tagging? Doing
that individually or using a tool like git-repo or mr?
I'd heavily encourage you to talk to them about their challenges and strategies.

> So the question is whether it would be possible to have infrastructure such
> as a gitlab/github/bitbucket instance that allows us to create many small git
> repos for our projects.
>

In short, today we have nothing that will support the project creating
its own repositories.
While there has been talk of github writable repos, even if we were to
take that bold step, your project wouldn't be able to create repos
on-demand.
That isn't to say that we wouldn't create a gitreq self-service module
for whimsy, or something similar.

Dave mentions Allura - and that is a platform that would allow such
things to happen; but Allura's self-hosted instance is not backed up
or maintained by Infra, and even the Allura codebase lives on
git-wip-us.a.o.

--David