Posted to users@maven.apache.org by Viktor Sadovnikov <vi...@jv-ration.com> on 2015/02/13 14:09:36 UTC

Cleaning source code repositories

Good day,

I wonder if this community can provide some hints on handling the following.

On my last few projects I was asked to set up (or clean up) automated builds,
so that they produce (at least) deployable software packages in minimal time
after code changes. Starting from the final desired result, I was able to
trace down every module needed to build the "master CD". This was especially
easy for Maven-based projects. However, discovering these modules in the
source repositories always highlighted:

   - lack of knowledge about whether a certain module in a repository is
   needed, or even used, in multiple products;
   - duplication of modules with similar purposes - sometimes a conscious
   decision to copy in order to avoid breaking backward compatibility with
   unknown dependents;
   - existence of build jobs for obsolete modules;
   - absence of builds for stable modules which have not changed in the
   last couple of years;
   - and the like.

Assuming all projects in the repositories are Maven-ized, how would you
approach determining which ones are required for the final deliveries, and
which might break if another module changes (a sort of reverse dependency
management)?
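One way to frame the reverse-dependency question is as graph inversion: harvest each module's dependencies from its POM, invert the edges, and walk the result. A minimal sketch (the module names and edges below are hypothetical; in practice they would be parsed from each project's pom.xml):

```python
# Invert a module dependency graph to answer two questions:
#   "what might break if module X changes?"  (reverse lookup)
#   "what is needed to build delivery Y?"    (forward lookup)
from collections import defaultdict, deque

# module -> modules it depends on (forward edges); hypothetical data
deps = {
    "master-cd":   ["app-core", "reporting"],
    "app-core":    ["commons"],
    "reporting":   ["commons", "pdf-export"],
    "pdf-export":  [],
    "commons":     [],
    "orphan-tool": ["commons"],   # never reachable from master-cd
}

# Invert the edges: module -> modules that depend on it.
rdeps = defaultdict(set)
for mod, uses in deps.items():
    for d in uses:
        rdeps[d].add(mod)

def affected_by(module):
    """All modules that transitively depend on `module`."""
    seen, queue = set(), deque([module])
    while queue:
        for dependent in rdeps[queue.popleft()]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

def needed_for(delivery):
    """All modules transitively required to build `delivery`."""
    seen, queue = set(), deque([delivery])
    while queue:
        for dep in deps[queue.popleft()]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen
```

Modules that appear in no `needed_for` set of any delivery are candidates for the "obsolete" pile; `affected_by` gives the reverse-dependency answer.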

Do you actually consider this situation a problem, or is it just the
perfectionist in me talking? ;-)

Thank you for your attention,
Viktor

Re: Cleaning source code repositories

Posted by Viktor Sadovnikov <vi...@jv-ration.com>.
Hi Curtis,

Yes, I believe we are concerned about the same challenges, though with
slightly different approaches.

I'm trying to come up with a recipe, or recipes, for:

   - determining the impact of introducing a backward-incompatible change;
   - determining which delivered (released) versions of the software are
   affected by a discovered defect.

Your point, as I understand it, is that even a MINOR or PATCH change can
break a dependent. The theoretical response would be: ensure your modules
(projects) are not tightly coupled, and program against the interfaces that
compose the public API of the dependency. Yes, these are valid
recommendations, but they are extremely difficult to follow completely.

Instead of a "melting pot" I use upgrade builds. Each project (not each
module of a multi-module project) depends on released versions of the
others, and every project has a separate upgrade build, scheduled to run at
least once a day. This build:

   - uses the Maven versions plugin to upgrade the project's dependencies
   and parent (with filters to exclude external dependencies);
   - runs a regular "clean install" with the modified POM;
   - if the previous step succeeds, commits the changed POM to the
   repository.
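A sketch of such an upgrade build's shell steps, assuming the Versions Maven Plugin (`versions:update-parent`, `versions:use-latest-releases`); the exclusion patterns and the SVN commit command are illustrative:

```shell
#!/bin/sh
set -e  # abort on the first failing step

# 1. Upgrade the parent and internal dependencies to their latest
#    released versions; external groupIds are excluded via a filter.
mvn versions:update-parent versions:use-latest-releases \
    -Dexcludes='org.apache.*:*,com.google.*:*' \
    -DallowSnapshots=false

# 2. Rebuild with the modified POM.
mvn clean install

# 3. Reached only if the build succeeded (set -e): commit the upgrade.
svn commit -m "Automated dependency upgrade" pom.xml
```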

This approach gives us the freedom to break backward compatibility in
SNAPSHOT versions and, after a release, simply flags the problem (by
failing the upgrade build), allowing the team to continue working against
the non-upgraded version while one team member resolves the upgrade
problem.

Hope this helps,
Viktor

Viktor Sadovnikov @ JV-ration
evolution of Joint Vision
Tuinluststraat 14, 2275XZ Voorburg, The Netherlands
viktor@jv-ration.com <vi...@jv-ration.com> | http://jv-ration.com | +31 6
2466 0736


Re: Cleaning source code repositories

Posted by Viktor Sadovnikov <vi...@jv-ration.com>.
Hi Curtis,

Thank you very much for your thoughtful reply. I'll review your links; I'm very curious.

Thanks,
Viktor
-- Sent on the go


Re: Cleaning source code repositories

Posted by Curtis Rueden <ct...@wisc.edu>.
Hi Viktor,

> Do you actually consider this situation as a problem or is it just a
> perfectionist talking to me? ;-)

I would say it is a very real challenge of managing projects with many
components.

> how would you approach determining those, which are required for final
> deliveries, and those, which might break if another module changes
> (sort of reverse dependency management)?

That "reverse dependency management" need in particular is the most
important of those on your list, I think. (Detecting obsolete modules and
build jobs is nice and helps declutter, but often has little consequence
beyond that.)

My project [1] is still looking for better ways to be notified in advance
when code changes somehow affect downstream projects. (Of course, you
cannot be responsible for the entire world -- but you can define a known
set of downstream code that you want to help maintain.)

We use release couplings for reproducible builds [2], SemVer for versioning
[3], and Jenkins for CI [4]. Because of the release dependency couplings,
Jenkins cannot tell you when upstream changes to master (or even new
release versions) break downstream projects, until such projects attempt to
update to the new release. What we are working towards creating is a
"melting pot" Jenkins job that switches everything to snapshot couplings
using a profile [5] in a giant synthetic multi-module build. Then the Java
compiler would tell you directly if you broke backwards compatibility -- at
least compile-time compatibility, which is more than half the battle.
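Such a snapshot-coupling profile might look roughly like this (a hypothetical sketch, not the actual pom-scijava profile from [5]; the component and property names are invented for illustration):

```xml
<!-- Hypothetical profile: override the version properties that normally
     pin internal couplings to releases, pointing them at the current
     development (SNAPSHOT) versions instead. Activating it in a big
     synthetic aggregator build compiles everything against HEAD. -->
<profile>
  <id>snapshot-couplings</id>
  <properties>
    <component-a.version>2.4.1-SNAPSHOT</component-a.version>
    <component-b.version>1.9.0-SNAPSHOT</component-b.version>
  </properties>
</profile>
```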

If anyone knows of a better established best practice for this sort of
thing, that would be awesome.

Regards,
Curtis

[1] http://imagej.net/Architecture
[2] http://imagej.net/Reproducible_builds
[3] http://imagej.net/Versioning
[4] http://imagej.net/Jenkins
[5]
https://github.com/scijava/pom-scijava/blob/pom-scijava-5.7.0/pom.xml#L1048-L1051
