Posted to dev@maven.apache.org by Maximilian Novikov <ma...@db.com> on 2019/09/13 13:54:46 UTC

[VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Hi All,

We would like to contribute an upstream change to Maven that supports true incremental builds for large projects.
To raise a pull request we have to pass a long chain of Deutsche Bank's internal procedures, so before starting that process we would like to get your feedback on this feature.

Motivation:

Our project is hosted in a mono-repo and contains ~600 modules. All modules have the same SNAPSHOT version.
There is a lot of test automation around this; everything is tested before it is merged into the release branch.

The current setup simplifies build/release/dependency management for the 10+ teams that contribute to the codebase. We can release everything in one click.
The major drawback of this approach is build time: a full local build took 45-60 min (-T8) and a CI build ~25 min (-T16).

To speed up our build we needed two features: incremental build and a shared cache.
Initially we considered migrating to Gradle or Bazel. As the migration costs for those tools were too high, we decided to add similar functionality to Maven.

Current results: 1-2 min for a local build (-T8) if the build was already cached by CI, and ~5 min for a CI build (-T16).

Feature description:

The idea is to calculate a checksum of the inputs and save the outputs in a cache.
[image2019-8-27_20-0-14.png]
Each node's checksum is calculated from:

* Effective POM hash
* Sources hash
* Dependencies hash (dependencies within the multi-module project)
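
As a rough illustration of the checksum scheme above (not the actual implementation, which is not shown in this mail), the sketch below combines the three hashes into one key per module. The class name NodeChecksum, the choice of SHA-256, and the sorting of dependency checksums are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Maven: derives one cache key per module.
final class NodeChecksum {

    // Hex-encoded SHA-256 of a string (SHA-256 is an assumption; any stable hash works).
    static String sha256(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Node checksum = hash(effective POM hash + sources hash + checksums of in-project dependencies).
    static String nodeChecksum(String effectivePomHash, String sourcesHash, List<String> dependencyChecksums) {
        List<String> sorted = new ArrayList<>(dependencyChecksums);
        Collections.sort(sorted); // assumption: declaration order should not change the key
        return sha256(effectivePomHash + "\n" + sourcesHash + "\n" + String.join("\n", sorted));
    }

    public static void main(String[] args) {
        String pom = sha256("<project>...</project>");
        String coreV1 = nodeChecksum(pom, sha256("core sources v1"), List.of());
        String coreV2 = nodeChecksum(pom, sha256("core sources v2"), List.of());
        // A downstream module includes the upstream checksum, so an upstream
        // change also changes the downstream key and forces a rebuild there.
        String appBefore = nodeChecksum(pom, sha256("app sources"), List.of(coreV1));
        String appAfter = nodeChecksum(pom, sha256("app sources"), List.of(coreV2));
        System.out.println("core changed: " + !coreV1.equals(coreV2));
        System.out.println("app invalidated: " + !appBefore.equals(appAfter));
    }
}
```

Because each module's key includes the keys of its in-project dependencies, a change propagates only to modules downstream of it; unrelated modules keep their old key and can still be served from the cache.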

Project source inputs are collected from the project directory plus all paths referenced in the plugin configuration:
[image2019-8-30_10-28-56.png]
How it works in practice:

1. CI: runs builds and stores the outputs in the shared cache
2. CI: reuses outputs for identical inputs, so build time decreases
3. Locally: when I check out a branch and run 'install' for the whole project, I get all current snapshots for this branch from the remote cache
4. Locally: if I change several modules in the tree, only the changed subtree is rebuilt
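
The four steps above can be sketched as a lookup that tries a local store first, then the shared cache, and only builds on a miss. This is a minimal sketch under stated assumptions: BuildCache, resolve and the in-memory maps are hypothetical stand-ins, and the rule that only CI publishes to the shared cache is an assumption, not something stated in this mail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical two-level cache: a per-machine local store in front of a shared one.
final class BuildCache {
    private final Map<String, String> local = new HashMap<>();
    private final Map<String, String> remote; // stands in for a shared cache server

    BuildCache(Map<String, String> remote) {
        this.remote = remote;
    }

    // Returns the cached output for a checksum, or runs the build and stores the result.
    String resolve(String checksum, Supplier<String> build, boolean publish) {
        String output = local.get(checksum);
        if (output != null) {
            return output; // local hit
        }
        output = remote.get(checksum);
        if (output != null) {
            local.put(checksum, output); // remote hit: fetch once, then reuse locally
            return output;
        }
        output = build.get(); // miss: inputs changed, so build for real
        local.put(checksum, output);
        if (publish) {
            remote.put(checksum, output); // assumption: e.g. only CI publishes
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        BuildCache ci = new BuildCache(shared);
        BuildCache developer = new BuildCache(shared);

        ci.resolve("checksum-A", () -> "module-a.jar built on CI", true);
        // Same inputs on a developer machine: served from the shared cache, no build.
        System.out.println(developer.resolve("checksum-A", () -> "module-a.jar built locally", false));
        // Changed inputs produce a new checksum, so this module is rebuilt locally.
        System.out.println(developer.resolve("checksum-B", () -> "module-a.jar built locally", false));
    }
}
```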

The impact on the current Maven codebase is very localized (MojoExecutor, where we injected a cache controller).
Caching can be activated/deactivated by a property, so the current Maven flow keeps working as is.

A big plus is that you don't need to rework your current project. Caching should work out of the box; you only need to add a config in the .mvn folder.
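
To make the on/off switch concrete, here is a minimal sketch of a wrapper around a build step that consults a cache only when a property enables it. It is not the actual MojoExecutor change; CachingExecutor and the property name build.cache.enabled are invented for this example.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical wrapper around a build step; not the actual MojoExecutor change.
final class CachingExecutor {
    static final String ENABLED_PROPERTY = "build.cache.enabled"; // made-up property name

    private final boolean enabled;
    private final Set<String> cachedChecksums = new HashSet<>();

    CachingExecutor(Properties properties) {
        this.enabled = Boolean.parseBoolean(properties.getProperty(ENABLED_PROPERTY, "false"));
    }

    // Runs the step unless caching is enabled and this checksum was already built.
    // Returns true when the step actually ran.
    boolean execute(String checksum, Runnable step) {
        if (!enabled) {
            step.run(); // caching off: behave exactly like the unmodified flow
            return true;
        }
        if (cachedChecksums.contains(checksum)) {
            return false; // cache hit: skip execution
        }
        step.run();
        cachedChecksums.add(checksum);
        return true;
    }

    public static void main(String[] args) {
        Properties on = new Properties();
        on.setProperty(ENABLED_PROPERTY, "true");
        CachingExecutor executor = new CachingExecutor(on);
        System.out.println(executor.execute("abc", () -> System.out.println("compiling")));
        System.out.println(executor.execute("abc", () -> System.out.println("compiling")));
    }
}
```

With the property absent or false, every call runs the step, which is the sense in which the existing flow is left untouched.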

Please let us know what you think. We are ready to invest in this feature and address any further feedback.

Kind regards,
Max



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Hi 
Let's please stop discussing this subject, as it is off-topic and pollutes the thread. Our case exists and has its own reasons, and discussing it is not in the scope of this thread. Let's please return to the feature itself.

Thank you

On 2019/09/14 16:41:43, Romain Manni-Bucau <rm...@gmail.com> wrote: 
> Hope I didn't miss it, but how does monorepo = single build?
> 
> It works well without a common parent too, and that is unrelated to the
> monorepo, which in general uses local relative paths (at least in the
> references you quoted, which are also not about Java ;)).
> 
> This is unrelated to making Maven better at incremental builds, but both
> tracks can help you get very fast build feedback.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Hope I didn't miss it, but how does monorepo = single build?

It works well without a common parent too, and that is unrelated to the
monorepo, which in general uses local relative paths (at least in the
references you quoted, which are also not about Java ;)).

This is unrelated to making Maven better at incremental builds, but both
tracks can help you get very fast build feedback.

On Sat, Sep 14, 2019 at 17:35, Robert Scholte <rf...@apache.org> wrote:

> https://issues.apache.org/jira/browse/MPLUGIN-350 is the issue to start
> with.
>
> Please read all the comments, because my original thought won't work.
>
> thanks,
> Robert
>
> On Sat, 14 Sep 2019 17:10:13 +0200, Alexander Ashitkin
> <as...@gmail.com> wrote:
>
> > We checked, and a price of $550 per user makes us think twice about what
> > the best way to proceed here is :-)
> > Regarding the plugin API - yes, changes are desirable to make the Maven
> > model cache-friendly, both in the plugin invocation model and in the
> > Mojo#execute input/output APIs. But it is possible to work with the
> > current model using a declarative approach.
> >
> > Thanks in advance
> >
> > On 2019/09/14 10:45:24, Tibor Digana <ti...@apache.org> wrote:
> >> But I do not understand why Maven should be responsible for the
> >> project cache control/management of "/target" directories.
> >> That is the responsibility of the build manager, which is Jenkins.
> >> Jenkins has the ability to archive files, and that feature already
> >> exists in Jenkins.
> >>
> >> So Jenkins has full knowledge of:
> >>
> >> 1. how long the workspace content remains intact
> >> 2. what the commit hash is for the last build/job/branch
> >> 3. and which commit was successful
> >>
> >> If the target directories remain intact (or are restored from an
> >> archive) in the workspace for a very long time and the workspace is
> >> reused by the next build, then I would say that the improvement
> >> should work as it is at the CI level.
> >>
> >> Maybe the only improvement needed in Maven is a way to obtain the
> >> list of modules or directories changed in the current commit.
> >> Then Maven can highly optimize its own build steps and build only
> >> those modules which have been changed and their dependent modules.
> >> So an interface between CI and Maven is needed, as a kind of
> >> extension, or the class MavenCli can be extended with some new
> >> entry point.
> >>
> >> But I do not think that Maven has to take over the responsibilities
> >> of CI (project cache mgmt). That's not our task, I would say; we as
> >> Maven would never know all about the miscellaneous CI specifics and
> >> therefore would not cope with CI-related troubles.
> >>
> >> Cheers
> >> Tibor17
> >>
> >>
> >>
> >> On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rf...@apache.org>
> >> wrote:
> >>
> >> > On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> >> > <rm...@gmail.com> wrote:
> >> >
> >> > > There are multiple possible kinds of incremental support:
> >> > >
> >> > > 1. SCM related: do a status and rebuild the downstream reactor
> >> > > 2. Full and module build graph: this seems to be the one you
> >> > > target, i.e. bypass modules without changes. Note that it only
> >> > > works if the upstream graph is taken into account.
> >> > > 3. Full build: each mojo has incremental support, so the full
> >> > > build gets it. The issue is that it requires each mojo to know
> >> > > whether it needs to be executed, or to give enough info to the
> >> > > mojo executor to decide (Gradle requires all inputs/outputs to
> >> > > assume this state - which is still just a heuristic and not
> >> > > 100% reliable).
> >> > >
> >> > > In the current state, 2. sounds like a good option since 3. can
> >> > > require a lot of work for external plugins (today's builds have
> >> > > many more plugins not provided by Maven than core plugins).
> >> > > Now, we should be able to activate it or not, so having a
> >> > > cacheLocation config in settings.xml can be good.
> >> > >
> >> > > Side notes:
> >> > >
> >> > > 1. Having it on by default will break builds - the reactor is
> >> > > deterministic, and bypassing a module can break a build since
> >> > > that module can initialize Maven properties - for example - for
> >> > > the next modules
> >> > > 2. You can't find all in/out paths from the POM in general, so
> >> > > your algorithm is not generic; a meta config can be needed in .mvn
> >> > > 3. We should let a mojo disable that to replace the default
> >> > > logic (Surefire is a good example where it must be refined, and
> >> > > it can save hours there ;))
> >> > > 4. Let's try to implement it as a mvn extension first; then, if
> >> > > it works well on multiple big projects, get it into core?
> >> >
> >> > Did anyone Google for "maven extension build cache"? There are already
> >> > commercial solutions for it.
> >> > Even though I would like to see improvements in this area, the old
> >> > architecture of Maven makes it quite hard to move to that situation.
> >> > First of all it requires changes to the Plugin API (without breaking
> >> > backwards compatibility) to have support out of the box.
> >> >
> >> > Robert
> >> >
> >> > >
> >> > > Romain
> >> > >
> >> > >
> >> > >
> >> > > On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org> wrote:
> >> > >
> >> > >> In theory, the incremental compiler would make it faster.
> >> > >> But this can be claimed only if you present a demo project which
> >> > >> has trivial tests taking much less time to complete than the
> >> > >> compiler.
> >> > >>
> >> > >> In reality, the tests in huge projects take significantly longer
> >> > >> than the compiler.
> >> > >> Some developers say "switch off all the tests" in the release
> >> > >> phase, but that's wrong because then the quality goes down and
> >> > >> methodologies are broken.
> >> > >>
> >> > >> I can see a big problem in that we do not have an interface
> >> > >> between Surefire and the Compiler plugin negotiating which tests
> >> > >> have been modified, including modules and classes in the entire
> >> > >> structure.
> >> > >>
> >> > >> Having an incremental compiler is easy, just use compiler:3.8.1
> >> > >> or use the Takari compiler.
> >> > >> But IMO the biggest benefit in performance would come from having
> >> > >> a truly incremental test executor.

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Hi Robert,
That sounds like the right direction, but just categorizing parameters is not enough.
1) You need to connect plugins into a chain, so you need to match the outputs of upstream plugins to the inputs of downstream plugins.
2) Input/output is not enough. You need to know which parameters affect plugin behavior, and sometimes how (for example, threading doesn't affect test results but exclude does). So a parameter should carry more metadata, I guess.
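
A minimal sketch of what such parameter metadata could look like, assuming a hypothetical Effect classification per parameter. None of these names exist in the Maven Plugin API; the parameter names (outputDir, classesDir, excludes, threads) are illustrative only. The cache key ignores parameters marked as having no effect on the result, and feeds() checks whether an upstream output is consumed as a downstream input.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical metadata model; names are illustrative, not an existing Maven API.
final class ParameterMetadata {

    enum Effect { INPUT, OUTPUT, AFFECTS_RESULT, NO_EFFECT }

    static final class Parameter {
        final String value;
        final Effect effect;

        Parameter(String value, Effect effect) {
            this.value = value;
            this.effect = effect;
        }
    }

    // Builds a cache key from only those parameters that can change the result.
    static String cacheKey(Map<String, Parameter> parameters) {
        StringBuilder key = new StringBuilder();
        // TreeMap gives a stable, name-sorted order so the key is reproducible.
        for (Map.Entry<String, Parameter> entry : new TreeMap<>(parameters).entrySet()) {
            Effect effect = entry.getValue().effect;
            if (effect == Effect.INPUT || effect == Effect.AFFECTS_RESULT) {
                key.append(entry.getKey()).append('=').append(entry.getValue().value).append(';');
            }
        }
        return key.toString();
    }

    // True when some OUTPUT of the upstream plugin is an INPUT of the downstream one.
    static boolean feeds(Map<String, Parameter> upstream, Map<String, Parameter> downstream) {
        for (Parameter out : upstream.values()) {
            if (out.effect != Effect.OUTPUT) {
                continue;
            }
            for (Parameter in : downstream.values()) {
                if (in.effect == Effect.INPUT && in.value.equals(out.value)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Parameter> compile = Map.of(
                "outputDir", new Parameter("target/classes", Effect.OUTPUT));
        Map<String, Parameter> test = Map.of(
                "classesDir", new Parameter("target/classes", Effect.INPUT),
                "excludes", new Parameter("SlowTest", Effect.AFFECTS_RESULT),
                "threads", new Parameter("4", Effect.NO_EFFECT));
        System.out.println(feeds(compile, test));
        System.out.println(cacheKey(test));
    }
}
```

Changing threads leaves the key unchanged, while changing excludes produces a different key, which is the distinction point 2) asks for.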

Thank you 

On 2019/09/14 15:35:26, "Robert Scholte" <rf...@apache.org> wrote: 
> https://issues.apache.org/jira/browse/MPLUGIN-350 is the issue to start  
> with.
> 
> Please read all the comments, because my original thought won't work.
> 
> thanks,
> Robert
> 
> On Sat, 14 Sep 2019 17:10:13 +0200, Alexander Ashitkin  
> <as...@gmail.com> wrote:
> 
> > We checked and price of 550$ per user makes us think twice of what's the  
> > best way to proceed here :-)
> > Regarding plugin api - yes, changes are desirable to make maven model  
> > cache-friendly. Both in plugin invocation model and Mojo#execute  
> > input/output apis. But it is possible to work with current model with  
> > declarative approach.
> >
> > Thanks in advance
> >
> > On 2019/09/14 10:45:24, Tibor Digana <ti...@apache.org> wrote:
> >> But I do not understand why the Maven should be responsible for the  
> >> project
> >> cahe control/management of "/target" directories.
> >> It is a responsibility of the build manager which is the Jenkins.
> >> The Jenkins has the ability to archive files and such property already
> >> exists in the Jenkins.
> >>
> >> So the Jenkins has a full knowledge about:
> >>
> >> 1. how long the workspace content retains intact
> >> 2. what commit hash is for the last build/job/branch
> >> 3. and what commit was successful
> >>
> >> If the target directories retain intact (or renewed from archive) in the
> >> workspace for very long time and the workspace was reused by the next  
> >> build
> >> then I would say that the improvement should work as it is on CI level.
> >>
> >> Maybe what is necessary is only that improvement in Maven where we would
> >> obtain the list of modules or directories of changes in the current  
> >> commit.
> >> Then the Maven can highly optimize its own build steps and build only  
> >> those
> >> modules which have been changed and their dependent modules.
> >> So the interface between CI and Maven is needed in a kind of extension  
> >> or
> >> the class MavenCli can be extended with some new entrypoint.
> >>
> >> But I do not hink that Maven has to take care of responsibilities of CI
> >> (project cache mgmt), that's not our task I would say and we as Maven  
> >> would
> >> never know all about the miscellaneous CI specifics and therefore we  
> >> would
> >> not cope with CI related troubles.
> >>
> >> Cheers
> >> Tibor17


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Robert Scholte <rf...@apache.org>.
https://issues.apache.org/jira/browse/MPLUGIN-350 is the issue to start with.

Please read all the comments, because my original thought won't work.

thanks,
Robert

On Sat, 14 Sep 2019 17:10:13 +0200, Alexander Ashitkin <as...@gmail.com> wrote:

> We checked, and a price of $550 per user makes us think twice about the
> best way to proceed here :-)
> Regarding the plugin API: yes, changes are desirable to make the Maven model
> cache-friendly, both in the plugin invocation model and in the Mojo#execute
> input/output APIs. But it is possible to work with the current model using a
> declarative approach.
>
> Thanks in advance


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
We checked, and a price of $550 per user makes us think twice about the best way to proceed here :-)
Regarding the plugin API: yes, changes are desirable to make the Maven model cache-friendly, both in the plugin invocation model and in the Mojo#execute input/output APIs. But it is possible to work with the current model using a declarative approach.

Thanks in advance

On 2019/09/14 10:45:24, Tibor Digana <ti...@apache.org> wrote: 
> But I do not understand why the Maven should be responsible for the project
> cache control/management of "/target" directories.
> It is a responsibility of the build manager which is the Jenkins.
> The Jenkins has the ability to archive files and such property already
> exists in the Jenkins.
> 
> So the Jenkins has a full knowledge about:
> 
> 1. how long the workspace content retains intact
> 2. what commit hash is for the last build/job/branch
> 3. and what commit was successful
> 
> If the target directories retain intact (or renewed from archive) in the
> workspace for very long time and the workspace was reused by the next build
> then I would say that the improvement should work as it is on CI level.
> 
> Maybe what is necessary is only that improvement in Maven where we would
> obtain the list of modules or directories of changes in the current commit.
> Then the Maven can highly optimize its own build steps and build only those
> modules which have been changed and their dependent modules.
> So the interface between CI and Maven is needed in a kind of extension or
> the class MavenCli can be extended with some new entrypoint.
> 
> But I do not think that Maven has to take care of responsibilities of CI
> (project cache mgmt), that's not our task I would say and we as Maven would
> never know all about the miscellaneous CI specifics and therefore we would
> not cope with CI related troubles.
> 
> Cheers
> Tibor17
> 
> 
> 
> On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rf...@apache.org>
> wrote:
> 
> > On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> > <rm...@gmail.com> wrote:
> >
> > > > There are multiple possible kinds of incremental support:
> > > >
> > > > 1. Scm related: do a status and rebuild downstream reactor
> > > > 2. Full and module build graph: seems it is the one you target, i.e. bypass
> > > > modules without change. Note that it only works if upstream graph is taken
> > > > into account.
> > > > 3. Full build: each mojo has incremental support so the full build gets it.
> > > > Issue is that it requires each mojo to know if it needs to be executed or
> > > > give enough info to the mojo executor to do so (Gradle requires all
> > > > inputs/outputs to assume this state - which is still just a heuristic and
> > > > not 100% reliable).
> > > >
> > > > In current state, 2. sounds like a good option since 3 can require a lot
> > > > of work for external plugins (today's builds have a lot more
> > > > non-Maven-provided plugins than core plugins).
> > > > Now, we should be able to activate it or not so having a cacheLocation
> > > > config in settings.xml can be good.
> > > >
> > > > Side notes:
> > > >
> > > > 1. having it on by default will break builds - reactor is deterministic and
> > > > bypassing a module can break a build since it can init Maven properties -
> > > > for ex - for next modules
> > > > 2. You can't find all in/out paths from the pom in general so your algo is
> > > > not generic, a meta config can be needed in .mvn
> > > > 3. We should let a mojo be able to disable that to replace default logic
> > > > (surefire is a good example where it must be refined and it can save hours
> > > > there ;))
> > > > 4. Let's try to impl it as a mvn extension first then if it works well on
> > > > multiple big projects get it to core?
> >
> > > Did anyone Google for "maven extension build cache"? There are already
> > > commercial solutions for it.
> > > Even though I would like to see improvements in this area, the old
> > > architecture of Maven makes it quite hard to move to that situation. First
> > > of all it requires changes to the Plugin API (without breaking backwards
> > > compatibility) to have support out of the box.
> >
> > Robert
> >
> > >
> > > Romain
> > >
> > >
> > >
> > > > On Fri, Sep 13, 2019 at 11:18 PM, Tibor Digana <ti...@apache.org> wrote:
> > >
> > > >> In theory, the incremental compiler would make it faster.
> > > >> But this can be told only if you present a demo project which has trivial
> > > >> tests taking much less time to complete than the compiler.
> > > >>
> > > >> In reality the tests in huge projects take significantly longer time than
> > > >> the compiler.
> > > >> Some developers say "switch off all the tests" in the release phase but
> > > >> that's wrong because then the quality goes down and methodologies are
> > > >> broken.
> > > >>
> > > >> I can see a big problem that we do not have an interface between Surefire
> > > >> and Compiler plugin negotiating which tests have been modified including
> > > >> modules and classes in the entire structure.
> > > >>
> > > >> Having incremental compiler is easy, just use compiler:3.8.1 or use the
> > > >> Takari compiler.
> > > >> But IMO the biggest benefit in performance would be after having the truly
> > > >> incremental test executor.
> > >>
> > >> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> > >> maximilian.novikov@db.com> wrote:
> > >>
> > >> > Hi All,
> > >> >
> > >> >
> > >> >
> > >> > *We want to create upstream change to Maven* to support true
> > >> incremental
> > >> > build for big-sized projects.
> > >> >
> > >> > To raise a pull request we have to pass long chain of Deutsche Bank’s
> > >> > internal procedures. So, *before starting the process we would like to
> > >> > get your feedback regarding this feature*.
> > >> >
> > >> >
> > >> >
> > >> > *Motivation:*
> > >> >
> > >> >
> > >> >
> > >> > Our project is hosted in mono-repo and contains ~600 modules. All
> > >> modules
> > >> > has the same SNAPSHOT version.
> > >> >
> > >> > There are lot of test automation around this, everything is tested
> > >> before
> > >> > merge into release branch.
> > >> >
> > >> >
> > >> >
> > >> > Current setup helps us to simplify build/release/dependency management
> > >> for
> > >> > 10+ teams those contribute into codebase. We can release everything in
> > >> > 1-click.
> > >> >
> > >> > The major drawback of such approach is build time: *full local build
> > >> took
> > >> > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> > >> >
> > >> >
> > >> >
> > >> > To speed-up our build we needed 2 features: incremental build and
> > >> shared
> > >> > cache.
> > >> >
> > >> > Initially we started to think about migration to Gradle or Bazel. As
> > >> > migration costs for the mentioned tools were too high, we decided to
> > >> add
> > >> > similar functionality into Maven.
> > >> >
> > >> >
> > >> >
> > >> > Current results we get: *1-2 mins for local build(*-T8*)* if build was
> > >> > cached by CI*, CI build ~5 mins (*-T16*).*
> > >> >
> > >> >
> > >> >
> > >> > *Feature description:*
> > >> >
> > >> >
> > >> >
> > >> > The idea is to calculate checksum for inputs and save outputs in
> > >> cache.
> > >> >
> > >> > [image: image2019-8-27_20-0-14.png]
> > >> >
> > >> > Each node checksum calculated with:
> > >> >
> > >> >
> > >> >
> > >> > ·         Effective POM hash
> > >> >
> > >> > ·         Sources hash
> > >> >
> > >> > ·         Dependencies hash (dependencies within multi-module project)
> > >> >
> > >> >
> > >> >
> > >> > Project sources inputs are searched inside project + all paths from
> > >> > plugins configuration:
> > >> >
> > >> > [image: image2019-8-30_10-28-56.png]
> > >> >
> > >> > How does it work in practice:
> > >> >
> > >> >
> > >> >
> > >> > 1.       CI: runs builds and stores outputs in shared cache
> > >> >
> > >> > 2.       CI: reuse outputs for same inputs, so time is decreasing
> > >> >
> > >> > 3.       Locally: when I checkout branch and run ‘install’ for whole
> > >> > project, I get all actual snapshots from remote cache for this branch
> > >> >
> > >> > 4.       Locally: if I change multiple modules in tree, only changed
> > >> > subtree is rebuilt
> > >> >
> > >> >
> > >> >
> > >> > Impact on current Maven codebase is very localized (MojoExecutor,
> > >> where
> > >> we
> > >> > injected cache controller).
> > >> >
> > >> > Caching can be activated/deactivated by property, so current maven
> > >> flow
> > >> > will work as is.
> > >> >
> > >> >
> > >> >
> > >> > And the big plus is that you don’t need to re-work your current
> > >> project.
> > >> > Caching should work out of box, just need to add config in .mvn
> > >> folder.
> > >> >
> > >> >
> > >> >
> > >> > Please let us know what do you think. We are ready to invest in this
> > >> > feature and address any further feedback.
> > >> >
> > >> >
> > >> >
> > >> > Kind regards,
> > >> >
> > >> > Max
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > ---
> > >> > This e-mail may contain confidential and/or privileged information. If
> > >> you
> > >> > are not the intended recipient (or have received this e-mail in error)
> > >> > please notify the sender immediately and delete this e-mail. Any
> > >> > unauthorized copying, disclosure or distribution of the material in
> > >> this
> > >> > e-mail is strictly forbidden.
> > >> >
> > >> > Please refer to https://www.db.com/disclosures for additional EU
> > >> > corporate and regulatory disclosures and to
> > >> > http://www.db.com/unitedkingdom/content/privacy.htm for information
> > >> about
> > >> > privacy.
> > >> >


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Sorry, I think Jenkins is irrelevant here. We need a solution that works everywhere: on workstations, from the command line, on other build servers, in IntelliJ, etc. Honestly, any solution worth discussing must be Jenkins-agnostic. Also, FYI, we don't rely on the target directory state at all; it is not saved.
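
As a minimal sketch of the idea in the proposal quoted in this thread (not the actual implementation), a module's cache key can be derived purely from its inputs: the effective POM, the source files, and the keys of its upstream modules inside the reactor. The class name, the method names, and the choice of SHA-256 are all assumptions made for illustration; nothing here is an existing Maven API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from Maven or from the proposal.
public class ModuleChecksum {

    // Hex-encoded SHA-256 over the parts, in order (SHA-256 is an assumed choice).
    static String sha256(List<String> parts) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String part : parts) {
                byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
                // Length prefix keeps ("ab", "c") distinct from ("a", "bc").
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Combines effective POM, sources (path -> content) and upstream module keys.
    // Upstream keys are assumed to arrive in a stable order (e.g. declaration order).
    static String moduleChecksum(String effectivePom,
                                 Map<String, String> sources,
                                 List<String> upstreamChecksums) {
        List<String> parts = new ArrayList<>();
        parts.add(effectivePom);
        // Record the source count so sources and upstream keys cannot be confused.
        parts.add(String.valueOf(sources.size()));
        // Sort by path so the result does not depend on file listing order.
        for (Map.Entry<String, String> e : new TreeMap<>(sources).entrySet()) {
            parts.add(e.getKey());
            parts.add(e.getValue());
        }
        parts.addAll(upstreamChecksums);
        return sha256(parts);
    }

    public static void main(String[] args) {
        String core = moduleChecksum("<project>core</project>",
                Map.of("Core.java", "class Core {}"), List.of());
        String app = moduleChecksum("<project>app</project>",
                Map.of("App.java", "class App {}"), List.of(core));

        // Edit a source file in "core" only; "app" itself is untouched.
        String coreEdited = moduleChecksum("<project>core</project>",
                Map.of("Core.java", "class Core { int x; }"), List.of());
        String appAfterEdit = moduleChecksum("<project>app</project>",
                Map.of("App.java", "class App {}"), List.of(coreEdited));

        System.out.println("core key changed: " + !core.equals(coreEdited));
        System.out.println("app key changed:  " + !app.equals(appAfterEdit));
    }
}
```

Because the key depends only on inputs, the same lookup works on a workstation, in an IDE, or on any build server, and a change in an upstream module changes the key of every module that depends on it.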

Thank you

On 2019/09/14 11:37:40, Tibor Digana <ti...@apache.org> wrote: 
> oh yeah, exactly opposite.
> Jenkins has several ways to create Maven build configuration and it knows
> where the repo and workspace is, it knows where to store the archive, it
> knows when the build failed.
> We cannot take the responsibility because the build may fail for whatever
> reason and we do not know whether to keep the folders or delete all
> "/target" folders or just to delete only the failed one. The user knows it.
> We cannot archive the folders because we may significantly cause very high
> disk usage which would be without the control of CI. And we cannot take the
> responsibility of lifetime of these archives. It is all the property of
> Jenkins and Jenkins has the feature and management plugins where the
> workspace may retain for certain period of time, archives are limited in
> some way. The archives can be stored in another folder and we should not
> adopt these responsibilities because then we suddenly end up with all the
> knowledge of the distributed system and then we as maven project would end
> as unmaintainable project with many more issues in Jira and requirements we
> would be able to find the spare time to develop.
> 
> On Sat, Sep 14, 2019 at 1:25 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
> 
> > Tibor, maven is the only one with the logic to give any cache the data it
> > needs. Jenkins alone can't since it does not own the reactor nor plugin I/O
> > values.
> >
> > On Sat, Sep 14, 2019 at 12:45 PM, Tibor Digana <ti...@apache.org> wrote:
> >
> > > But I do not understand why the Maven should be responsible for the project
> > > cache control/management of "/target" directories.
> > > It is a responsibility of the build manager which is the Jenkins.
> > > The Jenkins has the ability to archive files and such property already
> > > exists in the Jenkins.
> > >
> > > So the Jenkins has a full knowledge about:
> > >
> > > 1. how long the workspace content retains intact
> > > 2. what commit hash is for the last build/job/branch
> > > 3. and what commit was successful
> > >
> > > If the target directories retain intact (or renewed from archive) in the
> > > workspace for very long time and the workspace was reused by the next
> > build
> > > then I would say that the improvement should work as it is on CI level.
> > >
> > > Maybe what is necessary is only that improvement in Maven where we would
> > > obtain the list of modules or directories of changes in the current
> > commit.
> > > Then the Maven can highly optimize its own build steps and build only
> > those
> > > modules which have been changed and their dependent modules.
> > > So the interface between CI and Maven is needed in a kind of extension or
> > > the class MavenCli can be extended with some new entrypoint.
> > >
> > > But I do not hink that Maven has to take care of responsibilities of CI
> > > (project cache mgmt), that's not our task I would say and we as Maven
> > would
> > > never know all about the miscellaneous CI specifics and therefore we
> > would
> > > not cope with CI related troubles.
> > >
> > > Cheers
> > > Tibor17
> > >
> > >
> > >
> > > On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rf...@apache.org>
> > > wrote:
> > >
> > > > On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> > > > <rm...@gmail.com> wrote:
> > > >
> > > > > There are multiple possible incremental support:
> > > > >
> > > > > 1. Scm related: do a status and rebuild downstream reactor
> > > > > 2. Full and module build graph: seems it is the one you target, ie
> > > bypass
> > > > > modules without change. Note that it only works if upstream graph is
> > > > > taken
> > > > > into account.
> > > > > 3. Full build: each mojo has incremental support so the full build
> > gets
> > > > > it.
> > > > > Issue is that it requires each mojo to know if it needs to be
> > executed
> > > or
> > > > > give enough info to the mojo executor to do so (gradle requires all
> > > > > inputs/outputs to assume this state - which is still just an
> > heuristic
> > > > > and
> > > > > not 100% reliable).
> > > > >
> > > > > In current state, 2. sounds like a good option since 3 can require  a
> > > > > loot
> > > > > of work for external plugins (today's builds have a lot more of not
> > > maven
> > > > > provide plugins than core plugins).
> > > > > Now, we should be able to activate it or not so having a
> > cacheLocation
> > > > > config in settings.xml can be good.
> > > > >
> > > > > Side notes:
> > > > >
> > > > > 1. having it on by default will break builds - reactor is
> > deterministic
> > > > > and
> > > > > bypassing a module can break a build since it can init maven
> > > properties -
> > > > > for ex - for next modules
> > > > > 2. You cant find all in/out paths from the pom in general so your
> > algo
> > > is
> > > > > not generic, a meta config can be needed in .mvn
> > > > > 3. We should let a mojo be able to disable that to replace default
> > > logic
> > > > > (surefire is a good example where it must be refined and it can save
> > > > > hours
> > > > > there ;))
> > > > > 4. Let's try to impl it as a mvn extension first then if it works
> > well
> > > on
> > > > > multiple big project get it to core?
> > > >
> > > > Did anyone Google for "maven extension build cache"? There are already
> > > > commercial solutions for it.
> > > > Even though I would like to see improvements in this area, the old
> > > > architecture of Maven makes it quite hard to move to that situation.
> > > > First
> > > > of all it requires changes to the Plugin API (without breaking
> > backwards
> > > > compatibility) to have support out of the box.
> > > >
> > > > Robert
> > > >
> > > > >
> > > > > Romain
> > > > >
> > > > >
> > > > >
> > > > > Le ven. 13 sept. 2019 à 23:18, Tibor Digana <ti...@apache.org>
> > a
> > > > > écrit :
> > > > >
> > > > >> In theory, the incremental compiler would make it faster.
> > > > >> But this can be told only if you present a demo project with has
> > > trivial
> > > > >> tests taking much less time to complete than the compiler.
> > > > >>
> > > > >> In reality the tests in huge projects take significantly longer time
> > > > >> than
> > > > >> the compiler.
> > > > >> Some developers say "switch off all the tests" in the release phase
> > > but
> > > > >> that's wrong because then the quality goes down and methodologies
> > are
> > > > >> broken.
> > > > >>
> > > > >> I can see a big problem that we do not have an interface between
> > > > >> Surefire
> > > > >> and Compiler plugin negotiating which tests have been modified
> > > including
> > > > >> modules and classes in the entire structure.
> > > > >>
> > > > >> Having incremental compiler is easy, just use compiler:3.8.1 or use
> > > the
> > > > >> Takari compiler.
> > > > >> But IMO the biggest benefit in performance would be after having the
> > > > >> truly
> > > > >> incremental test executor.
> > > > >>
> > > > >> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> > > > >> maximilian.novikov@db.com> wrote:
> > > > >>
> > > > >> > Hi All,
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > *We want to create upstream change to Maven* to support true
> > > > >> incremental
> > > > >> > build for big-sized projects.
> > > > >> >
> > > > >> > To raise a pull request we have to pass long chain of Deutsche
> > > Bank’s
> > > > >> > internal procedures. So, *before starting the process we would
> > like
> > > to
> > > > >> > get your feedback regarding this feature*.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > *Motivation:*
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Our project is hosted in mono-repo and contains ~600 modules. All
> > > > >> modules
> > > > >> > has the same SNAPSHOT version.
> > > > >> >
> > > > >> > There are lot of test automation around this, everything is tested
> > > > >> before
> > > > >> > merge into release branch.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Current setup helps us to simplify build/release/dependency
> > > management
> > > > >> for
> > > > >> > 10+ teams those contribute into codebase. We can release
> > everything
> > > in
> > > > >> > 1-click.
> > > > >> >
> > > > > >> > The major drawback of such approach is build time: full local build took
> > > > > >> > 45-60 min (-T8), CI build ~25min (-T16).
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > To speed-up our build we needed 2 features: incremental build and
> > > > >> shared
> > > > >> > cache.
> > > > >> >
> > > > >> > Initially we started to think about migration to Gradle or Bazel.
> > As
> > > > >> > migration costs for the mentioned tools were too high, we decided
> > to
> > > > >> add
> > > > >> > similar functionality into Maven.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > > >> > Current results we get: 1-2 mins for local build (-T8) if build was
> > > > > >> > cached by CI, CI build ~5 mins (-T16).
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > *Feature description:*
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > The idea is to calculate checksum for inputs and save outputs in
> > > > >> cache.
> > > > >> >
> > > > >> > [image: image2019-8-27_20-0-14.png]
> > > > >> >
> > > > >> > Each node checksum calculated with:
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > ·         Effective POM hash
> > > > >> >
> > > > >> > ·         Sources hash
> > > > >> >
> > > > >> > ·         Dependencies hash (dependencies within multi-module
> > > project)
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Project sources inputs are searched inside project + all paths
> > from
> > > > >> > plugins configuration:
> > > > >> >
> > > > >> > [image: image2019-8-30_10-28-56.png]
> > > > >> >
> > > > >> > How does it work in practice:
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > 1.       CI: runs builds and stores outputs in shared cache
> > > > >> >
> > > > >> > 2.       CI: reuse outputs for same inputs, so time is decreasing
> > > > >> >
> > > > >> > 3.       Locally: when I checkout branch and run ‘install’ for
> > whole
> > > > >> > project, I get all actual snapshots from remote cache for this
> > > branch
> > > > >> >
> > > > >> > 4.       Locally: if I change multiple modules in tree, only
> > changed
> > > > >> > subtree is rebuilt
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Impact on current Maven codebase is very localized (MojoExecutor,
> > > > >> where
> > > > >> we
> > > > >> > injected cache controller).
> > > > >> >
> > > > >> > Caching can be activated/deactivated by property, so current maven
> > > > >> flow
> > > > >> > will work as is.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > And the big plus is that you don’t need to re-work your current
> > > > >> project.
> > > > >> > Caching should work out of box, just need to add config in .mvn
> > > > >> folder.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Please let us know what do you think. We are ready to invest in
> > this
> > > > >> > feature and address any further feedback.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Kind regards,
> > > > >> >
> > > > >> > Max
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > ---
> > > > >> > This e-mail may contain confidential and/or privileged
> > information.
> > > If
> > > > >> you
> > > > >> > are not the intended recipient (or have received this e-mail in
> > > error)
> > > > >> > please notify the sender immediately and delete this e-mail. Any
> > > > >> > unauthorized copying, disclosure or distribution of the material
> > in
> > > > >> this
> > > > >> > e-mail is strictly forbidden.
> > > > >> >
> > > > >> > Please refer to https://www.db.com/disclosures for additional EU
> > > > >> > corporate and regulatory disclosures and to
> > > > >> > http://www.db.com/unitedkingdom/content/privacy.htm for
> > information
> > > > >> about
> > > > >> > privacy.
> > > > >> >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> > > > For additional commands, e-mail: dev-help@maven.apache.org
> > > >
> > > >
> > >
> >
> 



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
Robert, I understand the discussion. There are two topics: (1) the
requirement to cache the targets, and (2) switching unchanged modules
on/off in a multi-module project.


I used (1) at Software AG, and with hundreds of POMs in the project the
build was really unpredictable. I would never recommend caching targets
and local repos. We as developers were unable to run the local build;
you had to rely on a matching SCM revision and hash in the cache.

The solution for (2) is simple. There are already plugins and extensions.
We only have to "tell Maven" to switch off building unmodified modules,
i.e. skip the lifecycle in such a module.
At least (2) does not look like a workaround, because nothing would
happen in an unchanged module. Of course this feature should be disabled
by default and enabled explicitly by the user.
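
A minimal sketch of the selection step behind (2), assuming the reactor's
module dependencies are already available as a plain map. The class name
ChangedModuleSelector, the method modulesToBuild and the input format are
hypothetical and not an existing Maven API. It keeps the changed modules
plus everything downstream of them and lets the rest be skipped, which is
roughly the set that "mvn -pl <modules> -amd" selects today.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Maven: picks the modules that still need a build.
final class ChangedModuleSelector {

    // dependsOn maps a module to the reactor modules it depends on (assumed input).
    static Set<String> modulesToBuild(Map<String, List<String>> dependsOn, Set<String> changed) {
        // Invert the edges: upstream module -> modules that consume it.
        Map<String, List<String>> dependents = new HashMap<>();
        dependsOn.forEach((module, upstreams) -> {
            for (String upstream : upstreams) {
                dependents.computeIfAbsent(upstream, k -> new ArrayList<>()).add(module);
            }
        });

        // Breadth-first walk from the changed modules to everything downstream.
        Set<String> result = new TreeSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String dependent : dependents.getOrDefault(current, List.<String>of())) {
                if (result.add(dependent)) {
                    queue.add(dependent);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = Map.of(
                "api", List.<String>of(),
                "core", List.of("api"),
                "web", List.of("core"),
                "docs", List.<String>of());
        // Only "core" changed: "web" is rebuilt too, "api" and "docs" are skipped.
        System.out.println(modulesToBuild(dependsOn, Set.of("core")));
    }
}
```

The sketch only decides which modules to run. Where the list of changed
modules comes from (SCM status, CI, checksums) is left open on purpose.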

I agree with Enrico that hundreds of modules in a multi-module project
indicate bad project design and that the structure should be split into
separate SCM projects.

I do not agree with the user who said that separate dependencies (in a
segregated SCM project) should declare SNAPSHOT versions. Again, I have
commercial experience at Scheidt & Bachmann, where this approach broke
the consistency of the main product and slowed down software
development. The solution was to teach developers to run the command
"mvn release:prepare release:perform", which I did in the company.
I think Maven is about best practices and we should keep stating them
all the time.

On Sat, Sep 14, 2019 at 3:44 PM Robert Scholte <rf...@apache.org> wrote:

> Tibor, it seems like you're missing the bigger picture.
> The question is similar to what we've discussed in the past: can we
> define
> if surefire should be executed or not?
>
> We should define incremental builds as "should a goal be executed or
> not?", e.g. based on the results of the previous build.
> First of all: calling 'clean' makes it impossible to do incremental builds.
> Next, it is the *plugin-developer* that knows best if the goal should be
> executed or not. Now it is still logic inside the plugin, but if the
> plugin API understands input and output, we can leave it up to Maven to
> decide if a goal should be executed.
> The buildplan now gives us a graph of Maven Projects, but theoretically
> with such changes we could make a graph of goals. And it could detect
> useless calls of goals, because the output is never being used.
> Some might recognize a Gradle concept here, and that's correct. At this
> point they were able to design something that works better compared to
> Maven. For their build cache extension they had to analyze the plugin
> descriptors, marking all parameters as either input or output. And that
> boosts the builds with their extension.
>
> thanks,
> Robert
>
>
> On Sat, 14 Sep 2019 13:37:40 +0200, Tibor Digana <ti...@apache.org>
>
> wrote:
>
> > oh yeah, exactly opposite.
> > Jenkins has several ways to create Maven build configuration and it knows
> > where the repo and workspace is, it knows where to store the archive, it
> > knows when the build failed.
> > We cannot take the responsibility because the build may fail for whatever
> > reason and we do not know whether to keep the folders or delete all
> > "/target" folders or just to delete only the failed one. The user knows
> > it.
> > We cannot archive the folders because we may significantly cause very
> > high
> > disk usage which would be without the control of CI. And we cannot take
> > the
> > responsibility of lifetime of these archives. It is all the property of
> > Jenkins and Jenkins has the feature and management plugins where the
> > workspace may retain for certain period of time, archives are limited in
> > some way. The archives can be stored in another folder and we should not
> > adopt these responsibilities because then we suddenly end up with all the
> > knowledge of the distributed system and then we as maven project would
> > end
> > as unmaintainable project with many more issues in Jira and
> requirements
> > we
> > would be able to find the spare time to develop.
> >
> > On Sat, Sep 14, 2019 at 1:25 PM Romain Manni-Bucau
> > <rm...@gmail.com>
> > wrote:
> >
> >> Tibor, maven is the only one with the logic to give any cache the data
> >> it
> >> needs. Jenkins alone can't since it does not own the reactor nor
> plugin
> >> I/O
> >> values.
> >>
> >> Le sam. 14 sept. 2019 à 12:45, Tibor Digana <ti...@apache.org> a
> >> écrit :
> >>
> >> > But I do not understand why the Maven should be responsible for the
> >> project
> >> > cahe control/management of "/target" directories.
> >> > It is a responsibility of the build manager which is the Jenkins.
> >> > The Jenkins has the ability to archive files and such property already
> >> > exists in the Jenkins.
> >> >
> >> > So the Jenkins has a full knowledge about:
> >> >
> >> > 1. how long the workspace content retains intact
> >> > 2. what commit hash is for the last build/job/branch
> >> > 3. and what commit was successful
> >> >
> >> > If the target directories retain intact (or renewed from archive) in
> >> the
> >> > workspace for very long time and the workspace was reused by the next
> >> build
> >> > then I would say that the improvement should work as it is on CI
> >> level.
> >> >
> >> > Maybe what is necessary is only that improvement in Maven where we
> >> would
> >> > obtain the list of modules or directories of changes in the current
> >> commit.
> >> > Then the Maven can highly optimize its own build steps and build only
> >> those
> >> > modules which have been changed and their dependent modules.
> >> > So the interface between CI and Maven is needed in a kind of
> >> extension or
> >> > the class MavenCli can be extended with some new entrypoint.
> >> >
> >> > But I do not hink that Maven has to take care of responsibilities of
> >> CI
> >> > (project cache mgmt), that's not our task I would say and we as Maven
> >> would
> >> > never know all about the miscellaneous CI specifics and therefore we
> >> would
> >> > not cope with CI related troubles.
> >> >
> >> > Cheers
> >> > Tibor17
> >> >
> >> >
> >> >
> >> > On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rfscholte@apache.org
> >
> >> > wrote:
> >> >
> >> > > On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> >> > > <rm...@gmail.com> wrote:
> >> > >
> >> > > > There are multiple possible incremental support:
> >> > > >
> >> > > > 1. Scm related: do a status and rebuild downstream reactor
> >> > > > 2. Full and module build graph: seems it is the one you target, ie
> >> > bypass
> >> > > > modules without change. Note that it only works if upstream
> graph
> >> is
> >> > > > taken
> >> > > > into account.
> >> > > > 3. Full build: each mojo has incremental support so the full build
> >> gets
> >> > > > it.
> >> > > > Issue is that it requires each mojo to know if it needs to be
> >> executed
> >> > or
> >> > > > give enough info to the mojo executor to do so (gradle requires
> >> all
> >> > > > inputs/outputs to assume this state - which is still just an
> >> heuristic
> >> > > > and
> >> > > > not 100% reliable).
> >> > > >
> >> > > > In current state, 2. sounds like a good option since 3 can
> >> require  a
> >> > > > loot
> >> > > > of work for external plugins (today's builds have a lot more of
> >> not
> >> > maven
> >> > > > provide plugins than core plugins).
> >> > > > Now, we should be able to activate it or not so having a
> >> cacheLocation
> >> > > > config in settings.xml can be good.
> >> > > >
> >> > > > Side notes:
> >> > > >
> >> > > > 1. having it on by default will break builds - reactor is
> >> deterministic
> >> > > > and
> >> > > > bypassing a module can break a build since it can init maven
> >> > properties -
> >> > > > for ex - for next modules
> >> > > > 2. You cant find all in/out paths from the pom in general so your
> >> algo
> >> > is
> >> > > > not generic, a meta config can be needed in .mvn
> >> > > > 3. We should let a mojo be able to disable that to replace default
> >> > logic
> >> > > > (surefire is a good example where it must be refined and it can
> >> save
> >> > > > hours
> >> > > > there ;))
> >> > > > 4. Let's try to impl it as a mvn extension first then if it works
> >> well
> >> > on
> >> > > > multiple big project get it to core?
> >> > >
> >> > > Did anyone Google for "maven extension build cache"? There are
> >> already
> >> > > commercial solutions for it.
> >> > > Even though I would like to see improvements in this area, the old
> >> > > architecture of Maven makes it quite hard to move to that situation.
> >> > > First
> >> > > of all it requires changes to the Plugin API (without breaking
> >> backwards
> >> > > compatibility) to have support out of the box.
> >> > >
> >> > > Robert
> >> > >
> >> > > >
> >> > > > Romain
> >> > > >
> >> > > >
> >> > > >
> >> > > > Le ven. 13 sept. 2019 à 23:18, Tibor Digana
> >> <ti...@apache.org>
> >> a
> >> > > > écrit :
> >> > > >
> >> > > >> In theory, the incremental compiler would make it faster.
> >> > > >> But this can be told only if you present a demo project with has
> >> > trivial
> >> > > >> tests taking much less time to complete than the compiler.
> >> > > >>
> >> > > >> In reality the tests in huge projects take significantly longer
> >> time
> >> > > >> than
> >> > > >> the compiler.
> >> > > >> Some developers say "switch off all the tests" in the release
> >> phase
> >> > but
> >> > > >> that's wrong because then the quality goes down and methodologies
> >> are
> >> > > >> broken.
> >> > > >>
> >> > > >> I can see a big problem that we do not have an interface between
> >> > > >> Surefire
> >> > > >> and Compiler plugin negotiating which tests have been modified
> >> > including
> >> > > >> modules and classes in the entire structure.
> >> > > >>
> >> > > >> Having incremental compiler is easy, just use compiler:3.8.1 or
> >> use
> >> > the
> >> > > >> Takari compiler.
> >> > > >> But IMO the biggest benefit in performance would be after
> having
> >> the
> >> > > >> truly
> >> > > >> incremental test executor.
> >> > > >>
> >> > > >> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> >> > > >> maximilian.novikov@db.com> wrote:
> >> > > >>
> >> > > >> > Hi All,
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > *We want to create upstream change to Maven* to support true
> >> > > >> incremental
> >> > > >> > build for big-sized projects.
> >> > > >> >
> >> > > >> > To raise a pull request we have to pass long chain of Deutsche
> >> > Bank’s
> >> > > >> > internal procedures. So, *before starting the process we would
> >> like
> >> > to
> >> > > >> > get your feedback regarding this feature*.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > *Motivation:*
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Our project is hosted in mono-repo and contains ~600 modules.
> >> All
> >> > > >> modules
> >> > > >> > has the same SNAPSHOT version.
> >> > > >> >
> >> > > >> > There are lot of test automation around this, everything is
> >> tested
> >> > > >> before
> >> > > >> > merge into release branch.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Current setup helps us to simplify build/release/dependency
> >> > management
> >> > > >> for
> >> > > >> > 10+ teams those contribute into codebase. We can release
> >> everything
> >> > in
> >> > > >> > 1-click.
> >> > > >> >
> >> > > >> > The major drawback of such approach is build time: *full local
> >> build
> >> > > >> took
> >> > > >> > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > To speed-up our build we needed 2 features: incremental build
> >> and
> >> > > >> shared
> >> > > >> > cache.
> >> > > >> >
> >> > > >> > Initially we started to think about migration to Gradle or
> >> Bazel.
> >> As
> >> > > >> > migration costs for the mentioned tools were too high, we
> >> decided
> >> to
> >> > > >> add
> >> > > >> > similar functionality into Maven.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Current results we get: *1-2 mins for local build(*-T8*)* if
> >> build
> >> > was
> >> > > >> > cached by CI*, CI build ~5 mins (*-T16*).*
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > *Feature description:*
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > The idea is to calculate checksum for inputs and save outputs
> >> in
> >> > > >> cache.
> >> > > >> >
> >> > > >> > [image: image2019-8-27_20-0-14.png]
> >> > > >> >
> >> > > >> > Each node checksum calculated with:
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > ·         Effective POM hash
> >> > > >> >
> >> > > >> > ·         Sources hash
> >> > > >> >
> >> > > >> > ·         Dependencies hash (dependencies within multi-module
> >> > project)
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Project sources inputs are searched inside project + all paths
> >> from
> >> > > >> > plugins configuration:
> >> > > >> >
> >> > > >> > [image: image2019-8-30_10-28-56.png]
> >> > > >> >
> >> > > >> > How does it work in practice:
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > 1.       CI: runs builds and stores outputs in shared cache
> >> > > >> >
> >> > > >> > 2.       CI: reuse outputs for same inputs, so time is
> >> decreasing
> >> > > >> >
> >> > > >> > 3.       Locally: when I checkout branch and run ‘install’ for
> >> whole
> >> > > >> > project, I get all actual snapshots from remote cache for this
> >> > branch
> >> > > >> >
> >> > > >> > 4.       Locally: if I change multiple modules in tree, only
> >> changed
> >> > > >> > subtree is rebuilt
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Impact on current Maven codebase is very localized
> >> (MojoExecutor,
> >> > > >> where
> >> > > >> we
> >> > > >> > injected cache controller).
> >> > > >> >
> >> > > >> > Caching can be activated/deactivated by property, so current
> >> maven
> >> > > >> flow
> >> > > >> > will work as is.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > And the big plus is that you don’t need to re-work your current
> >> > > >> project.
> >> > > >> > Caching should work out of box, just need to add config in .mvn
> >> > > >> folder.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Please let us know what do you think. We are ready to invest in
> >> this
> >> > > >> > feature and address any further feedback.
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > Kind regards,
> >> > > >> >
> >> > > >> > Max
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >> > ---
> >> > > >> > This e-mail may contain confidential and/or privileged
> >> information.
> >> > If
> >> > > >> you
> >> > > >> > are not the intended recipient (or have received this e-mail in
> >> > error)
> >> > > >> > please notify the sender immediately and delete this e-mail.
> >> Any
> >> > > >> > unauthorized copying, disclosure or distribution of the
> >> material
> >> in
> >> > > >> this
> >> > > >> > e-mail is strictly forbidden.
> >> > > >> >
> >> > > >> > Please refer to https://www.db.com/disclosures for
> additional
> >> EU
> >> > > >> > corporate and regulatory disclosures and to
> >> > > >> > http://www.db.com/unitedkingdom/content/privacy.htm for
> >> information
> >> > > >> about
> >> > > >> > privacy.
> >> > > >> >
> >> > >
> >> > >
> >> ---------------------------------------------------------------------
> >> > > To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> >> > > For additional commands, e-mail: dev-help@maven.apache.org
> >> > >
> >> > >
> >> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Hi Robert,
It seems your thinking matches ours. Indeed, the question is whether a given goal should be executed or not. But under the limitations of the current plugin API, it is not the plugin developer but only the project developer who can tell whether a goal should run, because of the side effects each plugin might have. And yes, if the plugin API defined inputs and outputs and had a more granular execution lifecycle, it would be much friendlier to a caching/incremental approach. That was one of the options for us, but we decided to avoid a huge rework for the sake of easier portability between versions.
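
To make the "should this goal be executed or not?" question concrete, here is a minimal sketch of a checksum-keyed decision, assuming the three inputs named in the proposal (effective POM, sources, dependency checksums) are already available as strings. The class GoalCacheSketch, its methods and the in-memory map are hypothetical stand-ins for illustration; this is not our cache controller and not a Maven API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: decide whether to run a goal from a checksum of its inputs.
final class GoalCacheSketch {

    // Cache key -> stored output; an in-memory stand-in for a local or remote cache.
    private final Map<String, String> cache = new HashMap<>();

    // Combines the effective POM, the sources and the dependency checksums into one key.
    static String checksum(String effectivePom, List<String> sources, List<String> dependencyChecksums) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            update(digest, effectivePom);
            for (String source : sources) {
                update(digest, source);
            }
            for (String dependency : dependencyChecksums) {
                update(digest, dependency);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // The length prefix keeps ("ab", "c") distinct from ("a", "bc").
    private static void update(MessageDigest digest, String part) {
        byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
        digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
        digest.update(bytes);
    }

    // A goal has to run only when no output is stored for these inputs.
    boolean shouldExecute(String key) {
        return !cache.containsKey(key);
    }

    void store(String key, String output) {
        cache.put(key, output);
    }

    public static void main(String[] args) {
        GoalCacheSketch sketch = new GoalCacheSketch();
        String key = checksum("<project/>", List.of("class A {}"), List.<String>of());
        System.out.println(sketch.shouldExecute(key)); // true: nothing cached yet
        sketch.store(key, "module.jar");
        System.out.println(sketch.shouldExecute(key)); // false: same inputs, reuse output
    }
}
```

Note what the sketch cannot see: a side effect that is not reflected in the inputs never changes the key, which is exactly why the project developer still has to decide which goals are safe to skip.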

Thank you


On 2019/09/14 13:44:36, "Robert Scholte" <rf...@apache.org> wrote: 
> Tibor, it seems like you're missing the bigger picture.
> The question is similar to what we've discussed in the past: can we define  
> if surefire should be executed or not?
> 
> We should define incremental builds as "should a goal be executed or  
> not?", e.g. based on the results of the previous build.
> First of all: calling 'clean' makes it impossible to do incremental builds.
> Next, it is the *plugin-developer* that knows best if the goal should be  
> executed or not. Now it is still logic inside the plugin, but if the  
> plugin API understands input and output, we can leave it up to Maven to  
> decide if a goal should be executed.
> The buildplan now gives us a graph of Maven Projects, but theoretically  
> with such changes we could make a graph of goals. And it could detect  
> useless calls of goals, because the output is never being used.
> Some might recognize a Gradle concept here, and that's correct. At this  
> point they were able to design something that works better compared to  
> Maven. For their build cache extension they had to analyze the plugin  
> descriptors, marking all parameters as either input or output. And that  
> boosts the builds with their extension.
> 
> thanks,
> Robert
> 
> 
> On Sat, 14 Sep 2019 13:37:40 +0200, Tibor Digana <ti...@apache.org>  
> wrote:
> 
> > oh yeah, exactly opposite.
> > Jenkins has several ways to create Maven build configuration and it knows
> > where the repo and workspace is, it knows where to store the archive, it
> > knows when the build failed.
> > We cannot take the responsibility because the build may fail for whatever
> > reason and we do not know whether to keep the folders or delete all
> > "/target" folders or just to delete only the failed one. The user knows  
> > it.
> > We cannot archive the folders because we may significantly cause very  
> > high
> > disk usage which would be without the control of CI. And we cannot take  
> > the
> > responsibility of lifetime of these archives. It is all the property of
> > Jenkins and Jenkins has the feature and management plugins where the
> > workspace may retain for certain period of time, archives are limited in
> > some way. The archives can be stored in another folder and we should not
> > adopt these responsibilities because then we suddenly end up with all the
> > knowledge of the distributed system and then we as maven project would  
> > end
> > as unmaintainable project with many more issues in Jira and requirements  
> > we
> > would be able to find the spare time to develop.



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Robert Scholte <rf...@apache.org>.
Tibor, it seems like you're missing the bigger picture.
The question is similar to what we've discussed in the past: can we define
whether surefire should be executed or not?

We should define incremental builds as "should a goal be executed or
not?", e.g. based on the results of the previous build.
First of all: calling 'clean' makes incremental builds impossible.
Next, it is the *plugin developer* who knows best whether the goal should
be executed. Today that logic still lives inside the plugin, but if the
plugin API understood inputs and outputs, we could leave it up to Maven to
decide whether a goal should be executed.
The build plan now gives us a graph of Maven projects, but theoretically,
with such changes, we could build a graph of goals. Maven could then detect
useless goal invocations, i.e. goals whose output is never used.
Some might recognize a Gradle concept here, and that's correct. On this
point they were able to design something that works better than Maven.
For their build cache extension they had to analyze the plugin
descriptors, marking every parameter as either input or output, and that
is what speeds up builds with their extension.

thanks,
Robert
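
To make the "should a goal be executed or not?" question concrete, here is
a minimal sketch, assuming a hypothetical API in which every goal declares
its input and output files. None of the names below (GoalUpToDateCheck,
fingerprint, shouldExecute, recordExecution, the state file) exist in the
Maven Plugin API; they are illustration only, and this is not how Gradle
or any existing cache extension is implemented.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; nothing here is part of the real Maven Plugin API. */
public class GoalUpToDateCheck {

    /** SHA-256 over the path and content of every declared input, in sorted order. */
    public static String fingerprint(List<Path> inputs)
            throws IOException, NoSuchAlgorithmException {
        List<Path> sorted = new ArrayList<>(inputs);
        Collections.sort(sorted); // result must not depend on declaration order
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        for (Path input : sorted) {
            digest.update(input.toString().getBytes(StandardCharsets.UTF_8));
            digest.update(Files.readAllBytes(input));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Returns true when the goal has to run, false when it can be skipped. */
    public static boolean shouldExecute(List<Path> inputs, List<Path> outputs, Path stateFile)
            throws IOException, NoSuchAlgorithmException {
        for (Path output : outputs) {
            if (!Files.exists(output)) {
                return true; // e.g. after 'clean': there is nothing to reuse
            }
        }
        if (!Files.exists(stateFile)) {
            return true; // no previous build was recorded
        }
        String previous = Files.readString(stateFile).trim();
        return !previous.equals(fingerprint(inputs));
    }

    /** Stores the input fingerprint after a successful execution. */
    public static void recordExecution(List<Path> inputs, Path stateFile)
            throws IOException, NoSuchAlgorithmException {
        Files.writeString(stateFile, fingerprint(inputs));
    }
}
```

In this sketch the executor would call shouldExecute before running a goal
and recordExecution after it succeeds. If the state file lives under
"/target", 'clean' deletes it together with the outputs, which is why a
clean build can never be incremental under this scheme.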


On Sat, 14 Sep 2019 13:37:40 +0200, Tibor Digana <ti...@apache.org>  
wrote:

> Oh yeah, exactly the opposite.
> Jenkins has several ways to create a Maven build configuration, and it
> knows where the repo and workspace are, where to store the archive, and
> when the build failed.
> We cannot take on that responsibility, because the build may fail for
> whatever reason and we do not know whether to keep the folders, delete all
> "/target" folders, or delete only the failed one. The user knows that.
> We cannot archive the folders, because we could cause very high disk usage
> that would be outside the control of the CI. And we cannot take
> responsibility for the lifetime of these archives. All of that belongs to
> Jenkins: it has the features and management plugins to retain the
> workspace for a certain period of time and to limit the archives in some
> way. The archives can be stored in another folder. We should not adopt
> these responsibilities, because we would suddenly need all the knowledge of
> a distributed system, and Maven would end up as an unmaintainable project
> with many more Jira issues and requirements than we would be able to find
> the spare time to develop.



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
Oh yeah, exactly the opposite.
Jenkins has several ways to create a Maven build configuration, and it
knows where the repo and workspace are, where to store the archive, and
when the build failed.
We cannot take on that responsibility, because the build may fail for
whatever reason and we do not know whether to keep the folders, delete all
"/target" folders, or delete only the failed one. The user knows that.
We cannot archive the folders, because we could cause very high disk usage
that would be outside the control of the CI. And we cannot take
responsibility for the lifetime of these archives. All of that belongs to
Jenkins: it has the features and management plugins to retain the
workspace for a certain period of time and to limit the archives in some
way. The archives can be stored in another folder. We should not adopt
these responsibilities, because we would suddenly need all the knowledge of
a distributed system, and Maven would end up as an unmaintainable project
with many more Jira issues and requirements than we would be able to find
the spare time to develop.

On Sat, Sep 14, 2019 at 1:25 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Tibor, maven is the only one with the logic to give any cache the data it
> needs. Jenkins alone can't since it does not own the reactor nor plugin I/O
> values.
>
> On Sat, Sep 14, 2019 at 12:45, Tibor Digana <ti...@apache.org> wrote:
>
> > But I do not understand why Maven should be responsible for the
> > project cache control/management of "/target" directories.
> > That is the responsibility of the build manager, which is Jenkins.
> > Jenkins has the ability to archive files, and that feature already
> > exists in Jenkins.
> >
> > So Jenkins has full knowledge of:
> >
> > 1. how long the workspace content remains intact
> > 2. what the commit hash is for the last build/job/branch
> > 3. and which commit was successful
> >
> > If the target directories remain intact (or are restored from an
> > archive) in the workspace for a very long time, and the workspace is
> > reused by the next build, then I would say that the improvement should
> > work as is at the CI level.
> >
> > Maybe the only improvement needed in Maven is a way to obtain the list
> > of modules or directories changed in the current commit.
> > Then Maven can highly optimize its own build steps and build only those
> > modules which have been changed and their dependent modules.
> > So an interface between CI and Maven is needed, as a kind of extension,
> > or the class MavenCli can be extended with some new entry point.
> >
> > But I do not think that Maven has to take on the responsibilities of CI
> > (project cache management). That's not our task, I would say; we as
> > Maven would never know all of the miscellaneous CI specifics and
> > therefore we could not cope with CI-related troubles.
> >
> > Cheers
> > Tibor17
> >
> >
> >
> > On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rf...@apache.org>
> > wrote:
> >
> > > On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> > > <rm...@gmail.com> wrote:
> > >
> > > > There are multiple possible kinds of incremental support:
> > > >
> > > > 1. SCM related: do a status and rebuild the downstream reactor
> > > > 2. Full and module build graph: this seems to be the one you
> > > > target, i.e. bypass modules without changes. Note that it only
> > > > works if the upstream graph is taken into account.
> > > > 3. Full build: each mojo has incremental support, so the full
> > > > build gets it. The issue is that it requires each mojo to know
> > > > whether it needs to be executed, or to give enough info to the
> > > > mojo executor to decide (Gradle requires all inputs/outputs to
> > > > assume this state, which is still just a heuristic and not 100%
> > > > reliable).
> > > >
> > > > In the current state, 2 sounds like a good option since 3 can
> > > > require a lot of work for external plugins (today's builds use
> > > > far more plugins not provided by Maven than core plugins).
> > > > Now, we should be able to activate it or not, so having a
> > > > cacheLocation config in settings.xml can be good.
> > > >
> > > > Side notes:
> > > >
> > > > 1. Having it on by default will break builds: the reactor is
> > > > deterministic, and bypassing a module can break a build since
> > > > that module can, for example, initialize Maven properties for
> > > > the next modules.
> > > > 2. You can't find all in/out paths from the pom in general, so
> > > > your algorithm is not generic; a meta config may be needed in
> > > > .mvn.
> > > > 3. We should let a mojo disable that to replace the default logic
> > > > (surefire is a good example where it must be refined, and it can
> > > > save hours there ;))
> > > > 4. Let's try to implement it as a mvn extension first; then, if
> > > > it works well on multiple big projects, get it into core?
> > >
> > > Did anyone Google for "maven extension build cache"? There are already
> > > commercial solutions for it.
> > > Even though I would like to see improvements in this area, the old
> > > architecture of Maven makes it quite hard to move to that situation.
> > > First of all, it requires changes to the Plugin API (without breaking
> > > backwards compatibility) to have support out of the box.
> > >
> > > Robert
> > >
> > > >
> > > > Romain
> > > >
> > > >
> > > >
> > > > On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org>
> > > > wrote:
> > > >
> > > >> In theory, the incremental compiler would make it faster.
> > > >> But that can only be claimed if you present a demo project which
> > > >> has trivial tests taking much less time to complete than the
> > > >> compiler.
> > > >>
> > > >> In reality, the tests in huge projects take significantly longer
> > > >> than the compiler.
> > > >> Some developers say "switch off all the tests" in the release
> > > >> phase, but that's wrong because then the quality goes down and
> > > >> methodologies are broken.
> > > >>
> > > >> I can see a big problem: we do not have an interface between
> > > >> Surefire and the Compiler plugin for negotiating which tests have
> > > >> been modified, including modules and classes in the entire
> > > >> structure.
> > > >>
> > > >> Having an incremental compiler is easy: just use compiler:3.8.1 or
> > > >> the Takari compiler.
> > > >> But IMO the biggest performance benefit would come from having a
> > > >> truly incremental test executor.
> > > >>
> > > >> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> > > >> maximilian.novikov@db.com> wrote:
> > > >>
> > > >> > Hi All,
> > > >> >
> > > >> > *We want to create an upstream change to Maven* to support true
> > > >> > incremental builds for big-sized projects.
> > > >> >
> > > >> > To raise a pull request we have to pass a long chain of Deutsche
> > > >> > Bank's internal procedures. So, *before starting the process we
> > > >> > would like to get your feedback regarding this feature*.
> > > >> >
> > > >> > *Motivation:*
> > > >> >
> > > >> > Our project is hosted in a mono-repo and contains ~600 modules.
> > > >> > All modules have the same SNAPSHOT version.
> > > >> >
> > > >> > There is a lot of test automation around this; everything is
> > > >> > tested before merging into the release branch.
> > > >> >
> > > >> > The current setup helps us simplify build/release/dependency
> > > >> > management for the 10+ teams that contribute to the codebase. We
> > > >> > can release everything in 1 click.
> > > >> >
> > > >> > The major drawback of this approach is build time: *a full local
> > > >> > build took 45-60 min (-T8), a CI build ~25 min (-T16)*.
> > > >> >
> > > >> > To speed up our build we needed 2 features: incremental build and
> > > >> > shared cache.
> > > >> >
> > > >> > Initially we started to think about migrating to Gradle or Bazel.
> > > >> > As the migration costs for those tools were too high, we decided
> > > >> > to add similar functionality to Maven.
> > > >> >
> > > >> > Current results: *1-2 mins for a local build (-T8) if the build
> > > >> > was cached by CI, and ~5 mins for a CI build (-T16).*
> > > >> >
> > > >> > *Feature description:*
> > > >> >
> > > >> > The idea is to calculate a checksum for the inputs and save the
> > > >> > outputs in a cache.
> > > >> >
> > > >> > [image: image2019-8-27_20-0-14.png]
> > > >> >
> > > >> > Each node checksum is calculated from:
> > > >> >
> > > >> > · Effective POM hash
> > > >> > · Sources hash
> > > >> > · Dependencies hash (dependencies within the multi-module project)
> > > >> >
> > > >> > Project source inputs are searched for inside the project plus
> > > >> > all paths from the plugin configuration:
> > > >> >
> > > >> > [image: image2019-8-30_10-28-56.png]
> > > >> >
> > > >> > How it works in practice:
> > > >> >
> > > >> > 1. CI: runs builds and stores outputs in the shared cache
> > > >> > 2. CI: reuses outputs for the same inputs, so build time decreases
> > > >> > 3. Locally: when I check out a branch and run 'install' for the
> > > >> > whole project, I get all current snapshots from the remote cache
> > > >> > for this branch
> > > >> > 4. Locally: if I change multiple modules in the tree, only the
> > > >> > changed subtree is rebuilt
> > > >> >
> > > >> > The impact on the current Maven codebase is very localized
> > > >> > (MojoExecutor, where we injected the cache controller).
> > > >> >
> > > >> > Caching can be activated/deactivated by a property, so the
> > > >> > current Maven flow will work as is.
> > > >> >
> > > >> > And the big plus is that you don't need to rework your current
> > > >> > project. Caching should work out of the box; you just need to add
> > > >> > config in the .mvn folder.
> > > >> >
> > > >> > Please let us know what you think. We are ready to invest in this
> > > >> > feature and address any further feedback.
> > > >> >
> > > >> >
> > > >> >
> > > >> > Kind regards,
> > > >> >
> > > >> > Max
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > ---
> > > >> > This e-mail may contain confidential and/or privileged
> information.
> > If
> > > >> you
> > > >> > are not the intended recipient (or have received this e-mail in
> > error)
> > > >> > please notify the sender immediately and delete this e-mail. Any
> > > >> > unauthorized copying, disclosure or distribution of the material
> in
> > > >> this
> > > >> > e-mail is strictly forbidden.
> > > >> >
> > > >> > Please refer to https://www.db.com/disclosures for additional EU
> > > >> > corporate and regulatory disclosures and to
> > > >> > http://www.db.com/unitedkingdom/content/privacy.htm for
> information
> > > >> about
> > > >> > privacy.
> > > >> >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> > > For additional commands, e-mail: dev-help@maven.apache.org
> > >
> > >
> >
>

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Tibor, Maven is the only one with the logic to give any cache the data it
needs. Jenkins alone can't, since it owns neither the reactor nor the plugin
I/O values.

On Sat, Sep 14, 2019 at 12:45, Tibor Digana <ti...@apache.org> wrote:

> But I do not understand why Maven should be responsible for controlling
> and managing a project cache of "/target" directories.
> That is the responsibility of the build manager, which is Jenkins.
> Jenkins can archive files, and that capability already exists in Jenkins.
>
> So Jenkins has full knowledge of:
>
> 1. how long the workspace content remains intact
> 2. what the commit hash is for the last build/job/branch
> 3. and which commit was successful
>
> If the target directories remain intact (or are restored from an archive)
> in the workspace for a very long time and the workspace is reused by the
> next build, then I would say that the improvement should work as it is at
> the CI level.
>
> Maybe the only thing needed is an improvement in Maven that lets it
> obtain the list of modules or directories changed in the current commit.
> Then Maven can highly optimize its own build steps and build only those
> modules which have been changed and their dependent modules.
> So an interface between CI and Maven is needed, as a kind of extension, or
> the class MavenCli can be extended with some new entry point.
>
> But I do not think that Maven has to take on the responsibilities of CI
> (project cache mgmt). That's not our task, I would say; we as Maven would
> never know all the miscellaneous CI specifics, and therefore we would
> not be able to cope with CI-related troubles.
>
> Cheers
> Tibor17
>
>
>
> On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rf...@apache.org>
> wrote:
>
> > On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> > <rm...@gmail.com> wrote:
> >
> > > There are multiple possible kinds of incremental support:
> > >
> > > 1. SCM related: do a status and rebuild the downstream reactor.
> > > 2. Full and module build graph: this seems to be the one you target,
> > > i.e. bypass modules without changes. Note that it only works if the
> > > upstream graph is taken into account.
> > > 3. Full build: each mojo has incremental support, so the full build
> > > gets it. The issue is that it requires each mojo to know whether it
> > > needs to be executed, or to give enough info to the mojo executor to
> > > decide (Gradle requires all inputs/outputs to assume this state, which
> > > is still just a heuristic and not 100% reliable).
> > >
> > > In the current state, 2. sounds like a good option since 3. can require
> > > a lot of work for external plugins (today's builds have many more
> > > plugins not provided by Maven than core plugins).
> > > Now, we should be able to activate it or not, so having a cacheLocation
> > > config in settings.xml can be good.
> > >
> > > Side notes:
> > >
> > > 1. Having it on by default will break builds: the reactor is
> > > deterministic, and bypassing a module can break a build since that
> > > module can, for example, initialize Maven properties for later modules.
> > > 2. You can't find all in/out paths from the pom in general, so your
> > > algorithm is not generic; a meta config may be needed in .mvn.
> > > 3. We should let a mojo disable that and replace the default logic
> > > (surefire is a good example where it must be refined, and it can save
> > > hours there ;))
> > > 4. Let's try to implement it as a mvn extension first; then, if it
> > > works well on multiple big projects, get it into core?
> >
> > Did anyone Google for "maven extension build cache"? There are already
> > commercial solutions for it.
> > Even though I would like to see improvements in this area, the old
> > architecture of Maven makes it quite hard to move to that situation.
> > First
> > of all it requires changes to the Plugin API (without breaking backwards
> > compatibility) to have support out of the box.
> >
> > Robert
> >
> > >
> > > Romain
> > >
> > >
> > >
> > > On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org> wrote:
> > >
> > >> In theory, the incremental compiler would make it faster.
> > >> But this can be shown only if you present a demo project which has
> > >> trivial tests taking much less time to complete than the compiler.
> > >>
> > >> In reality the tests in huge projects take significantly longer than
> > >> the compiler.
> > >> Some developers say "switch off all the tests" in the release phase,
> > >> but that's wrong because then the quality goes down and methodologies
> > >> are broken.
> > >>
> > >> I can see a big problem in that we do not have an interface between
> > >> Surefire and the Compiler plugin negotiating which tests have been
> > >> modified, including modules and classes in the entire structure.
> > >>
> > >> Having an incremental compiler is easy: just use compiler:3.8.1 or use
> > >> the Takari compiler.
> > >> But IMO the biggest performance benefit would come from having a truly
> > >> incremental test executor.
> > >>
> > >> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> > >> maximilian.novikov@db.com> wrote:
> > >>
> > >> > Hi All,
> > >> >
> > >> >
> > >> >
> > >> > *We want to create upstream change to Maven* to support true
> > >> incremental
> > >> > build for big-sized projects.
> > >> >
> > >> > To raise a pull request we have to pass long chain of Deutsche
> Bank’s
> > >> > internal procedures. So, *before starting the process we would like
> to
> > >> > get your feedback regarding this feature*.
> > >> >
> > >> >
> > >> >
> > >> > *Motivation:*
> > >> >
> > >> >
> > >> >
> > >> > Our project is hosted in mono-repo and contains ~600 modules. All
> > >> modules
> > >> > has the same SNAPSHOT version.
> > >> >
> > >> > There are lot of test automation around this, everything is tested
> > >> before
> > >> > merge into release branch.
> > >> >
> > >> >
> > >> >
> > >> > Current setup helps us to simplify build/release/dependency
> management
> > >> for
> > >> > 10+ teams those contribute into codebase. We can release everything
> in
> > >> > 1-click.
> > >> >
> > >> > The major drawback of such approach is build time: *full local build
> > >> took
> > >> > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> > >> >
> > >> >
> > >> >
> > >> > To speed-up our build we needed 2 features: incremental build and
> > >> shared
> > >> > cache.
> > >> >
> > >> > Initially we started to think about migration to Gradle or Bazel. As
> > >> > migration costs for the mentioned tools were too high, we decided to
> > >> add
> > >> > similar functionality into Maven.
> > >> >
> > >> >
> > >> >
> > >> > Current results we get: *1-2 mins for local build(*-T8*)* if build
> was
> > >> > cached by CI*, CI build ~5 mins (*-T16*).*
> > >> >
> > >> >
> > >> >
> > >> > *Feature description:*
> > >> >
> > >> >
> > >> >
> > >> > The idea is to calculate checksum for inputs and save outputs in
> > >> cache.
> > >> >
> > >> > [image: image2019-8-27_20-0-14.png]
> > >> >
> > >> > Each node checksum calculated with:
> > >> >
> > >> >
> > >> >
> > >> > ·         Effective POM hash
> > >> >
> > >> > ·         Sources hash
> > >> >
> > >> > ·         Dependencies hash (dependencies within multi-module
> project)
> > >> >
> > >> >
> > >> >
> > >> > Project sources inputs are searched inside project + all paths from
> > >> > plugins configuration:
> > >> >
> > >> > [image: image2019-8-30_10-28-56.png]
> > >> >
> > >> > How does it work in practice:
> > >> >
> > >> >
> > >> >
> > >> > 1.       CI: runs builds and stores outputs in shared cache
> > >> >
> > >> > 2.       CI: reuse outputs for same inputs, so time is decreasing
> > >> >
> > >> > 3.       Locally: when I checkout branch and run ‘install’ for whole
> > >> > project, I get all actual snapshots from remote cache for this
> branch
> > >> >
> > >> > 4.       Locally: if I change multiple modules in tree, only changed
> > >> > subtree is rebuilt
> > >> >
> > >> >
> > >> >
> > >> > Impact on current Maven codebase is very localized (MojoExecutor,
> > >> where
> > >> we
> > >> > injected cache controller).
> > >> >
> > >> > Caching can be activated/deactivated by property, so current maven
> > >> flow
> > >> > will work as is.
> > >> >
> > >> >
> > >> >
> > >> > And the big plus is that you don’t need to re-work your current
> > >> project.
> > >> > Caching should work out of box, just need to add config in .mvn
> > >> folder.
> > >> >
> > >> >
> > >> >
> > >> > Please let us know what do you think. We are ready to invest in this
> > >> > feature and address any further feedback.
> > >> >
> > >> >
> > >> >
> > >> > Kind regards,
> > >> >
> > >> > Max
> > >> >
> > >> >
> > >> >
> > >> >

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
But I do not understand why Maven should be responsible for controlling and
managing a project cache of "/target" directories.
That is the responsibility of the build manager, which is Jenkins.
Jenkins can archive files, and that capability already exists in Jenkins.

So Jenkins has full knowledge of:

1. how long the workspace content remains intact
2. what the commit hash is for the last build/job/branch
3. and which commit was successful

If the target directories remain intact (or are restored from an archive) in
the workspace for a very long time and the workspace is reused by the next
build, then I would say that the improvement should work as it is at the CI
level.

Maybe the only thing needed is an improvement in Maven that lets it
obtain the list of modules or directories changed in the current commit.
Then Maven can highly optimize its own build steps and build only those
modules which have been changed and their dependent modules.
So an interface between CI and Maven is needed, as a kind of extension, or
the class MavenCli can be extended with some new entry point.
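
A minimal Python sketch of that selection step, assuming the CI hands over the set of changed modules and the in-reactor dependency graph is known. The module names and the `modules_to_rebuild` helper are hypothetical illustrations, not an existing Maven API:

```python
from collections import deque


def modules_to_rebuild(changed, depends_on):
    """Return the changed modules plus everything depending on them, transitively."""
    # Invert the edges: module -> set of modules that depend on it.
    dependents = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    result = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, ()):
            if downstream not in result:
                result.add(downstream)
                queue.append(downstream)
    return result


# Hypothetical reactor: web -> api -> core, tools stands alone.
depends_on = {
    "core": [],
    "api": ["core"],
    "web": ["api"],
    "tools": [],
}

print(sorted(modules_to_rebuild({"api"}, depends_on)))
# prints ['api', 'web']
```

For a hand-picked module list, Maven's existing `-pl`/`--projects` and `-amd`/`--also-make-dependents` options already give a similar selection; the sketch only shows what a CI-provided change list would have to be expanded into.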

But I do not think that Maven has to take on the responsibilities of CI
(project cache mgmt). That's not our task, I would say; we as Maven would
never know all the miscellaneous CI specifics, and therefore we would
not be able to cope with CI-related troubles.

Cheers
Tibor17



On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <rf...@apache.org>
wrote:

> On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau
> <rm...@gmail.com> wrote:
>
> > There are multiple possible kinds of incremental support:
> >
> > 1. SCM related: do a status and rebuild the downstream reactor.
> > 2. Full and module build graph: this seems to be the one you target,
> > i.e. bypass modules without changes. Note that it only works if the
> > upstream graph is taken into account.
> > 3. Full build: each mojo has incremental support, so the full build
> > gets it. The issue is that it requires each mojo to know whether it
> > needs to be executed, or to give enough info to the mojo executor to
> > decide (Gradle requires all inputs/outputs to assume this state, which
> > is still just a heuristic and not 100% reliable).
> >
> > In the current state, 2. sounds like a good option since 3. can require
> > a lot of work for external plugins (today's builds have many more
> > plugins not provided by Maven than core plugins).
> > Now, we should be able to activate it or not, so having a cacheLocation
> > config in settings.xml can be good.
> >
> > Side notes:
> >
> > 1. Having it on by default will break builds: the reactor is
> > deterministic, and bypassing a module can break a build since that
> > module can, for example, initialize Maven properties for later modules.
> > 2. You can't find all in/out paths from the pom in general, so your
> > algorithm is not generic; a meta config may be needed in .mvn.
> > 3. We should let a mojo disable that and replace the default logic
> > (surefire is a good example where it must be refined, and it can save
> > hours there ;))
> > 4. Let's try to implement it as a mvn extension first; then, if it
> > works well on multiple big projects, get it into core?
>
> Did anyone Google for "maven extension build cache"? There are already
> commercial solutions for it.
> Even though I would like to see improvements in this area, the old
> architecture of Maven makes it quite hard to move to that situation.
> First
> of all it requires changes to the Plugin API (without breaking backwards
> compatibility) to have support out of the box.
>
> Robert
>
> >
> > Romain
> >
> >
> >
> > On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org> wrote:
> >
> >> In theory, the incremental compiler would make it faster.
> >> But this can be shown only if you present a demo project which has
> >> trivial tests taking much less time to complete than the compiler.
> >>
> >> In reality the tests in huge projects take significantly longer than
> >> the compiler.
> >> Some developers say "switch off all the tests" in the release phase,
> >> but that's wrong because then the quality goes down and methodologies
> >> are broken.
> >>
> >> I can see a big problem in that we do not have an interface between
> >> Surefire and the Compiler plugin negotiating which tests have been
> >> modified, including modules and classes in the entire structure.
> >>
> >> Having an incremental compiler is easy: just use compiler:3.8.1 or use
> >> the Takari compiler.
> >> But IMO the biggest performance benefit would come from having a truly
> >> incremental test executor.
> >>
> >> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> >> maximilian.novikov@db.com> wrote:
> >>
> >> > Hi All,
> >> >
> >> >
> >> >
> >> > *We want to create upstream change to Maven* to support true
> >> incremental
> >> > build for big-sized projects.
> >> >
> >> > To raise a pull request we have to pass long chain of Deutsche Bank’s
> >> > internal procedures. So, *before starting the process we would like to
> >> > get your feedback regarding this feature*.
> >> >
> >> >
> >> >
> >> > *Motivation:*
> >> >
> >> >
> >> >
> >> > Our project is hosted in mono-repo and contains ~600 modules. All
> >> modules
> >> > has the same SNAPSHOT version.
> >> >
> >> > There are lot of test automation around this, everything is tested
> >> before
> >> > merge into release branch.
> >> >
> >> >
> >> >
> >> > Current setup helps us to simplify build/release/dependency management
> >> for
> >> > 10+ teams those contribute into codebase. We can release everything in
> >> > 1-click.
> >> >
> >> > The major drawback of such approach is build time: *full local build
> >> took
> >> > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> >> >
> >> >
> >> >
> >> > To speed-up our build we needed 2 features: incremental build and
> >> shared
> >> > cache.
> >> >
> >> > Initially we started to think about migration to Gradle or Bazel. As
> >> > migration costs for the mentioned tools were too high, we decided to
> >> add
> >> > similar functionality into Maven.
> >> >
> >> >
> >> >
> >> > Current results we get: *1-2 mins for local build(*-T8*)* if build was
> >> > cached by CI*, CI build ~5 mins (*-T16*).*
> >> >
> >> >
> >> >
> >> > *Feature description:*
> >> >
> >> >
> >> >
> >> > The idea is to calculate checksum for inputs and save outputs in
> >> cache.
> >> >
> >> > [image: image2019-8-27_20-0-14.png]
> >> >
> >> > Each node checksum calculated with:
> >> >
> >> >
> >> >
> >> > ·         Effective POM hash
> >> >
> >> > ·         Sources hash
> >> >
> >> > ·         Dependencies hash (dependencies within multi-module project)
> >> >
> >> >
> >> >
> >> > Project sources inputs are searched inside project + all paths from
> >> > plugins configuration:
> >> >
> >> > [image: image2019-8-30_10-28-56.png]
> >> >
> >> > How does it work in practice:
> >> >
> >> >
> >> >
> >> > 1.       CI: runs builds and stores outputs in shared cache
> >> >
> >> > 2.       CI: reuse outputs for same inputs, so time is decreasing
> >> >
> >> > 3.       Locally: when I checkout branch and run ‘install’ for whole
> >> > project, I get all actual snapshots from remote cache for this branch
> >> >
> >> > 4.       Locally: if I change multiple modules in tree, only changed
> >> > subtree is rebuilt
> >> >
> >> >
> >> >
> >> > Impact on current Maven codebase is very localized (MojoExecutor,
> >> where
> >> we
> >> > injected cache controller).
> >> >
> >> > Caching can be activated/deactivated by property, so current maven
> >> flow
> >> > will work as is.
> >> >
> >> >
> >> >
> >> > And the big plus is that you don’t need to re-work your current
> >> project.
> >> > Caching should work out of box, just need to add config in .mvn
> >> folder.
> >> >
> >> >
> >> >
> >> > Please let us know what do you think. We are ready to invest in this
> >> > feature and address any further feedback.
> >> >
> >> >
> >> >
> >> > Kind regards,
> >> >
> >> > Max
> >> >
> >> >
> >> >
> >> >

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Robert Scholte <rf...@apache.org>.
On Fri, 13 Sep 2019 23:37:15 +0200, Romain Manni-Bucau  
<rm...@gmail.com> wrote:

> There are multiple possible kinds of incremental support:
>
> 1. SCM related: do a status and rebuild the downstream reactor.
> 2. Full and module build graph: this seems to be the one you target,
> i.e. bypass modules without changes. Note that it only works if the
> upstream graph is taken into account.
> 3. Full build: each mojo has incremental support, so the full build
> gets it. The issue is that it requires each mojo to know whether it
> needs to be executed, or to give enough info to the mojo executor to
> decide (Gradle requires all inputs/outputs to assume this state, which
> is still just a heuristic and not 100% reliable).
>
> In the current state, 2. sounds like a good option since 3. can require
> a lot of work for external plugins (today's builds have many more
> plugins not provided by Maven than core plugins).
> Now, we should be able to activate it or not, so having a cacheLocation
> config in settings.xml can be good.
>
> Side notes:
>
> 1. Having it on by default will break builds: the reactor is
> deterministic, and bypassing a module can break a build since that
> module can, for example, initialize Maven properties for later modules.
> 2. You can't find all in/out paths from the pom in general, so your
> algorithm is not generic; a meta config may be needed in .mvn.
> 3. We should let a mojo disable that and replace the default logic
> (surefire is a good example where it must be refined, and it can save
> hours there ;))
> 4. Let's try to implement it as a mvn extension first; then, if it
> works well on multiple big projects, get it into core?

Did anyone Google for "maven extension build cache"? There are already  
commercial solutions for it.
Even though I would like to see improvements in this area, the old  
architecture of Maven makes it quite hard to move to that situation. First  
of all it requires changes to the Plugin API (without breaking backwards  
compatibility) to have support out of the box.

Robert

>
> Romain
>
>
>
> On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org> wrote:
>
>> In theory, the incremental compiler would make it faster.
>> But this can be shown only if you present a demo project which has
>> trivial tests taking much less time to complete than the compiler.
>>
>> In reality the tests in huge projects take significantly longer than
>> the compiler.
>> Some developers say "switch off all the tests" in the release phase,
>> but that's wrong because then the quality goes down and methodologies
>> are broken.
>>
>> I can see a big problem in that we do not have an interface between
>> Surefire and the Compiler plugin negotiating which tests have been
>> modified, including modules and classes in the entire structure.
>>
>> Having an incremental compiler is easy: just use compiler:3.8.1 or use
>> the Takari compiler.
>> But IMO the biggest performance benefit would come from having a truly
>> incremental test executor.
>>
>> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
>> maximilian.novikov@db.com> wrote:
>>
>> > Hi All,
>> >
>> >
>> >
>> > *We want to create upstream change to Maven* to support true  
>> incremental
>> > build for big-sized projects.
>> >
>> > To raise a pull request we have to pass long chain of Deutsche Bank’s
>> > internal procedures. So, *before starting the process we would like to
>> > get your feedback regarding this feature*.
>> >
>> >
>> >
>> > *Motivation:*
>> >
>> >
>> >
>> > Our project is hosted in mono-repo and contains ~600 modules. All  
>> modules
>> > has the same SNAPSHOT version.
>> >
>> > There are lot of test automation around this, everything is tested  
>> before
>> > merge into release branch.
>> >
>> >
>> >
>> > Current setup helps us to simplify build/release/dependency management
>> for
>> > 10+ teams those contribute into codebase. We can release everything in
>> > 1-click.
>> >
>> > The major drawback of such approach is build time: *full local build  
>> took
>> > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
>> >
>> >
>> >
>> > To speed-up our build we needed 2 features: incremental build and  
>> shared
>> > cache.
>> >
>> > Initially we started to think about migration to Gradle or Bazel. As
>> > migration costs for the mentioned tools were too high, we decided to  
>> add
>> > similar functionality into Maven.
>> >
>> >
>> >
>> > Current results we get: *1-2 mins for local build(*-T8*)* if build was
>> > cached by CI*, CI build ~5 mins (*-T16*).*
>> >
>> >
>> >
>> > *Feature description:*
>> >
>> >
>> >
>> > The idea is to calculate checksum for inputs and save outputs in  
>> cache.
>> >
>> > [image: image2019-8-27_20-0-14.png]
>> >
>> > Each node checksum calculated with:
>> >
>> >
>> >
>> > ·         Effective POM hash
>> >
>> > ·         Sources hash
>> >
>> > ·         Dependencies hash (dependencies within multi-module project)
>> >
>> >
>> >
>> > Project sources inputs are searched inside project + all paths from
>> > plugins configuration:
>> >
>> > [image: image2019-8-30_10-28-56.png]
>> >
>> > How does it work in practice:
>> >
>> >
>> >
>> > 1.       CI: runs builds and stores outputs in shared cache
>> >
>> > 2.       CI: reuse outputs for same inputs, so time is decreasing
>> >
>> > 3.       Locally: when I checkout branch and run ‘install’ for whole
>> > project, I get all actual snapshots from remote cache for this branch
>> >
>> > 4.       Locally: if I change multiple modules in tree, only changed
>> > subtree is rebuilt
>> >
>> >
>> >
>> > Impact on current Maven codebase is very localized (MojoExecutor,  
>> where
>> we
>> > injected cache controller).
>> >
>> > Caching can be activated/deactivated by property, so current maven  
>> flow
>> > will work as is.
>> >
>> >
>> >
>> > And the big plus is that you don’t need to re-work your current  
>> project.
>> > Caching should work out of box, just need to add config in .mvn  
>> folder.
>> >
>> >
>> >
>> > Please let us know what do you think. We are ready to invest in this
>> > feature and address any further feedback.
>> >
>> >
>> >
>> > Kind regards,
>> >
>> > Max
>> >
>> >
>> >
>> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
The feature doesn't do per-test or per-class analysis. The granularity of the incremental build is per project: if a project is invalidated by changes in its dependencies, source code, or plugins, it will be rebuilt fully. So a single module of a multi-module build doesn't get any incremental benefit on its own, but for multi-module projects with hundreds of modules, just skipping the unaffected part of the graph is a huge win.
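
A minimal Python sketch of that per-project granularity, assuming each module's cache key is a SHA-256 over its effective POM hash, its sources hash, and the keys of its in-reactor dependencies (the three inputs named in the proposal). The module names, the layout of the `modules` dictionary, and the `module_key`/`plan_build` helpers are hypothetical and are not the actual implementation:

```python
import hashlib


def module_key(name, modules, memo):
    """Cache key of a module: hash of its POM, sources and dependency keys."""
    if name not in memo:
        module = modules[name]
        digest = hashlib.sha256()
        for part in (module["pom"], module["sources"]):
            digest.update(part.encode("utf-8"))
            digest.update(b"\0")  # separator so adjacent fields cannot blur together
        # A change in any upstream module changes its key, and therefore ours.
        for dep in sorted(module["deps"]):
            digest.update(module_key(dep, modules, memo).encode("utf-8"))
        memo[name] = digest.hexdigest()
    return memo[name]


def plan_build(modules, cache):
    """Mark each module 'restore' if its key is cached, else 'rebuild'."""
    memo = {}
    return {
        name: "restore" if module_key(name, modules, memo) in cache else "rebuild"
        for name in modules
    }


# Hypothetical reactor: web -> api -> core, tools stands alone.
modules = {
    "core": {"pom": "pom-core", "sources": "src-core-v1", "deps": []},
    "api": {"pom": "pom-api", "sources": "src-api-v1", "deps": ["core"]},
    "web": {"pom": "pom-web", "sources": "src-web-v1", "deps": ["api"]},
    "tools": {"pom": "pom-tools", "sources": "src-tools-v1", "deps": []},
}

# Pretend CI already built everything and stored the keys in the shared cache.
memo = {}
cache = {module_key(name, modules, memo) for name in modules}

# A local change to the sources of "api" invalidates "api" and "web" only.
modules["api"]["sources"] = "src-api-v2"
plan = plan_build(modules, cache)
print(sorted(name for name, action in plan.items() if action == "rebuild"))
# prints ['api', 'web']
```

Nothing inside a rebuilt module is reused in this model; the saving comes only from the modules whose keys are already present in the cache.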

Sincerely yours, Aleks

On 2019/09/13 21:48:37, Tibor Digana <ti...@apache.org> wrote: 
> Disabling Surefire/Failsafe in a particular module is easy, but it won't
> improve performance much if you do not analyse the relations between
> classes and the tests.
> 
> If you analyse the relations then you can easily pass the list of
> tests in -Dtest or in the included/excludedTests. So everything is in
> Surefire, but I guess that the analysis of the code would be so time
> demanding that it would be a counterproductive effort on our side.
> 
> Regarding the cache, the repo is the cache. And if the CI build deletes the
> repo, we would not know it today.
> So these performance improvements must be an optional feature enabled only
> by the user and not by default.
> 
> On Fri, Sep 13, 2019 at 11:37 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
> 
> > There are multiple possible kinds of incremental support:
> >
> > 1. SCM related: do a status and rebuild the downstream reactor.
> > 2. Full and module build graph: this seems to be the one you target,
> > i.e. bypass modules without changes. Note that it only works if the
> > upstream graph is taken into account.
> > 3. Full build: each mojo has incremental support, so the full build
> > gets it. The issue is that it requires each mojo to know whether it
> > needs to be executed, or to give enough info to the mojo executor to
> > decide (Gradle requires all inputs/outputs to assume this state, which
> > is still just a heuristic and not 100% reliable).
> >
> > In the current state, 2. sounds like a good option since 3. can require
> > a lot of work for external plugins (today's builds have many more
> > plugins not provided by Maven than core plugins).
> > Now, we should be able to activate it or not, so having a cacheLocation
> > config in settings.xml can be good.
> >
> > Side notes:
> >
> > 1. Having it on by default will break builds: the reactor is
> > deterministic, and bypassing a module can break a build since that
> > module can, for example, initialize Maven properties for later modules.
> > 2. You can't find all in/out paths from the pom in general, so your
> > algorithm is not generic; a meta config may be needed in .mvn.
> > 3. We should let a mojo disable that and replace the default logic
> > (surefire is a good example where it must be refined, and it can save
> > hours there ;))
> > 4. Let's try to implement it as a mvn extension first; then, if it
> > works well on multiple big projects, get it into core?
> >
> > Romain
> >
> >
> >
> > On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org> wrote:
> >
> > > In theory, the incremental compiler would make it faster.
> > > But this can be shown only if you present a demo project which has
> > > trivial tests taking much less time to complete than the compiler.
> > >
> > > In reality the tests in huge projects take significantly longer than
> > > the compiler.
> > > Some developers say "switch off all the tests" in the release phase,
> > > but that's wrong because then the quality goes down and methodologies
> > > are broken.
> > >
> > > I can see a big problem in that we do not have an interface between
> > > Surefire and the Compiler plugin negotiating which tests have been
> > > modified, including modules and classes in the entire structure.
> > >
> > > Having an incremental compiler is easy: just use compiler:3.8.1 or use
> > > the Takari compiler.
> > > But IMO the biggest performance benefit would come from having a truly
> > > incremental test executor.
> > >
> > > On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> > > maximilian.novikov@db.com> wrote:
> > >
> > > > Hi All,
> > > >
> > > >
> > > >
> > > > *We want to create upstream change to Maven* to support true
> > incremental
> > > > build for big-sized projects.
> > > >
> > > > To raise a pull request we have to pass long chain of Deutsche Bank’s
> > > > internal procedures. So, *before starting the process we would like to
> > > > get your feedback regarding this feature*.
> > > >
> > > >
> > > >
> > > > *Motivation:*
> > > >
> > > >
> > > >
> > > > Our project is hosted in mono-repo and contains ~600 modules. All
> > modules
> > > > has the same SNAPSHOT version.
> > > >
> > > > There are lot of test automation around this, everything is tested
> > before
> > > > merge into release branch.
> > > >
> > > >
> > > >
> > > > Current setup helps us to simplify build/release/dependency management
> > > for
> > > > 10+ teams those contribute into codebase. We can release everything in
> > > > 1-click.
> > > >
> > > > The major drawback of such approach is build time: *full local build
> > took
> > > > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> > > >
> > > >
> > > >
> > > > To speed-up our build we needed 2 features: incremental build and
> > shared
> > > > cache.
> > > >
> > > > Initially we started to think about migration to Gradle or Bazel. As
> > > > migration costs for the mentioned tools were too high, we decided to
> > add
> > > > similar functionality into Maven.
> > > >
> > > >
> > > >
> > > > Current results we get: *1-2 mins for local build(*-T8*)* if build was
> > > > cached by CI*, CI build ~5 mins (*-T16*).*
> > > >
> > > >
> > > >
> > > > *Feature description:*
> > > >
> > > >
> > > >
> > > > The idea is to calculate checksum for inputs and save outputs in cache.
> > > >
> > > > [image: image2019-8-27_20-0-14.png]
> > > >
> > > > Each node checksum calculated with:
> > > >
> > > >
> > > >
> > > > ·         Effective POM hash
> > > >
> > > > ·         Sources hash
> > > >
> > > > ·         Dependencies hash (dependencies within multi-module project)
> > > >
> > > >
> > > >
> > > > Project sources inputs are searched inside project + all paths from
> > > > plugins configuration:
> > > >
> > > > [image: image2019-8-30_10-28-56.png]
> > > >
> > > > How does it work in practice:
> > > >
> > > >
> > > >
> > > > 1.       CI: runs builds and stores outputs in shared cache
> > > >
> > > > 2.       CI: reuse outputs for same inputs, so time is decreasing
> > > >
> > > > 3.       Locally: when I checkout branch and run ‘install’ for whole
> > > > project, I get all actual snapshots from remote cache for this branch
> > > >
> > > > 4.       Locally: if I change multiple modules in tree, only changed
> > > > subtree is rebuilt
> > > >
> > > >
> > > >
> > > > Impact on current Maven codebase is very localized (MojoExecutor, where
> > > we
> > > > injected cache controller).
> > > >
> > > > Caching can be activated/deactivated by property, so current maven flow
> > > > will work as is.
> > > >
> > > >
> > > >
> > > > And the big plus is that you don’t need to re-work your current
> > project.
> > > > Caching should work out of box, just need to add config in .mvn folder.
> > > >
> > > >
> > > >
> > > > Please let us know what do you think. We are ready to invest in this
> > > > feature and address any further feedback.
> > > >
> > > >
> > > >
> > > > Kind regards,
> > > >
> > > > Max
> > > >
> > > >
> > > >
> > > >
> > > > ---
> > > > This e-mail may contain confidential and/or privileged information. If
> > > you
> > > > are not the intended recipient (or have received this e-mail in error)
> > > > please notify the sender immediately and delete this e-mail. Any
> > > > unauthorized copying, disclosure or distribution of the material in
> > this
> > > > e-mail is strictly forbidden.
> > > >
> > > > Please refer to https://www.db.com/disclosures for additional EU
> > > > corporate and regulatory disclosures and to
> > > > http://www.db.com/unitedkingdom/content/privacy.htm for information
> > > about
> > > > privacy.
> > > >
> > >
> >
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
Disabling Surefire/Failsafe in a particular module is easy, but it won't
improve performance much unless you analyse the relations between
classes and tests.

If you analyse the relations, then you can easily pass the list of
tests via -Dtest or via the includes/excludes configuration. So everything
is already in Surefire, but I guess that the code analysis would be so
time-consuming that the whole effort would be counterproductive on our side.
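
The selection step described above can be sketched as follows. This is a minimal illustration, assuming the class-to-test relations have already been produced by some analysis step; the AffectedTests class and its methods are hypothetical and not part of Surefire. Only the -Dtest filter itself is an existing Surefire option.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: picks the tests affected by a set of changed classes. */
final class AffectedTests {

    /**
     * @param testToClasses for each test class, the production classes it exercises
     *                      (assumed to come from a prior analysis step)
     * @param changed       classes modified since the last build
     * @return affected test class names, sorted for a stable command line
     */
    static Set<String> select(Map<String, Set<String>> testToClasses, Set<String> changed) {
        Set<String> affected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : testToClasses.entrySet()) {
            // A test is affected if it changed itself or exercises a changed class.
            boolean hit = changed.contains(e.getKey());
            for (String c : e.getValue()) {
                if (changed.contains(c)) {
                    hit = true;
                    break;
                }
            }
            if (hit) {
                affected.add(e.getKey());
            }
        }
        return affected;
    }

    /** Formats the selection as a Surefire test filter, e.g. -Dtest=ATest,BTest. */
    static String toSurefireArg(Set<String> tests) {
        return "-Dtest=" + String.join(",", tests);
    }
}
```

The expensive part is building testToClasses, which is exactly the analysis cost mentioned above; the selection itself is cheap. A caller would also need to handle an empty selection rather than pass an empty filter.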

Regarding the cache, the repo is the cache. And if the CI build deletes the
repo, we would not know about it today.
So these performance improvements must be an optional feature, enabled only
by the user and not by default.

On Fri, Sep 13, 2019 at 11:37 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> There are multiple possible kinds of incremental support:
>
> 1. SCM-related: do a status and rebuild the downstream reactor
> 2. Full and module build graph: this seems to be the one you target, i.e.
> bypass modules without changes. Note that it only works if the upstream
> graph is taken into account.
> 3. Full build: each mojo has incremental support, so the full build gets it.
> The issue is that it requires each mojo to know whether it needs to be
> executed, or to give enough info to the mojo executor to decide (Gradle
> requires all inputs/outputs to assume this state, which is still just a
> heuristic and not 100% reliable).
>
> In the current state, 2 sounds like a good option, since 3 can require a
> lot of work for external plugins (today's builds have a lot more plugins
> not provided by Maven than core plugins).
> Now, we should be able to activate it or not, so having a cacheLocation
> config in settings.xml can be good.
>
> Side notes:
>
> 1. Having it on by default will break builds: the reactor is deterministic,
> and bypassing a module can break a build, since the module can initialize
> Maven properties, for example, for later modules
> 2. You can't find all in/out paths from the POM in general, so your
> algorithm is not generic; a meta config may be needed in .mvn
> 3. We should let a mojo disable that and replace the default logic
> (Surefire is a good example where it must be refined, and it can save hours
> there ;))
> 4. Let's try to implement it as a Maven extension first; then, if it works
> well on multiple big projects, get it into core?
>
> Romain
>
>
>
> On Fri, Sep 13, 2019 at 23:18, Tibor Digana <ti...@apache.org>
> wrote:
>
> > In theory, the incremental compiler would make it faster.
> > But this can be said only if you present a demo project which has trivial
> > tests taking much less time to complete than the compiler.
> >
> > In reality, the tests in huge projects take significantly longer than
> > the compiler.
> > Some developers say "switch off all the tests" in the release phase, but
> > that's wrong because then the quality goes down and methodologies are
> > broken.
> >
> > I can see a big problem: we do not have an interface between Surefire
> > and the Compiler plugin for negotiating which tests have been modified,
> > including modules and classes in the entire structure.
> >
> > Having an incremental compiler is easy: just use compiler:3.8.1 or use
> > the Takari compiler.
> > But IMO the biggest performance benefit would come from having a truly
> > incremental test executor.

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Romain Manni-Bucau <rm...@gmail.com>.
On Sat, Sep 14, 2019 at 22:17, Alexander Ashitkin <as...@gmail.com>
wrote:

> Let us evaluate this approach. But if we go the extension way, there will
> be little motivation to make it part of Maven. And I'm not sure what the
> long-term strategy for Maven is, but without incremental build it becomes
> less and less attractive in our multi-branched world
>

Let's see it this way: an extension lets us test, enhance, and validate the
approach.

Side note: for a medium-sized project like Apache Beam, migrating from Maven
to Gradle saved 10 min out of a 1 h 20 build and made the build
non-deterministic. So even though I'm the first to be motivated by
incremental builds, I am also convinced your conclusion is mainly driven by
disappointment at having steps in the process, and is not a prediction ;).

Don't hesitate to ask for help writing the extension though; I'm happy to
find some time to help you on that topic.



> Thank you
>

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Let us evaluate this approach. But if we go the extension way, there will be little motivation to make it part of Maven. And I'm not sure what the long-term strategy for Maven is, but without incremental build it becomes less and less attractive in our multi-branched world

Thank you


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Romain Manni-Bucau <rm...@gmail.com>.
On Sat, Sep 14, 2019 at 08:00, Alexander Ashitkin <as...@gmail.com>
wrote:

> Indeed we have a kind of the option 2 with variations. Current
> implementation is opt-in feature driven by configuration with some metadata
> of required cache behavior and hints.
>
> Maven extensions is the option, but we would love to have it in maven
> itself which is in interest of maven community i believe. Extension is a
> way we are trying to avoid and even not sure it could be implemented as
> extension as it requires changes in maven core.
>

No real change is required in Maven core here, since Guice lets you override
any bean, or even just rewrite the POM to remove modules so that only the
minimum set is rebuilt (keeping downstream projects).

The only challenge is an exhaustive test suite, since your current
implementation can easily fake a passing build (as Gradle does today if you
don't disable the daemon and the state cache on the CI).

Side note: test relationship discovery is close to AOT in terms of
implementation and very, very slow, so it can be worse than running the full
suite in simple projects, and it still raises the IT (integration test)
question.

So, given the numerous open questions around a core solution, an extension is
the way to go. Now, if a Guice bean in core can help you write your extension,
that can surely be reviewed more easily IMHO.

Hope it helps.


> Thanks in advance, Aleks

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Indeed, what we have is a variation of option 2. The current implementation is an opt-in feature driven by configuration, with some metadata describing the required cache behavior and hints.

A Maven extension is an option, but we would love to have this in Maven itself, which I believe is in the interest of the Maven community. An extension is a route we are trying to avoid, and we are not even sure it could be implemented as an extension, since it requires changes in Maven core.

Thanks in advance, Aleks


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Romain Manni-Bucau <rm...@gmail.com>.
There are multiple possible kinds of incremental support:

1. SCM-related: do a status and rebuild the downstream reactor.
2. Full and module build graph: this seems to be the one you target, i.e.
bypass modules without changes. Note that it only works if the upstream graph
is taken into account.
3. Full build: each mojo has incremental support, so the full build gets it.
The issue is that it requires each mojo to know whether it needs to be
executed, or to give enough info to the mojo executor to decide (Gradle
requires all inputs/outputs to be declared to infer this state, which is
still just a heuristic and not 100% reliable).
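
To make the "upstream graph" point in option 2 concrete, here is a minimal
sketch of a per-module checksum that folds in the checksums of upstream
modules, so that a change anywhere upstream changes every downstream key. It
only illustrates the idea from the proposal (effective POM hash, sources
hash, in-reactor dependencies hash); the ModuleChecksum class, the Module
type and the use of SHA-256 are assumptions for illustration, not the
proposed implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ModuleChecksum {

    /** Hypothetical, simplified view of one reactor module. */
    public static final class Module {
        final String name;
        final String effectivePom;   // text of the effective POM
        final String sources;        // stand-in for the content of all source files
        final List<String> upstream; // names of in-reactor dependencies

        public Module(String name, String effectivePom, String sources, List<String> upstream) {
            this.name = name;
            this.effectivePom = effectivePom;
            this.sources = sources;
            this.upstream = upstream;
        }
    }

    /** SHA-256 over length-prefixed parts, returned as lowercase hex. */
    public static String sha256(List<String> parts) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String part : parts) {
                byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
                // The length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
                digest.update((bytes.length + ":").getBytes(StandardCharsets.UTF_8));
                digest.update(bytes);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }

    /**
     * Computes one checksum per module. The list must be in reactor order,
     * i.e. every module appears after the modules it depends on.
     */
    public static Map<String, String> checksums(List<Module> reactorOrder) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Module module : reactorOrder) {
            List<String> parts = new ArrayList<>();
            parts.add(module.effectivePom);
            parts.add(module.sources);
            for (String dependency : module.upstream) {
                String upstreamChecksum = result.get(dependency);
                if (upstreamChecksum == null) {
                    throw new IllegalArgumentException(
                            module.name + " is listed before its dependency " + dependency);
                }
                // Folding in the upstream checksum propagates upstream changes.
                parts.add(upstreamChecksum);
            }
            result.put(module.name, sha256(parts));
        }
        return result;
    }

    public static void main(String[] args) {
        Module core = new Module("core", "<project>core</project>", "class Core {}",
                Collections.<String>emptyList());
        Module web = new Module("web", "<project>web</project>", "class Web {}",
                Arrays.asList("core"));
        System.out.println(checksums(Arrays.asList(core, web)));
    }
}
```

In this sketch, editing the sources of "core" changes the checksums of both
"core" and "web", while editing "web" leaves the checksum of "core" intact.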

In the current state, 2 sounds like a good option, since 3 can require a lot
of work for external plugins (today's builds use far more plugins not
provided by Maven than core plugins).
Now, we should be able to activate it or not, so having a cacheLocation
config in settings.xml could be good.

Side notes:

1. Having it on by default will break builds: the reactor is deterministic,
and bypassing a module can break a build since that module can, for example,
initialize Maven properties for the next modules.
2. You can't find all in/out paths from the POM in general, so your
algorithm is not generic; a meta config may be needed in .mvn.
3. We should let a mojo disable that in order to replace the default logic
(Surefire is a good example where it must be refined, and it can save hours
there ;)).
4. Let's try to implement it as a Maven extension first, and then, if it
works well on multiple big projects, get it into core?

Romain




RE: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <al...@db.com>.
This feature caches test results too, but the granularity is per project. In our case we even cache the results of long-running integration tests, so developers rerun unit/integration tests only for changed projects and their dependents. That's a significant time win.
In our multi-module project, if you change just 1 module out of 600, you invalidate only that module and run the tests for that module.
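
As a rough illustration of per-project invalidation, the sketch below computes which modules need to be rebuilt (and retested) after a change: the changed modules plus everything that depends on them, transitively. The InvalidationSet class and the module names are hypothetical, and the dependency map is assumed to cover only dependencies inside the multi-module project.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvalidationSet {

    /**
     * @param dependencies module name -> names of the modules it depends on
     * @param changed      modules whose own inputs changed
     * @return the changed modules plus all their transitive dependents, sorted by name
     */
    public static Set<String> modulesToRebuild(Map<String, List<String>> dependencies,
                                               Set<String> changed) {
        // Invert the edges: dependency -> modules that depend on it.
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            for (String dependency : entry.getValue()) {
                dependents.computeIfAbsent(dependency, key -> new ArrayList<>())
                        .add(entry.getKey());
            }
        }

        // Breadth-first walk from the changed modules towards their dependents.
        Set<String> result = new TreeSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String module = queue.poll();
            for (String dependent : dependents.getOrDefault(module, Collections.<String>emptyList())) {
                if (result.add(dependent)) {
                    queue.add(dependent);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependencies = new HashMap<>();
        dependencies.put("core", Collections.<String>emptyList());
        dependencies.put("api", Arrays.asList("core"));
        dependencies.put("web", Arrays.asList("api"));
        dependencies.put("tools", Collections.<String>emptyList());

        // Only "api" changed: "web" depends on it, "core" and "tools" stay cached.
        System.out.println(modulesToRebuild(dependencies, Collections.singleton("api")));
        // prints [api, web]
    }
}
```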

If a project is invalidated by changed inputs, it will be rebuilt from scratch by the usual plugins, regardless of which features a particular plugin supports. With Takari and the incremental compiler, that will result in faster compilation on the second and subsequent runs.
Any plugin could benefit from the cache/incremental behavior in a multi-module project.
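
The lookup flow can be pictured roughly as follows: check the local cache, then the shared remote cache, and only run the usual plugins on a miss. This is a simplified sketch under stated assumptions, not our actual code: the BuildCache class is hypothetical, plain in-memory maps stand in for the local and remote stores, and a string stands in for the build outputs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class BuildCache {

    private final Map<String, String> local = new HashMap<>();
    private final Map<String, String> remote; // shared store, e.g. populated by CI
    private int builds = 0;

    public BuildCache(Map<String, String> remote) {
        this.remote = remote;
    }

    /**
     * Returns the outputs for the given input checksum, building only on a miss.
     *
     * @param checksum checksum of the project inputs
     * @param build    runs the real build and returns its outputs
     */
    public String resolve(String checksum, Supplier<String> build) {
        String outputs = local.get(checksum);
        if (outputs != null) {
            return outputs; // local hit
        }
        outputs = remote.get(checksum);
        if (outputs == null) {
            // Miss everywhere: inputs changed, so rebuild from scratch and share the result.
            outputs = build.get();
            builds++;
            remote.put(checksum, outputs);
        }
        local.put(checksum, outputs);
        return outputs;
    }

    /** Number of real builds this instance had to run. */
    public int buildCount() {
        return builds;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();

        BuildCache ci = new BuildCache(shared);
        ci.resolve("checksum-1", () -> "module.jar");        // CI builds and stores

        BuildCache developer = new BuildCache(shared);
        developer.resolve("checksum-1", () -> "module.jar"); // restored from the shared cache
        developer.resolve("checksum-2", () -> "module.jar"); // changed inputs: rebuilt

        System.out.println("CI builds: " + ci.buildCount());
        System.out.println("Developer builds: " + developer.buildCount());
    }
}
```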

Kindly yours
Aleks


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
In theory, the incremental compiler would make it faster.
But that can be said only if you present a demo project which has trivial
tests taking much less time to complete than the compiler.

In reality, the tests in huge projects take significantly longer than the
compiler.
Some developers say "switch off all the tests" in the release phase, but
that's wrong because then the quality goes down and methodologies are
broken.

I can see a big problem: we do not have an interface between Surefire and
the Compiler plugin for negotiating which tests have been modified,
including modules and classes in the entire structure.

Having an incremental compiler is easy: just use compiler:3.8.1 or the
Takari compiler.
But IMO the biggest performance benefit would come from having a truly
incremental test executor.


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Enrico Olivelli <eo...@gmail.com>.
Hi Maximilian,
is there any way to see this work? Is it already open source? (I am sorry,
maybe I missed some email with links.)

Enrico

On Fri, 20 Sep 2019 at 19:30, Alexander Ashitkin <
ashitkin.alex@gmail.com> wrote:

> Hi Martijn
> thanks for positive feedback.
>
> Regarding IDE part, yes you're right on integration part, but still there
> important cases when cache helps:
> 1) you need to navigate less in project as top level targets fast enough
> to not drill down
> 2) if you need to build a part of project (say only rest of wicket) you
> need to provide up-to-date rest dependencies which are not active in the
> subproject - and caches restores missing pieces for you without rebuilding
> remaining part of the project
> 3) If you need to test project and invoke test - cache saves your time (as
> gradle does) on unchanged pieces
> 4) and because tests run faster you can try run slow tests which often too
> expensive in rapid development
>
> So maven integration in Intellij works nice. There is nothing super smart
> here, just sharing how i benefit from the cache in everyday ide work
>
> Thank you!
>
> On 2019/09/19 11:28:48, Martijn Dashorst <ma...@gmail.com>
> wrote:
> > On Thu, Sep 19, 2019 at 7:48 AM Alexander Ashitkin
> > <as...@gmail.com> wrote:
> > > Configuration:
> > > * verify -T4 -P default,all-shapshots-repos
> > > * my project config (might be suboptimal for wicket)
> > > * scala tests disabled in 2 modules (caused bytecode version conflict
> on my machine)
> > >
> > > Results
> > > Clean state (cache disabled):                           15:58 min
> > > Second run, target up to date (cache disabled):      10:20 min
> > > Fully cached (no changes):                                      17.507
> s
> > > wicketstuff-jwicket-tooltip-wtooltips changed:          34.936 s
> > > wicketstuff-rest-utils changed:                                 54.040
> s
> > >
> > > If you want to try other modules - please let me know.
> >
> > Nice results!
> >
> > > regarding ide - it's a usual maven installation, so any ide with maven
> integration should benefit from cache them maven action invoked
> >
> > My instinct says that an IDE as Eclipse won't benefit much from it, as
> > it has its own build lifecycle. Only when you invoke a commandline
> > Maven action (such as generate-sources) one might have a benefit.
> >
> > So in the day-to-day life the caching might not be as beneficial for
> > developers, but commandline builds happen often enough to make this
> > matter.
> >
> > Martijn
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> > For additional commands, e-mail: dev-help@maven.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Hi Martijn,
thanks for the positive feedback.

Regarding the IDE part, yes, you're right about the integration, but there are still important cases where the cache helps:
1) You need to navigate less in the project, as top-level targets are fast enough that you don't have to drill down.
2) If you need to build a part of the project (say, only the rest modules of wicket), you need to provide up-to-date rest dependencies which are not active in the subproject, and the cache restores the missing pieces for you without rebuilding the remaining part of the project.
3) If you need to test the project and invoke tests, the cache saves you time (as Gradle does) on unchanged pieces.
4) And because tests run faster, you can try running slow tests, which are often too expensive in rapid development.

So Maven integration in IntelliJ works nicely. There is nothing super smart here, just sharing how I benefit from the cache in everyday IDE work.

Thank you!

On 2019/09/19 11:28:48, Martijn Dashorst <ma...@gmail.com> wrote: 
> On Thu, Sep 19, 2019 at 7:48 AM Alexander Ashitkin
> <as...@gmail.com> wrote:
> > Configuration:
> > * verify -T4 -P default,all-shapshots-repos
> > * my project config (might be suboptimal for wicket)
> > * scala tests disabled in 2 modules (caused bytecode version conflict on my machine)
> >
> > Results
> > Clean state (cache disabled):                           15:58 min
> > Second run, target up to date (cache disabled):      10:20 min
> > Fully cached (no changes):                                      17.507 s
> > wicketstuff-jwicket-tooltip-wtooltips changed:          34.936 s
> > wicketstuff-rest-utils changed:                                 54.040 s
> >
> > If you want to try other modules - please let me know.
> 
> Nice results!
> 
> > regarding ide - it's a usual maven installation, so any ide with maven integration should benefit from cache them maven action invoked
> 
> My instinct says that an IDE as Eclipse won't benefit much from it, as
> it has its own build lifecycle. Only when you invoke a commandline
> Maven action (such as generate-sources) one might have a benefit.
> 
> So in the day-to-day life the caching might not be as beneficial for
> developers, but commandline builds happen often enough to make this
> matter.
> 
> Martijn
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
> 
> 



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Martijn Dashorst <ma...@gmail.com>.
On Thu, Sep 19, 2019 at 7:48 AM Alexander Ashitkin
<as...@gmail.com> wrote:
> Configuration:
> * verify -T4 -P default,all-shapshots-repos
> * my project config (might be suboptimal for wicket)
> * scala tests disabled in 2 modules (caused bytecode version conflict on my machine)
>
> Results
> Clean state (cache disabled):                           15:58 min
> Second run, target up to date (cache disabled):      10:20 min
> Fully cached (no changes):                                      17.507 s
> wicketstuff-jwicket-tooltip-wtooltips changed:          34.936 s
> wicketstuff-rest-utils changed:                                 54.040 s
>
> If you want to try other modules - please let me know.

Nice results!

> regarding ide - it's a usual maven installation, so any ide with maven integration should benefit from cache them maven action invoked

My instinct says that an IDE such as Eclipse won't benefit much from it, as
it has its own build lifecycle. Only when you invoke a commandline
Maven action (such as generate-sources) one might have a benefit.

So in the day-to-day life the caching might not be as beneficial for
developers, but commandline builds happen often enough to make this
matter.

Martijn



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Bubenchikov <al...@jetbrains.com.INVALID>.
Hi Alexander and Maximilian. As a former Deutsche Bank RTC employee, I
cannot bypass this topic.

First of all, I'm not speaking for the community, nor for JetBrains.
I know how painful it is to get all approvals for an open-source contribution
in Deutsche Bank, but maybe you can share your solution as a Maven extension
first?
Maybe I've missed the link to GitHub, but the discussion looks strange to me:
we are discussing the pros and cons of a solution which is not public yet.

>so any ide with maven integration should benefit from cache them maven
action invoked
AFAIK, m2e uses its own lifecycle (please correct me if I'm wrong). Not
sure how this change will affect its performance.

But IDEA executes Maven tasks (actually, in quite a hacky way) for
resolving dependencies, source folders, artifacts, etc. The compiler plugin
is usually not run by IDEA.

Compilation and test runs of imported Maven projects are performed by IDEA
itself, and IDEA does not use Maven compilation results when running tests.
No Maven code or Maven extensions are used after importing, so for IDEA this
change should be irrelevant, as far as I understand (the only exception is if
you use the "delegate maven" option, but that is a topic for another discussion).

Thanks, Alex.



On Thu, Sep 19, 2019 at 8:48 AM Alexander Ashitkin <as...@gmail.com>
wrote:

> Sorry if duplicated, looks like my yesterday reply didn't come through.
> Sharing results.
>
> Configuration:
> * verify -T4 -P default,all-shapshots-repos
> * my project config (might be suboptimal for wicket)
> * scala tests disabled in 2 modules (caused bytecode version conflict on
> my machine)
>
> Results
> Clean state (cache disabled):                           15:58 min
> Second run, target up to date (cache disabled):      10:20 min
> Fully cached (no changes):                                      17.507 s
> wicketstuff-jwicket-tooltip-wtooltips changed:          34.936 s
> wicketstuff-rest-utils changed:                                 54.040 s
>
>
> For wicketstuff-jwicket-tooltip-wtooltips i didnt check invalidated
> modules, for wicketstuff-rest-utils
>  [wicketstuff-rest-lambda, wicketstuff-restannotations,
> wicketstuff-restannotations-json, wicketstuff-restannotations-examples]
> were invalidated and rebuilt
>
> If you want to try other modules - please let me know.
>
> regarding ide - it's a usual maven installation, so any ide with maven
> integration should benefit from cache them maven action invoked
>
> Thank you
> Aleks
>
>
> On 2019/09/17 12:29:11, Martijn Dashorst <ma...@gmail.com>
> wrote:
> > This seems like it would benefit a lot of projects (at least it would
> ours).
> >
> > How would this work in coordination with IDE's? m2e has (afaict, but
> > haven't looked closely) its own lifecycle management to bridge eclipse
> and
> > maven. AFIAK only Netbeans uses maven directly?
> >
> > If you want to benchmark a public big repo, you can use Wicket Stuff
> Core (
> > https://github.com/wicketstuff/core). It has 237 modules, and the build
> > takes quite a while to compile and package. The project levels are not
> > deep, but there's some nesting.
> >
> > Martijn
> >
> >
> > On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> > maximilian.novikov@db.com> wrote:
> >
> > > Hi All,
> > >
> > >
> > >
> > > *We want to create upstream change to Maven* to support true
> incremental
> > > build for big-sized projects.
> > >
> > > To raise a pull request we have to pass long chain of Deutsche Bank’s
> > > internal procedures. So, *before starting the process we would like to
> > > get your feedback regarding this feature*.
> > >
> > >
> > >
> > > *Motivation:*
> > >
> > >
> > >
> > > Our project is hosted in mono-repo and contains ~600 modules. All
> modules
> > > has the same SNAPSHOT version.
> > >
> > > There are lot of test automation around this, everything is tested
> before
> > > merge into release branch.
> > >
> > >
> > >
> > > Current setup helps us to simplify build/release/dependency management
> for
> > > 10+ teams those contribute into codebase. We can release everything in
> > > 1-click.
> > >
> > > The major drawback of such approach is build time: *full local build
> took
> > > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> > >
> > >
> > >
> > > To speed-up our build we needed 2 features: incremental build and
> shared
> > > cache.
> > >
> > > Initially we started to think about migration to Gradle or Bazel. As
> > > migration costs for the mentioned tools were too high, we decided to
> add
> > > similar functionality into Maven.
> > >
> > >
> > >
> > > Current results we get: *1-2 mins for local build(*-T8*)* if build was
> > > cached by CI*, CI build ~5 mins (*-T16*).*
> > >
> > >
> > >
> > > *Feature description:*
> > >
> > >
> > >
> > > The idea is to calculate checksum for inputs and save outputs in cache.
> > >
> > > [image: image2019-8-27_20-0-14.png]
> > >
> > > Each node checksum calculated with:
> > >
> > >
> > >
> > > ·         Effective POM hash
> > >
> > > ·         Sources hash
> > >
> > > ·         Dependencies hash (dependencies within multi-module project)
> > >
> > >
> > >
> > > Project sources inputs are searched inside project + all paths from
> > > plugins configuration:
> > >
> > > [image: image2019-8-30_10-28-56.png]
> > >
> > > How does it work in practice:
> > >
> > >
> > >
> > > 1.       CI: runs builds and stores outputs in shared cache
> > >
> > > 2.       CI: reuse outputs for same inputs, so time is decreasing
> > >
> > > 3.       Locally: when I checkout branch and run ‘install’ for whole
> > > project, I get all actual snapshots from remote cache for this branch
> > >
> > > 4.       Locally: if I change multiple modules in tree, only changed
> > > subtree is rebuilt
> > >
> > >
> > >
> > > Impact on current Maven codebase is very localized (MojoExecutor,
> where we
> > > injected cache controller).
> > >
> > > Caching can be activated/deactivated by property, so current maven flow
> > > will work as is.
> > >
> > >
> > >
> > > And the big plus is that you don’t need to re-work your current
> project.
> > > Caching should work out of box, just need to add config in .mvn folder.
> > >
> > >
> > >
> > > Please let us know what do you think. We are ready to invest in this
> > > feature and address any further feedback.
> > >
> > >
> > >
> > > Kind regards,
> > >
> > > Max
> > >
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Become a Wicket expert, learn from the best: http://wicketinaction.com
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>

-- 
Alexander Bubenchikov
Software developer
JetBrains
http://www.jetbrains.com
The Drive to Develop

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Sorry if this is a duplicate; it looks like my reply from yesterday didn't come through.
Sharing results.

Configuration:
* verify -T4 -P default,all-shapshots-repos
* my project config (might be suboptimal for wicket)
* scala tests disabled in 2 modules (caused bytecode version conflict on my machine) 

Results
Clean state (cache disabled):                     15:58 min
Second run, target up to date (cache disabled):   10:20 min
Fully cached (no changes):                        17.507 s
wicketstuff-jwicket-tooltip-wtooltips changed:    34.936 s
wicketstuff-rest-utils changed:                   54.040 s
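
To put the timings above in proportion, here is a small sketch that converts them to seconds and divides the clean-build time by each cached run. The dictionary keys and the to_seconds helper are made up for this illustration; only the timing values come from the results above.

```python
def to_seconds(value):
    """Parse a timing written as 'MM:SS min' or 'S.sss s'."""
    number, unit = value.split()
    if unit == "min":
        minutes, seconds = number.split(":")
        return int(minutes) * 60 + int(seconds)
    return float(number)


# Timings copied from the results; the labels are shortened for readability.
results = {
    "clean, cache disabled": "15:58 min",
    "second run, cache disabled": "10:20 min",
    "fully cached": "17.507 s",
    "wtooltips changed": "34.936 s",
    "rest-utils changed": "54.040 s",
}

baseline = to_seconds(results["clean, cache disabled"])
# Speedup factor of each scenario relative to the clean build.
speedups = {name: baseline / to_seconds(v) for name, v in results.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

These ratios describe one machine and one run each, so they are a rough indication rather than a benchmark.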


For wicketstuff-jwicket-tooltip-wtooltips I didn't check which modules were invalidated. For wicketstuff-rest-utils, the modules
 [wicketstuff-rest-lambda, wicketstuff-restannotations, wicketstuff-restannotations-json, wicketstuff-restannotations-examples] were invalidated and rebuilt.
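
The invalidation reported here behaves like a walk over reverse dependencies: change one module and everything downstream of it is rebuilt. Below is a minimal sketch of that idea, not the actual cache implementation. The function name is hypothetical, and the dependency edges between the wicketstuff modules are assumed for illustration; they are not read from the real POMs.

```python
from collections import deque


def invalidated_modules(dependencies, changed):
    """Return every module that transitively depends on `changed`.

    `dependencies` maps a module name to the modules it depends on.
    """
    # Invert the map: module -> modules that depend on it directly.
    dependents = {}
    for module, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    rebuilt = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, ()):
            if downstream not in rebuilt:
                rebuilt.add(downstream)
                queue.append(downstream)
    return rebuilt


# Assumed edges, chosen only so that the example mirrors the reported result.
deps = {
    "wicketstuff-rest-utils": [],
    "wicketstuff-rest-lambda": ["wicketstuff-rest-utils"],
    "wicketstuff-restannotations": ["wicketstuff-rest-utils"],
    "wicketstuff-restannotations-json": ["wicketstuff-restannotations"],
    "wicketstuff-restannotations-examples": ["wicketstuff-restannotations-json"],
    "wicketstuff-jwicket-tooltip-wtooltips": [],
}

print(sorted(invalidated_modules(deps, "wicketstuff-rest-utils")))
```

A leaf module with no dependents yields an empty set, so only the changed module itself would be rebuilt.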

If you want to try other modules - please let me know.

regarding ide - it's a usual maven installation, so any ide with maven integration should benefit from cache them maven action invoked

Thank you
Aleks


On 2019/09/17 12:29:11, Martijn Dashorst <ma...@gmail.com> wrote: 
> This seems like it would benefit a lot of projects (at least it would ours).
> 
> How would this work in coordination with IDE's? m2e has (afaict, but
> haven't looked closely) its own lifecycle management to bridge eclipse and
> maven. AFIAK only Netbeans uses maven directly?
> 
> If you want to benchmark a public big repo, you can use Wicket Stuff Core (
> https://github.com/wicketstuff/core). It has 237 modules, and the build
> takes quite a while to compile and package. The project levels are not
> deep, but there's some nesting.
> 
> Martijn
> 
> 
> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> maximilian.novikov@db.com> wrote:
> 
> > Hi All,
> >
> >
> >
> > *We want to create upstream change to Maven* to support true incremental
> > build for big-sized projects.
> >
> > To raise a pull request we have to pass long chain of Deutsche Bank’s
> > internal procedures. So, *before starting the process we would like to
> > get your feedback regarding this feature*.
> >
> >
> >
> > *Motivation:*
> >
> >
> >
> > Our project is hosted in mono-repo and contains ~600 modules. All modules
> > has the same SNAPSHOT version.
> >
> > There are lot of test automation around this, everything is tested before
> > merge into release branch.
> >
> >
> >
> > Current setup helps us to simplify build/release/dependency management for
> > 10+ teams those contribute into codebase. We can release everything in
> > 1-click.
> >
> > The major drawback of such approach is build time: *full local build took
> > 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
> >
> >
> >
> > To speed-up our build we needed 2 features: incremental build and shared
> > cache.
> >
> > Initially we started to think about migration to Gradle or Bazel. As
> > migration costs for the mentioned tools were too high, we decided to add
> > similar functionality into Maven.
> >
> >
> >
> > Current results we get: *1-2 mins for local build(*-T8*)* if build was
> > cached by CI*, CI build ~5 mins (*-T16*).*
> >
> >
> >
> > *Feature description:*
> >
> >
> >
> > The idea is to calculate checksum for inputs and save outputs in cache.
> >
> > [image: image2019-8-27_20-0-14.png]
> >
> > Each node checksum calculated with:
> >
> >
> >
> > ·         Effective POM hash
> >
> > ·         Sources hash
> >
> > ·         Dependencies hash (dependencies within multi-module project)
> >
> >
> >
> > Project sources inputs are searched inside project + all paths from
> > plugins configuration:
> >
> > [image: image2019-8-30_10-28-56.png]
> >
> > How does it work in practice:
> >
> >
> >
> > 1.       CI: runs builds and stores outputs in shared cache
> >
> > 2.       CI: reuse outputs for same inputs, so time is decreasing
> >
> > 3.       Locally: when I checkout branch and run ‘install’ for whole
> > project, I get all actual snapshots from remote cache for this branch
> >
> > 4.       Locally: if I change multiple modules in tree, only changed
> > subtree is rebuilt
> >
> >
> >
> > Impact on current Maven codebase is very localized (MojoExecutor, where we
> > injected cache controller).
> >
> > Caching can be activated/deactivated by property, so current maven flow
> > will work as is.
> >
> >
> >
> > And the big plus is that you don’t need to re-work your current project.
> > Caching should work out of box, just need to add config in .mvn folder.
> >
> >
> >
> > Please let us know what do you think. We are ready to invest in this
> > feature and address any further feedback.
> >
> >
> >
> > Kind regards,
> >
> > Max
> >
> >
> >
> >
> >
> 
> 
> -- 
> Become a Wicket expert, learn from the best: http://wicketinaction.com
> 



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Martijn Dashorst <ma...@gmail.com>.
This seems like it would benefit a lot of projects (at least it would ours).

How would this work in coordination with IDEs? m2e has (afaict, but
haven't looked closely) its own lifecycle management to bridge Eclipse and
Maven. AFAIK only NetBeans uses Maven directly?

If you want to benchmark a public big repo, you can use Wicket Stuff Core (
https://github.com/wicketstuff/core). It has 237 modules, and the build
takes quite a while to compile and package. The project levels are not
deep, but there's some nesting.

Martijn


On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
maximilian.novikov@db.com> wrote:

> Hi All,
>
>
>
> *We want to create upstream change to Maven* to support true incremental
> build for big-sized projects.
>
> To raise a pull request we have to pass long chain of Deutsche Bank’s
> internal procedures. So, *before starting the process we would like to
> get your feedback regarding this feature*.
>
>
>
> *Motivation:*
>
>
>
> Our project is hosted in mono-repo and contains ~600 modules. All modules
> has the same SNAPSHOT version.
>
> There are lot of test automation around this, everything is tested before
> merge into release branch.
>
>
>
> Current setup helps us to simplify build/release/dependency management for
> 10+ teams those contribute into codebase. We can release everything in
> 1-click.
>
> The major drawback of such approach is build time: *full local build took
> 45-60 min (*-T8)*, CI build ~25min(*-T16*)*.
>
>
>
> To speed-up our build we needed 2 features: incremental build and shared
> cache.
>
> Initially we started to think about migration to Gradle or Bazel. As
> migration costs for the mentioned tools were too high, we decided to add
> similar functionality into Maven.
>
>
>
> Current results we get: *1-2 mins for local build(*-T8*)* if build was
> cached by CI*, CI build ~5 mins (*-T16*).*
>
>
>
> *Feature description:*
>
>
>
> The idea is to calculate checksum for inputs and save outputs in cache.
>
> [image: image2019-8-27_20-0-14.png]
>
> Each node checksum calculated with:
>
>
>
> ·         Effective POM hash
>
> ·         Sources hash
>
> ·         Dependencies hash (dependencies within multi-module project)
>
>
>
> Project sources inputs are searched inside project + all paths from
> plugins configuration:
>
> [image: image2019-8-30_10-28-56.png]
>
> How does it work in practice:
>
>
>
> 1.       CI: runs builds and stores outputs in shared cache
>
> 2.       CI: reuse outputs for same inputs, so time is decreasing
>
> 3.       Locally: when I checkout branch and run ‘install’ for whole
> project, I get all actual snapshots from remote cache for this branch
>
> 4.       Locally: if I change multiple modules in tree, only changed
> subtree is rebuilt
>
>
>
> Impact on current Maven codebase is very localized (MojoExecutor, where we
> injected cache controller).
>
> Caching can be activated/deactivated by property, so current maven flow
> will work as is.
>
>
>
> And the big plus is that you don’t need to re-work your current project.
> Caching should work out of box, just need to add config in .mvn folder.
>
>
>
> Please let us know what do you think. We are ready to invest in this
> feature and address any further feedback.
>
>
>
> Kind regards,
>
> Max
>
>
>
>
>


-- 
Become a Wicket expert, learn from the best: http://wicketinaction.com

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Hi Laird,
yes, in this case B will be rebuilt - that's one of the basic requirements.
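
Why B is rebuilt follows from the checksum inputs listed in the proposal (effective POM, sources, and in-project dependencies): if A's checksum feeds into B's, any change in A changes B's cache key. A minimal sketch, assuming SHA-256 and a simple concatenation order; the function name and the hashing details are assumptions for illustration, not the actual implementation.

```python
import hashlib


def module_checksum(effective_pom, sources, upstream_checksums):
    """Combine the three inputs named in the proposal into one digest.

    effective_pom: the effective POM as text
    sources: mapping of relative path -> file content (bytes)
    upstream_checksums: checksums of in-project dependencies
    """
    digest = hashlib.sha256()
    digest.update(effective_pom.encode("utf-8"))
    # Sort so the result does not depend on iteration order.
    for path in sorted(sources):
        digest.update(path.encode("utf-8"))
        digest.update(sources[path])
    for checksum in sorted(upstream_checksums):
        digest.update(checksum.encode("utf-8"))
    return digest.hexdigest()


# Module B depends on module A.
a_before = module_checksum("<project>A</project>", {"A.java": b"class A {}"}, [])
b_before = module_checksum("<project>B</project>", {"B.java": b"class B {}"}, [a_before])

# Change a source file in A only; B's own inputs are untouched.
a_after = module_checksum("<project>A</project>", {"A.java": b"class A { int x; }"}, [])
b_after = module_checksum("<project>B</project>", {"B.java": b"class B {}"}, [a_after])

print(a_before != a_after, b_before != b_after)
```

With this scheme, B's checksum no longer matches the cached entry after A changes, so B misses the cache and is rebuilt.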

Thanks in advance

On 2019/09/14 06:29:00, Laird Nelson <lj...@gmail.com> wrote: 
> On Fri, Sep 13, 2019 at 11:01 PM Alexander Ashitkin <
> alexander.ashitkin@db.com> wrote:
> 
> > This feature is true incremental build – you don’t build modules which
> > were not changed at all and build only modified/changed ones.
> >
> 
> Suppose module B depends on module A and I change A.  Does B get rebuilt in
> your system?
> 
> Best,
> Laird
> 



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
Alexander,

Nobody is a speaker for this community. I told you my experiences.
Caching the targets and repos is a workaround from an old era, 10 years ago.
Any cache in a Maven project is a pure workaround and not a systematic design.
You can get a cache very easily if you do NOT delete the repo, see
https://gitbox.apache.org/repos/asf?p=maven-jenkins-lib.git;a=blob;f=vars/asfMavenTlpPlgnBuild.groovy;h=23269a36b02242216f8dce89dd541cef5229cc28;hb=HEAD#l225
Therefore I was saying that the USER has all capabilities in her/his hands
because it is all specific to her/his CI tool and CI solution. Of course
this kind of "cache" consumes some capacity on the disk, as every other does,
but HDD is much cheaper than RAM in the cloud, so that is no issue. It is an
easy and trivial solution with no work for us!

Systematic design leads to extensions which already exist on GitHub;
they track SCM changes and skip unmodified modules. This is
useful in a huge tree of modules with long vertical depth. The extension must
be very intelligent and must understand the plugin configuration and
inheritance, because these can trigger a rebuild of a module even if no
change was made in that module. So there are no guarantees from our side
anyway, and the responsibility for a reliable build must be on your side as a user.
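
The first step of such an SCM-tracking extension can be sketched as mapping changed file paths to the module directories that contain them. This is only an illustration with hypothetical names; it takes the list of changed paths as given (for example from an SCM diff) and deliberately ignores plugin configuration and POM inheritance, which is exactly the hard part described above.

```python
from pathlib import PurePosixPath


def modules_touched(changed_files, module_dirs):
    """Map each changed file to the deepest module directory containing it.

    Paths are relative to the repository root; "." stands for the root module.
    """
    touched = set()
    for changed in changed_files:
        parents = PurePosixPath(changed).parents
        best = None
        for module in module_dirs:
            module_path = PurePosixPath(module)
            if module_path in parents:
                # Prefer the most deeply nested matching module.
                if best is None or len(module_path.parts) > len(PurePosixPath(best).parts):
                    best = module
        if best is not None:
            touched.add(best)
    return touched


# Hypothetical repository layout and change list.
modules = [".", "core", "core/api", "web"]
changed = ["core/api/src/main/java/A.java", "web/pom.xml", "README.md"]

print(sorted(modules_touched(changed, modules)))
```

Everything not in the returned set, and not downstream of it, would be a candidate for skipping.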

A small multi-module project with depth 1 or 2 would need to use the Eclipse
compiler. The first run will be slow, of course. The reason to use an
incremental compiler is the fact that changes at the bottom of the module
tree trigger the top layer and the whole project, so there is no benefit
without an incremental compiler. If only the root was changed, then there is
no need to build the bottom because it was not changed! If the compiler is
not reliable, it is not our problem in Maven because it was the user's choice
to use it; no guarantee!
The guarantee is to build the project from a clean workspace.

Cheers
Tibor17





On Sun, Sep 15, 2019 at 2:27 PM Alexander Ashitkin <as...@gmail.com>
wrote:

> Tibor,
> Let me please share a personal opinion.
> To move this conversation forward, I would kindly ask you to refrain from
> judgements and speculations about our project. Speaking on behalf of the
> community is a certain responsibility after all. I guess your knowledge
> about our platform, its architecture, cases, requirements and infrastructure
> is not so huge. In general, judgement and speculation without basis is
> very thin ice on which it is very easy to lose credibility. Thanks for
> sharing with us such important concepts as microservices, NoSQL and
> all the other important words. I believe that was done with good intentions,
> not with the intention to insult.
>
> Second, as Maven users we came to the community with (a) our case and
> (b) a proposal. Telling users that their case is wrong, irrelevant, etc. is
> counterproductive as such. Framing all customers within your vision is a
> perfect way to product stagnation. Ignoring cases which customers bring to
> you is a way to miss opportunities for product growth.
>
> It would be productive to focus on our needs and how Maven could address them.
> Another constructive input would be guidance on a proper feature
> implementation and next steps. Speculating about the project does not help
> at all and is not the topic we are interested in.
>
> Thank you
> Aleks
>
> On 2019/09/14 20:37:03, Tibor Digana <ti...@apache.org> wrote:
> > Hello Maximilian,
> >
> > So now the next step is to break the traditional dependencies in Maven and
> > isolate the services via web services, e.g. JAX-RS or JAX-WS, and you
> > would not "touch" the POMs.
> > You need to use Logstash, Kibana, Elasticsearch, and Zipkin because the
> > logs won't be aggregated without these frameworks.
> > This would require you to spend some time and develop automatic deployment
> > and reliable CI.
> >
> > The monolith would then exist at the infrastructure level, not at the
> > code level.
> > There you can write integration tests in every service. The input XML/JSON
> > received from another service can be a mock with mock data. The service and
> > its project, as well as the tests, still remain isolated at the project
> > level.
> > The tests would become documentation, and the data (XML/JSON) would be a
> > specification for another team.
> > In this setup a particular functionality would appear in the right
> > place. Shared data won't become a workaround anymore. Sharing something may
> > easily happen in the monolith project.
> >
> > The worst situation is if you share the database between the services,
> > because then you really have to deploy many services.
> > One way is, for instance, an architecture where you have one NoSQL database
> > for one webapp, and an RDBMS as master data.
> > Each webapp has another NoSQL database.
> > Then the services would read only from one NoSQL database and write to the
> > RDBMS master data + JMS streaming the data back to the NoSQL databases via
> > a data/event bus.
> >
> > It is more about infrastructure and such isolation.
> > Since every app has an isolated database, not all services have to change
> > only because a new feature required database migration to new tables and
> > relations.
> > The probability of a change in a service would be smaller.
> >
> > Then you have got DDD, CQRS but not the Event Sourcing - only partial.
> >
> > Cheers
> > Tibor17
> >
> >
> > On Sat, Sep 14, 2019 at 9:35 PM Maximilian Novikov <
> > maximilian.novikov@db.com> wrote:
> >
> > > Tibor,
> > >
> > > We understand your position.
> > >
> > > We moved from separated SCM to one SCM. We can move back, but we don't
> > > want this.
> > >
> > > In single SCM we like:
> > > 1. Atomic commits
> > > 2. Single point of responsibility.
> > > If someone makes incompatible change in shared library, he is
> responsible
> > > to update all usages. At first look It can be considered as slowness in
> > > development, but it helps us to avoid growing of technical debt. We
> never
> > > get in situation when projects A, B, C, D... depends on different
> version
> > > of shared library and we need to make major upgrade, it can block
> release
> > > of some apps and etc...
> > >
> > > Now we releasing 20+ clients apps and 50+ backend components every
> week or
> > > even often. With multiple SCM we will need to hire a team of release
> > > managers and build engineers to coordinate and support this.
> > >
> > > Again, we are don’t selling our approach. We implemented the missing
> for
> > > us feature.
> > >
> > > PS. Just thing why commercial products like Gradle Maven Extensions
> > > appeared.
> > >
> > >
> > > From: Tibor Digana <tibordigana@apache.org<mailto:
> tibordigana@apache.org>>
> > > Date: Saturday, 14 Sep 2019, 9:43 PM
> > > To: Maven Developers List <dev@maven.apache.org<mailto:
> > > dev@maven.apache.org>>
> > > Subject: Re: [VOTE] Maven incremental build for BIG-sized projects with
> > > local and remote caching
> > >
> > > Alexander,
> > > Enrico is really right. Today it is Microservices and there every
> > > microservice is in a separate SCM repo.
> > >
> > > It was just only an example with Microservices but in my experiences
> you
> > > can always find the lower bound modules in the hierary which do not
> change
> > > so much and segragate them in another SCM repos. Those should undergo
> the
> > > release process, share release versions and avoid sharing SNAPSHOT
> > > versions.
> > >
> > > You can find the top roots which are actually applications. If you
> have 10
> > > WAR files as a result of the build and all of them should be deployed,
> then
> > > there is a strong reason to separate them in separate SCM repos.
> > >
> > > Then this separation concept will guide you to isolate the middle
> layers
> > > into isolated services as JAR files. And then you endup with
> Microservices,
> > > SOA services and not JAR files or you will be much closer to them. the
> huge
> > > monolith project is gone.
> > >
> > > All the development process will be faster and more flexible than it
> was
> > > before. Just try!
> > >
> > > Cheers
> > > Tibor17
> > >
> > > On Sat, Sep 14, 2019 at 5:23 PM Alexander Ashitkin <
> > > ashitkin.alex@gmail.com>
> > > wrote:
> > >
> > > > HI Enrico
> > > > Thanks for feedback. that's a side discussion for best approach for
> > > > projects layouts. Monorepo has own own advocates and it is easy to
> find
> > > > posts describing why google, microsoft or facebook go monorepo.
> > > > Unlike of way of thought, we are ready to go globally in case of
> > > emergency
> > > > scenario. If say zero-day vulnerability is discovered in some of
> > > low-level
> > > > widely reused core libraries, we need just one click to
> build/test/deploy
> > > > and safely go live globally with whole estate updated on scale of
> > > thousands
> > > > of processes. And you know, there are people in the world who think
> that
> > > > scattered across small repos codebase is difficult to maintain and
> > > > snapshots are evil. It all depends.
> > > > Honestly, i think it will be it's a kind of reversed approach them
> you
> > > > build system defines how your software development processes work.
> Google
> > > > has own vision and just implemented Bazel and this is correct
> approach.
> > > Btw
> > > > Bazel is perfect for such scenario, but costly to migrate on for
> existing
> > > > project.
> > > >
> > > > So if you choose monorepo as we did it is normal to work just on a
> part
> > > of
> > > > project. You just need a way to deal with scalability challenges:
> > > > a) you hit hardware and infrastructure limitations and need to
> address
> > > > them in some way.
> > > > b) need to have incremental build so you can work on subpart of
> project
> > > > but contribute to shared codebase
> > > >
> > > > Sincerely yours, Aleks
> > > >
> > > > On 2019/09/14 08:41:37, Enrico Olivelli <eo...@gmail.com> wrote:
> > > > > I feel that in general having an huge monolithic project is kind
> of a
> > > > > project-smell.
> > > > > Btw I have some big project with 100+ modules so I can see your
> pain.
> > > > > In the daywork experience a single developer doesn't work on all
> of the
> > > > > modules but usually you touch 1-2 modules and maybe some
> > > > integration/system
> > > > > test.
> > > > > If you need to rebuild the full project for every change maybe
> there is
> > > > > something wrong with the overall design.
> > > > >
> > > > > I think you have you motivations for your layout, so let's talk
> about
> > > > your
> > > > > proposal.
> > > > >
> > > > > If you have a way to split your project in subsystems you can use
> some
> > > > > shared remote repository for deploying snapshots in order to share
> > > > > intermediate results with other developers
> > > > >
> > > > > If your goal is to be ready for releases I don't get your point.
> > > Usually
> > > > > you work with snapshots and for a release you have to rebuild one
> time
> > > > and
> > > > > only once the full codebase in order to ensure that you a
> consistent
> > > > build
> > > > > of the project.
> > > > > With all of this kind of temporary caches how do you ensure that
> the
> > > > final
> > > > > artifacts are the intended ones?
> > > > >
> > > > >
> > > > > Beside note: this is not a real VOTE thread
> > > > >
> > > > > Just my 2 cents
> > > > >
> > > > > don't get me wrong, I admire your will to improve Maven ecosystem
> with
> > > > this
> > > > > cool feature! Thank you for contribution your work. We will try to
> get
> > > > the
> > > > > best
> > > > >
> > > > > Enrico
> > > > >
> > > > > Il sab 14 set 2019, 08:29 Laird Nelson <lj...@gmail.com> ha
> > > scritto:
> > > > >
> > > > > > On Fri, Sep 13, 2019 at 11:01 PM Alexander Ashitkin <
> > > > > > alexander.ashitkin@db.com> wrote:
> > > > > >
> > > > > > > This feature is true incremental build – you don’t build
> modules
> > > > which
> > > > > > > were not changed at all and build only modified/changed ones.
> > > > > > >
> > > > > >
> > > > > > Suppose module B depends on module A and I change A.  Does B get
> > > > rebuilt in
> > > > > > your system?
> > > > > >
> > > > > > Best,
> > > > > > Laird
> > > > > >
> > > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> > > > For additional commands, e-mail: dev-help@maven.apache.org
> > > >
> > > >
> > >
> > >
> > > ---
> > > This e-mail may contain confidential and/or privileged information. If
> you
> > > are not the intended recipient (or have received this e-mail in error)
> > > please notify the sender immediately and delete this e-mail. Any
> > > unauthorized copying, disclosure or distribution of the material in
> this
> > > e-mail is strictly forbidden.
> > >
> > > Please refer to https://www.db.com/disclosures for additional EU
> > > corporate and regulatory disclosures and to
> > > http://www.db.com/unitedkingdom/content/privacy.htm for information
> about
> > > privacy.
> > >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Tibor,
Please let me share a personal opinion.
To move this conversation forward, I would kindly ask you to refrain from judgements and speculation about our project. Speaking on behalf of the community carries a certain responsibility, after all. I guess your knowledge of our platform, its architecture, use cases, requirements and infrastructure is not that extensive. In general, judgement and speculation without a basis is very thin ice, on which it is easy to lose credibility. Thanks for sharing with us such important concepts as microservices, NoSQL and all the other important words. I believe that was done with good intentions, not with the intention to insult.

Second, as Maven users, we came to the community with (a) our case and (b) a proposal. Telling users that their case is wrong, irrelevant, etc. is counterproductive. Forcing all customers into your own vision is a perfect way to make a product stagnate. Ignoring the cases that customers bring to you is a way to miss opportunities for product growth.

It would be productive to focus on our needs and how Maven could address them. Another constructive input would be guidance on a proper feature implementation and on next steps. Speculating about the project does not help at all and is not the topic we are interested in.

Thank you
Aleks

On 2019/09/14 20:37:03, Tibor Digana <ti...@apache.org> wrote: 
> Hello Maximilian,
>
> So now the next step is to break the traditional Maven dependencies and
> isolate the services behind web services, e.g. JAX-RS or JAX-WS, so that
> you would not "touch" the POMs.
> You need to use Logstash, Kibana, Elasticsearch and Zipkin, because the
> logs won't be aggregated without such tools.
> This would require you to spend some time developing automated deployment
> and reliable CI.
>
> The monolith would then exist at the infrastructure level but not at the
> code level.
> There you can write integration tests in every service. The input XML/JSON
> received from another service can be mocked with mock data. The service and
> its project, as well as the tests, stay isolated at the project level.
> The tests become documentation, and the data (XML/JSON) becomes a
> specification for another team.
> In this setup each piece of functionality ends up in the right place.
> Shared data is no longer used as a workaround. Sharing something can
> easily happen in a monolithic project.
>
> The worst situation is when you share a database between the services,
> because then you really do have to deploy many services.
> One way, for instance, is an architecture with one NoSQL database per
> webapp and an RDBMS holding the master data.
> Each webapp has its own NoSQL database.
> The services would then read only from their NoSQL database and write to
> the RDBMS master data, with JMS streaming the data back to the NoSQL
> databases via a data/event bus.
>
> It is more about infrastructure and this kind of isolation.
> Since every app has an isolated database, not all services have to change
> just because a new feature requires a database migration to new tables and
> relations.
> The probability of a change in any given service would be smaller.
>
> Then you have got DDD and CQRS, but not Event Sourcing, or only partially.
> 
> Cheers
> Tibor17



Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
Hello Maximilian,

So now the next step is to break the traditional Maven dependencies and
isolate the services behind web services, e.g. JAX-RS or JAX-WS, so that
you would not "touch" the POMs.
You need to use Logstash, Kibana, Elasticsearch and Zipkin, because the
logs won't be aggregated without such tools.
This would require you to spend some time developing automated deployment
and reliable CI.

The monolith would then exist at the infrastructure level but not at the
code level.
There you can write integration tests in every service. The input XML/JSON
received from another service can be mocked with mock data. The service and
its project, as well as the tests, stay isolated at the project level.
The tests become documentation, and the data (XML/JSON) becomes a
specification for another team.
In this setup each piece of functionality ends up in the right place.
Shared data is no longer used as a workaround. Sharing something can
easily happen in a monolithic project.

The worst situation is when you share a database between the services,
because then you really do have to deploy many services.
One way, for instance, is an architecture with one NoSQL database per
webapp and an RDBMS holding the master data.
Each webapp has its own NoSQL database.
The services would then read only from their NoSQL database and write to
the RDBMS master data, with JMS streaming the data back to the NoSQL
databases via a data/event bus.

It is more about infrastructure and this kind of isolation.
Since every app has an isolated database, not all services have to change
just because a new feature requires a database migration to new tables and
relations.
The probability of a change in any given service would be smaller.

Then you have got DDD and CQRS, but not Event Sourcing, or only partially.

Cheers
Tibor17


On Sat, Sep 14, 2019 at 9:35 PM Maximilian Novikov <
maximilian.novikov@db.com> wrote:

> Tibor,
>
> We understand your position.
>
> We moved from separate SCM repositories to a single one. We could move
> back, but we don't want to.
>
> What we like about a single SCM repository:
> 1. Atomic commits
> 2. A single point of responsibility.
> If someone makes an incompatible change in a shared library, they are
> responsible for updating all usages. At first glance this may look like it
> slows development down, but it helps us avoid accumulating technical debt.
> We never end up in a situation where projects A, B, C, D... depend on
> different versions of a shared library and we need to make a major
> upgrade, which can block the release of some apps, etc.
>
> We now release 20+ client apps and 50+ backend components every week or
> even more often. With multiple SCM repositories we would need to hire a
> team of release managers and build engineers to coordinate and support
> this.
>
> Again, we are not selling our approach. We implemented a feature that we
> were missing.
>
> PS. Just think about why commercial products like Gradle Maven Extensions
> appeared.

RE: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Maximilian Novikov <ma...@db.com>.
Tibor,

We understand your position.

We moved from separate SCM repositories to a single one. We could move back, but we don't want to.

What we like about a single SCM repository:
1. Atomic commits
2. A single point of responsibility.
If someone makes an incompatible change in a shared library, they are responsible for updating all usages. At first glance this may look like it slows development down, but it helps us avoid accumulating technical debt. We never end up in a situation where projects A, B, C, D... depend on different versions of a shared library and we need to make a major upgrade, which can block the release of some apps, etc.

We now release 20+ client apps and 50+ backend components every week or even more often. With multiple SCM repositories we would need to hire a team of release managers and build engineers to coordinate and support this.

Again, we are not selling our approach. We implemented a feature that we were missing.

PS. Just think about why commercial products like Gradle Maven Extensions appeared.


From: Tibor Digana <ti...@apache.org>
Date: Saturday, 14 Sep 2019, 9:43 PM
To: Maven Developers List <de...@maven.apache.org>
Subject: Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Alexander,
Enrico is really right. Today it is all about microservices, and there every
microservice lives in a separate SCM repo.

Microservices were only one example, but in my experience you can always
find the lower-level modules in the hierarchy which do not change very much
and segregate them into other SCM repos. Those should go through the
release process and share release versions rather than SNAPSHOT versions.

You can also find the top-level roots, which are actually the applications.
If you have 10 WAR files as a result of the build and all of them have to be
deployed, then there is a strong reason to split them into separate SCM
repos.

This separation will then guide you to isolate the middle layers into
isolated services as JAR files. And then you end up with microservices or
SOA services rather than JAR files, or you will be much closer to them. The
huge monolithic project is gone.

The whole development process will be faster and more flexible than it was
before. Just try it!

Cheers
Tibor17

On Sat, Sep 14, 2019 at 5:23 PM Alexander Ashitkin <as...@gmail.com>
wrote:

> Hi Enrico,
> Thanks for the feedback. That's a side discussion about the best approach
> to project layout. Monorepo has its own advocates, and it is easy to find
> posts describing why Google, Microsoft or Facebook go monorepo.
> Contrary to that view, we are ready to go live globally in an emergency
> scenario. If, say, a zero-day vulnerability is discovered in one of the
> low-level, widely reused core libraries, we need just one click to
> build/test/deploy and safely go live globally, with the whole estate
> updated on a scale of thousands of processes. And you know, there are
> people in the world who think that a codebase scattered across small repos
> is difficult to maintain and that snapshots are evil. It all depends.
> Honestly, I think it is a backwards approach when your build system
> defines how your software development processes work. Google had its own
> vision and simply implemented Bazel, and this is the correct approach.
> Btw, Bazel is perfect for such a scenario, but costly to migrate to for an
> existing project.
>
> So if you choose a monorepo as we did, it is normal to work on just a part
> of the project. You just need a way to deal with the scalability
> challenges:
> a) you hit hardware and infrastructure limitations and need to address
> them in some way;
> b) you need an incremental build so you can work on a subpart of the
> project but still contribute to the shared codebase.
>
> Sincerely yours, Aleks
>
> On 2019/09/14 08:41:37, Enrico Olivelli <eo...@gmail.com> wrote:
> > I feel that in general having a huge monolithic project is kind of a
> > project smell.
> > Btw, I have some big projects with 100+ modules, so I can feel your pain.
> > In day-to-day work a single developer doesn't work on all of the
> > modules; usually you touch 1-2 modules and maybe some integration/system
> > tests.
> > If you need to rebuild the full project for every change, maybe there is
> > something wrong with the overall design.
> >
> > I think you have your motivations for your layout, so let's talk about
> > your proposal.
> >
> > If you have a way to split your project into subsystems, you can use a
> > shared remote repository for deploying snapshots in order to share
> > intermediate results with other developers.
> >
> > If your goal is to be ready for releases, I don't get your point.
> > Usually you work with snapshots, and for a release you have to rebuild
> > the full codebase once and only once in order to ensure that you have a
> > consistent build of the project.
> > With all these temporary caches, how do you ensure that the final
> > artifacts are the intended ones?
> >
> >
> > Side note: this is not a real VOTE thread.
> >
> > Just my 2 cents
> >
> > Don't get me wrong, I admire your will to improve the Maven ecosystem
> > with this cool feature! Thank you for contributing your work. We will
> > try to get the best.
> >
> > Enrico
> >
> > On Sat, Sep 14, 2019, 08:29 Laird Nelson <lj...@gmail.com> wrote:
> >
> > > On Fri, Sep 13, 2019 at 11:01 PM Alexander Ashitkin <
> > > alexander.ashitkin@db.com> wrote:
> > >
> > > > This feature is a true incremental build – you don’t build modules
> > > > which were not changed at all and build only modified/changed ones.
> > > >
> > >
> > > Suppose module B depends on module A and I change A.  Does B get
> > > rebuilt in your system?
> > >
> > > Best,
> > > Laird


---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tibor Digana <ti...@apache.org>.
Alexander,
Enrico is really right. Today it is microservices, and there every
microservice is in a separate SCM repo.

Microservices were only an example, but in my experience you
can always find the lower-bound modules in the hierarchy which do not change
so much and segregate them into other SCM repos. Those should undergo the
release process, share release versions and avoid sharing SNAPSHOT
versions.

You can find the top roots, which are actually applications. If you have 10
WAR files as a result of the build and all of them should be deployed, then
there is a strong reason to separate them into separate SCM repos.

Then this separation concept will guide you to isolate the middle layers
into isolated services as JAR files. And then you end up with microservices
or SOA services, not JAR files, or you will be much closer to them. The huge
monolithic project is gone.

The whole development process will be faster and more flexible than it was
before. Just try!

Cheers
Tibor17

On Sat, Sep 14, 2019 at 5:23 PM Alexander Ashitkin <as...@gmail.com>
wrote:

> Hi Enrico,
> Thanks for the feedback. The best approach to project layout is a side
> discussion. Monorepo has its own advocates, and it is easy to find posts
> describing why Google, Microsoft or Facebook go monorepo.
> Whatever the way of thought, we are ready to go live globally in an
> emergency scenario. If, say, a zero-day vulnerability is discovered in some
> low-level, widely reused core library, we need just one click to
> build/test/deploy and safely go live globally, with the whole estate
> updated on a scale of thousands of processes. And you know, there are
> people in the world who think that a codebase scattered across small repos
> is difficult to maintain and that snapshots are evil. It all depends.
> Honestly, I think it is a kind of reversed approach when your build system
> defines how your software development processes work. Google has its own
> vision and just implemented Bazel, and this is the correct approach. Btw,
> Bazel is perfect for such a scenario, but costly to migrate to for an
> existing project.
>
> So if you choose a monorepo as we did, it is normal to work on just a part
> of the project. You just need a way to deal with scalability challenges:
> a) you hit hardware and infrastructure limitations and need to address
> them in some way.
> b) you need an incremental build so you can work on a subpart of the
> project but contribute to the shared codebase.
>
> Sincerely yours, Aleks

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <as...@gmail.com>.
Hi Enrico,
Thanks for the feedback. The best approach to project layout is a side discussion. Monorepo has its own advocates, and it is easy to find posts describing why Google, Microsoft or Facebook go monorepo.
Whatever the way of thought, we are ready to go live globally in an emergency scenario. If, say, a zero-day vulnerability is discovered in some low-level, widely reused core library, we need just one click to build/test/deploy and safely go live globally, with the whole estate updated on a scale of thousands of processes. And you know, there are people in the world who think that a codebase scattered across small repos is difficult to maintain and that snapshots are evil. It all depends.
Honestly, I think it is a kind of reversed approach when your build system defines how your software development processes work. Google has its own vision and just implemented Bazel, and this is the correct approach. Btw, Bazel is perfect for such a scenario, but costly to migrate to for an existing project.

So if you choose a monorepo as we did, it is normal to work on just a part of the project. You just need a way to deal with scalability challenges:
a) you hit hardware and infrastructure limitations and need to address them in some way.
b) you need an incremental build so you can work on a subpart of the project but contribute to the shared codebase.

Sincerely yours, Aleks

On 2019/09/14 08:41:37, Enrico Olivelli <eo...@gmail.com> wrote: 
> I feel that in general having a huge monolithic project is kind of a
> project smell.
> Btw I have some big projects with 100+ modules, so I can see your pain.
> In day-to-day work a single developer doesn't work on all of the
> modules; usually you touch 1-2 modules and maybe some integration/system
> tests.
> If you need to rebuild the full project for every change, maybe there is
> something wrong with the overall design.
>
> I think you have your motivations for your layout, so let's talk about your
> proposal.
>
> If you have a way to split your project into subsystems, you can use some
> shared remote repository for deploying snapshots in order to share
> intermediate results with other developers.
>
> If your goal is to be ready for releases, I don't get your point. Usually
> you work with snapshots, and for a release you have to rebuild the full
> codebase once and only once in order to ensure that you have a consistent
> build of the project.
> With all of these temporary caches, how do you ensure that the final
> artifacts are the intended ones?
>
>
> Side note: this is not a real VOTE thread
>
> Just my 2 cents
>
> Don't get me wrong, I admire your will to improve the Maven ecosystem with
> this cool feature! Thank you for contributing your work. We will try to get
> the best
>
> Enrico

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Enrico Olivelli <eo...@gmail.com>.
I feel that in general having a huge monolithic project is kind of a
project smell.
Btw I have some big projects with 100+ modules, so I can see your pain.
In day-to-day work a single developer doesn't work on all of the
modules; usually you touch 1-2 modules and maybe some integration/system
tests.
If you need to rebuild the full project for every change, maybe there is
something wrong with the overall design.

I think you have your motivations for your layout, so let's talk about your
proposal.

If you have a way to split your project into subsystems, you can use some
shared remote repository for deploying snapshots in order to share
intermediate results with other developers.

If your goal is to be ready for releases, I don't get your point. Usually
you work with snapshots, and for a release you have to rebuild the full
codebase once and only once in order to ensure that you have a consistent
build of the project.
With all of these temporary caches, how do you ensure that the final
artifacts are the intended ones?


Side note: this is not a real VOTE thread

Just my 2 cents

Don't get me wrong, I admire your will to improve the Maven ecosystem with
this cool feature! Thank you for contributing your work. We will try to get
the best

Enrico

On Sat, Sep 14, 2019, 08:29 Laird Nelson <lj...@gmail.com> wrote:

> On Fri, Sep 13, 2019 at 11:01 PM Alexander Ashitkin <
> alexander.ashitkin@db.com> wrote:
>
> > This feature is true incremental build – you don’t build modules which
> > were not changed at all and build only modified/changed ones.
> >
>
> Suppose module B depends on module A and I change A.  Does B get rebuilt in
> your system?
>
> Best,
> Laird
>

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Laird Nelson <lj...@gmail.com>.
On Fri, Sep 13, 2019 at 11:01 PM Alexander Ashitkin <
alexander.ashitkin@db.com> wrote:

> This feature is true incremental build – you don’t build modules which
> were not changed at all and build only modified/changed ones.
>

Suppose module B depends on module A and I change A.  Does B get rebuilt in
your system?

Best,
Laird

RE: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Alexander Ashitkin <al...@db.com>.
Hi
Yes, we tried, but Takari is a bit of a different story – it’s a smarter scheduler which gives you some boost over the default lifecycle scheduler, but it still requires you to build your modules.
This feature is a true incremental build – you don’t build modules which were not changed at all, and you build only the modified/changed ones. The required build state for skipped modules is restored from the cache.
So for our 600 modules the build time is down to 1 minute from ~40 minutes, and even a single-threaded build benefits from the cache. Takari just doesn’t do that.

Kindly yours
Aleks

From: Tamás Cservenák [mailto:tamas@cservenak.net]
Sent: Friday, September 13, 2019 4:54 PM
To: Maven Developers List <de...@maven.apache.org>
Cc: Alexander Ashitkin <al...@db.com>
Subject: Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Hi there,

Just a shot in the dark: have you tried any of the existing stuff, like Takari Lifecycle, before modding Maven itself? (http://takari.io/book/40-lifecycle.html)

Thanks,
T

On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <ma...@db.com> wrote:
Hi All,

We want to create an upstream change to Maven to support true incremental builds for big-sized projects.
To raise a pull request we have to pass a long chain of Deutsche Bank’s internal procedures. So, before starting the process, we would like to get your feedback regarding this feature.

Motivation:

Our project is hosted in a mono-repo and contains ~600 modules. All modules have the same SNAPSHOT version.
There is a lot of test automation around this; everything is tested before merging into the release branch.

The current setup helps us simplify build/release/dependency management for the 10+ teams that contribute to the codebase. We can release everything in 1 click.
The major drawback of such an approach is build time: a full local build took 45-60 min (-T8), a CI build ~25 min (-T16).

To speed up our build we needed 2 features: incremental build and a shared cache.
Initially we started to think about migrating to Gradle or Bazel. As the migration costs for those tools were too high, we decided to add similar functionality to Maven.

Current results: 1-2 min for a local build (-T8) if the build was cached by CI, CI build ~5 min (-T16).

Feature description:

The idea is to calculate a checksum for the inputs and save the outputs in a cache.
Each node's checksum is calculated from:


•         Effective POM hash

•         Sources hash

•         Dependencies hash (dependencies within multi-module project)
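
To illustrate the checksum scheme above, here is a minimal Python sketch (not the actual Maven change). It assumes SHA-256 and a made-up in-memory project layout; module_checksum, demo_modules and the dictionary keys are hypothetical names chosen only for illustration. Because a module's checksum folds in the checksums of its in-project dependencies, a change in A also changes the checksum of every module that depends on A.

```python
import hashlib


def sha256_hex(*parts):
    """Hash an ordered sequence of strings into one hex digest."""
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def module_checksum(name, modules, memo=None):
    """Combine effective POM, sources and in-project dependency checksums."""
    memo = {} if memo is None else memo
    if name in memo:
        return memo[name]
    mod = modules[name]
    pom_hash = sha256_hex(mod["effective_pom"])
    # Sort by path so the result does not depend on dict ordering.
    src_parts = []
    for path, content in sorted(mod["sources"].items()):
        src_parts += [path, content]
    sources_hash = sha256_hex(*src_parts)
    # Recursively hash dependencies that live in the same multi-module project.
    dep_hashes = [module_checksum(d, modules, memo) for d in sorted(mod["deps"])]
    deps_hash = sha256_hex(*dep_hashes)
    memo[name] = sha256_hex(pom_hash, sources_hash, deps_hash)
    return memo[name]


def demo_modules(a_source="class A {}"):
    """Hypothetical two-module project in which B depends on A."""
    return {
        "A": {"effective_pom": "<project>A</project>",
              "sources": {"src/main/java/A.java": a_source},
              "deps": []},
        "B": {"effective_pom": "<project>B</project>",
              "sources": {"src/main/java/B.java": "class B {}"},
              "deps": ["A"]},
    }


if __name__ == "__main__":
    before = demo_modules()
    after = demo_modules(a_source="class A { int x; }")
    for name in ("A", "B"):
        changed = module_checksum(name, before) != module_checksum(name, after)
        print(name, "checksum changed:", changed)
```

In this sketch only A's source is edited, yet B's checksum changes as well, so B would not be served from the cache.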

Project source inputs are searched for inside the project plus all paths from the plugin configuration.

How does it work in practice:



1.       CI: runs builds and stores outputs in the shared cache

2.       CI: reuses outputs for the same inputs, so build time decreases

3.       Locally: when I check out a branch and run ‘install’ for the whole project, I get all up-to-date snapshots from the remote cache for this branch

4.       Locally: if I change multiple modules in the tree, only the changed subtree is rebuilt
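
The four steps above can be sketched as a cache lookup keyed by module name and input checksum. This is only an illustration under stated assumptions: run_build, the dict standing in for the shared cache, and the checksum strings are hypothetical, and the checksums are assumed to be precomputed in such a way that a change in a module also changes the checksums of its dependents.

```python
def run_build(checksums, order, cache, build_module):
    """Restore cached outputs where the checksum matches; build the rest.

    checksums:    module name -> input checksum (assumed precomputed)
    order:        module names in dependency order
    cache:        dict standing in for the local/remote cache,
                  keyed by (module name, checksum)
    build_module: callable that really builds a module and returns its output
    """
    outputs, rebuilt = {}, []
    for name in order:
        key = (name, checksums[name])
        if key in cache:
            # Cache hit: reuse the stored output and skip the build.
            outputs[name] = cache[key]
        else:
            # Cache miss: build, then store the output for later runs.
            outputs[name] = build_module(name)
            cache[key] = outputs[name]
            rebuilt.append(name)
    return outputs, rebuilt


if __name__ == "__main__":
    shared_cache = {}
    order = ["core", "api", "app"]  # hypothetical modules: app -> api -> core

    def build(name):
        return name + ".jar"

    # Steps 1-2: CI builds everything once and fills the shared cache.
    ci_inputs = {"core": "c1", "api": "a1", "app": "p1"}
    _, rebuilt = run_build(ci_inputs, order, shared_cache, build)
    print("CI run rebuilt:", rebuilt)

    # Step 3: a fresh checkout with identical inputs rebuilds nothing.
    _, rebuilt = run_build(ci_inputs, order, shared_cache, build)
    print("clean checkout rebuilt:", rebuilt)

    # Step 4: a local change in "api" changes its checksum and that of "app".
    local_inputs = {"core": "c1", "api": "a2", "app": "p2"}
    _, rebuilt = run_build(local_inputs, order, shared_cache, build)
    print("after local change rebuilt:", rebuilt)
```

A real cache would store build artifacts and plugin state rather than strings, but the hit/miss decision has the same shape.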

The impact on the current Maven codebase is very localized (MojoExecutor, where we injected a cache controller).
Caching can be activated/deactivated by a property, so the current Maven flow will work as is.

And the big plus is that you don’t need to rework your current project. Caching should work out of the box; you just need to add config in the .mvn folder.

Please let us know what you think. We are ready to invest in this feature and address any further feedback.

Kind regards,
Max



---
The European Commission has established a European online dispute resolution platform (OS platform) under http://ec.europa.eu/consumers/odr/. Consumers may use the OS platform to resolve disputes arising from online contracts with providers established in the EU.

Please refer to https://www.db.com/disclosures for information (including mandatory corporate particulars) on selected Deutsche Bank branches and group companies registered or incorporated in the European Union. This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Posted by Tamás Cservenák <ta...@cservenak.net>.
Hi there,

Just a shot in the dark: have you tried any of the existing stuff, like
Takari Lifecycle, before modding Maven itself?
(http://takari.io/book/40-lifecycle.html)

Thanks,
T

On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
maximilian.novikov@db.com> wrote:

> Hi All,
>
>
>
> *We want to create upstream change to Maven* to support true incremental
> build for big-sized projects.
>
> To raise a pull request we have to pass long chain of Deutsche Bank’s
> internal procedures. So, *before starting the process we would like to
> get your feedback regarding this feature*.
>
>
>
> *Motivation:*
>
>
>
> Our project is hosted in mono-repo and contains ~600 modules. All modules
> has the same SNAPSHOT version.
>
> There are lot of test automation around this, everything is tested before
> merge into release branch.
>
>
>
> Current setup helps us to simplify build/release/dependency management for
> 10+ teams those contribute into codebase. We can release everything in
> 1-click.
>
> The major drawback of such approach is build time: *full local build took
> 45-60 min (-T8), CI build ~25min (-T16)*.
>
>
>
> To speed-up our build we needed 2 features: incremental build and shared
> cache.
>
> Initially we started to think about migration to Gradle or Bazel. As
> migration costs for the mentioned tools were too high, we decided to add
> similar functionality into Maven.
>
>
>
> Current results we get: *1-2 mins for local build (-T8) if build was
> cached by CI, CI build ~5 mins (-T16).*
>
>
>
> *Feature description:*
>
>
>
> The idea is to calculate checksum for inputs and save outputs in cache.
>
> [image: image2019-8-27_20-0-14.png]
>
> Each node checksum calculated with:
>
>
>
> ·         Effective POM hash
>
> ·         Sources hash
>
> ·         Dependencies hash (dependencies within multi-module project)
>
>
>
> Project sources inputs are searched inside project + all paths from
> plugins configuration:
>
> [image: image2019-8-30_10-28-56.png]
>
> How does it work in practice:
>
>
>
> 1.       CI: runs builds and stores outputs in shared cache
>
> 2.       CI: reuse outputs for same inputs, so time is decreasing
>
> 3.       Locally: when I checkout branch and run ‘install’ for whole
> project, I get all actual snapshots from remote cache for this branch
>
> 4.       Locally: if I change multiple modules in tree, only changed
> subtree is rebuilt
>
>
>
> Impact on current Maven codebase is very localized (MojoExecutor, where we
> injected cache controller).
>
> Caching can be activated/deactivated by property, so current maven flow
> will work as is.
>
>
>
> And the big plus is that you don’t need to re-work your current project.
> Caching should work out of box, just need to add config in .mvn folder.
>
>
>
> Please let us know what do you think. We are ready to invest in this
> feature and address any further feedback.
>
>
>
> Kind regards,
>
> Max
>
>
>
>
> ---
> This e-mail may contain confidential and/or privileged information. If you
> are not the intended recipient (or have received this e-mail in error)
> please notify the sender immediately and delete this e-mail. Any
> unauthorized copying, disclosure or distribution of the material in this
> e-mail is strictly forbidden.
>
> Please refer to https://www.db.com/disclosures for additional EU
> corporate and regulatory disclosures and to
> http://www.db.com/unitedkingdom/content/privacy.htm for information about
> privacy.
>