Posted to dev@maven.apache.org by Hervé BOUTEMY <he...@free.fr> on 2019/09/22 23:52:48 UTC

Reproducible Builds for Maven

after a few years of testing, thinking, procrastination and hard work (thank you Thomas for your talk at Devoxx France 2016 [1]), I think I achieved a key step this week-end toward native Reproducible Builds with Maven [2]: Maven core itself can be built in a reproducible way!

It means that if you build the "reproducible" branch of Maven core, you'll get the same apache-maven-3.6.3-SNAPSHOT-bin.zip as me or the ASF CI server [3].
The precise result depends only on 2 key facts:
- do you build on Windows or any Unix? This impacts newlines...
- what JDK major version do you use to build? This affects the generated .class files (note: AFAIK the JDK minor version does not have any impact, nor does the platform)

This branch is only a PoC: it uses unreleased packaging plugins that give reproducible results (versions in .RB-SNAPSHOT), and I had to tweak the build a little bit for remaining reproducibility issues with sisu and plexus plugins.
There are many details to decide before releasing these plugins and making every build reproducible by default.
But the current step proves that it is feasible.

Interested in joining the effort to bring this feature to releases for end users?

Regards,

Hervé


[1] http://zlika.github.io/presentations/devoxx_fr_2016/reproducible-builds/slides_fr.html

[2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74682318

[3] https://builds.apache.org/view/M-R/view/Maven/job/maven-box/job/maven/job/reproducible/



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: Reproducible Builds for Maven

Posted by Emmanuel Bourg <eb...@apache.org>.
Le 23/09/2019 à 01:52, Hervé BOUTEMY a écrit :
> after a few years of testing, thinking, procrastination and hard work (thank you Thomas for your talk at Devoxx France 2016 [1]), I think I achieved a key step this week-end toward native Reproducible Builds with Maven [2]: Maven core itself can be built in a reproducible way!

An important milestone on a very long journey. Well done!

Emmanuel Bourg



Re: Reproducible Builds for Maven

Posted by Mark Derricutt <ma...@talios.com>.
On 28 Sep 2019, at 18:21, Hervé BOUTEMY wrote:

> I'll shortly start a discussion on a choice we need to make together: how to configure reproducible builds (the property name, and the value/format of the current source-date-epoch defined in the PoC)

Now that support for this has been released as part of the source/jar plugins, but not yet the release plugin, I whipped up this morning a quick plugin to use in the meantime:

  https://github.com/talios/reproducible-maven-plugin

Basically, it is meant to be included in the release plugin's preparationGoals/completionGoals, using the apply and clear goals respectively.

So far, based on very quick, rudimentary tests, this seems to work well (although I've not yet done it as part of a release, only using apply and then building/checking my jars).

I'd be keen on any feedback...

Interestingly, whilst the class files INSIDE the jar have the correct date, the jar itself has today's - which makes sense, but I wonder if it shouldn't be the same time for one extra step of "the exact same output"...

Mark



---
"The ease with which a change can be implemented has no relevance at all to whether it is the right change for the (Java) Platform for all time." -- Mark Reinhold.

Mark Derricutt
http://www.theoryinpractice.net
http://www.chaliceofblood.net
http://plus.google.com/+MarkDerricutt
http://twitter.com/talios
http://facebook.com/mderricutt

Re: Reproducible Builds for Maven

Posted by Hervé BOUTEMY <he...@free.fr>.
last updates:
- tar.gz archives are now also reproducible (in addition to .zip)
- src archives are also built and reproducible (note that the result is the same on every JDK version of a platform. Note 2: if you don't get the same result as CI, check that you don't have IDE configuration files that went into your local source archives...)
- artifacts built on ASF CI are available, for people to download and compare if you get a different result:
https://builds.apache.org/view/M-R/view/Maven/job/maven-box/job/maven/job/reproducible/lastSuccessfulBuild/artifact/org/apache/maven/apache-maven/3.6.3-SNAPSHOT/

I'll shortly start a discussion on a choice we need to make together: how to configure reproducible builds (the property name, and the value/format of the current source-date-epoch defined in the PoC)
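
For readers unfamiliar with the source-date-epoch idea: reproducible-builds.org defines a SOURCE_DATE_EPOCH environment variable holding an integer number of seconds since the Unix epoch (UTC). The sketch below only illustrates turning such a value into the timestamp tuple used for zip entries. It is not the PoC's implementation, and how Maven will expose this value (property name, format) is exactly what is still undecided here. The function name and the clamping rule are assumptions made for the illustration.

```python
import os
import time

# The zip format cannot store dates before 1980, so earlier values are clamped.
ZIP_EPOCH_FLOOR = (1980, 1, 1, 0, 0, 0)


def zip_date_time(environ=os.environ):
    """Return a (year, month, day, hour, minute, second) tuple in UTC
    derived from SOURCE_DATE_EPOCH, or None when the variable is unset."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return None
    stamp = tuple(time.gmtime(int(raw))[:6])
    return max(stamp, ZIP_EPOCH_FLOOR)
```

A build tool would pass the returned tuple to every archive entry it writes instead of using the current time.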

Once this decision is made, we can start releasing packaging plugins that support "native" reproducible builds
https://reproducible-builds.org/

Regards,

Hervé



Re: Reproducible Builds for Maven

Posted by Vladimir Sitnikov <si...@gmail.com>.
Hervé>Maven core itself can be built in a reproducible way!

This is nice, thank you for pushing this.
It would be really great if Maven plugins defaulted to reproducible
artifacts as well (e.g. maven-jar produced a jar with sorted entries and
fixed timestamps).
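
To make "sorted entries and fixed timestamps" concrete, here is a minimal sketch using Python's standard zipfile module. It is not what maven-jar-plugin does internally; the fixed date and the function name are arbitrary choices for the illustration. Note that the compressed bytes can still vary between zlib versions, which this sketch does not address.

```python
import io
import zipfile

# Arbitrary fixed timestamp applied to every entry in this sketch.
FIXED_DATE_TIME = (2019, 9, 22, 0, 0, 0)


def build_archive(files):
    """Build a zip from a {name: bytes} mapping, independent of the
    mapping's iteration order and of the current time."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # stable entry order
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3  # pin "Unix" so the header does not vary by build OS
            archive.writestr(info, files[name])
    return buffer.getvalue()
```

Two calls with the same content produce byte-identical archives on the same machine, whatever order the files were collected in.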

Hervé>Interested in joining the effort to bring this feature to releases
for end users?

+1.
Does that include reproducible source releases?

It would help to verify them and vote on releases as well, as everybody
could build their own copy of Maven and check if the resulting archive is
the same as the one that is proposed for the release.
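
As a rough sketch of that verification step, a voter could hash the locally built archive and the proposed one and compare the digests. The helper names below are hypothetical; only hashlib and pathlib from the standard library are used.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with Path(path).open("rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(local, published):
    """True when both files have the same SHA-512 digest."""
    return sha512_of(local) == sha512_of(published)
```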

Hervé>- do you build on Windows or any Unix? This impacts newlines...

Can the build enforce certain newlines? For instance, *.sh scripts have to
be LF no matter what the build OS is.
The same goes for *.cmd which have to be CRLF and so on.
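
A minimal sketch of what such enforcement could look like, assuming a simple suffix-based policy (the table and function name are hypothetical, and this is neither JMeter's nor Maven's actual code):

```python
# Hypothetical policy: file suffix -> line ending required in the archive.
EOL_BY_SUFFIX = {".sh": b"\n", ".cmd": b"\r\n"}


def normalize_eol(name, content):
    """Rewrite line endings according to the file's suffix.
    Files with no matching suffix are returned unchanged."""
    for suffix, eol in EOL_BY_SUFFIX.items():
        if name.endswith(suffix):
            # Collapse CRLF and lone CR to LF first, then expand to the target.
            unified = content.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
            return unified.replace(b"\n", eol)
    return content
```

Applied while packaging, this would make the archive content independent of how the checkout converted newlines.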

I've implemented reproducible builds for Apache JMeter, and one more
variable is "executable bit".
Git on Windows often produces "executable readme.txt" and things like that,
so JMeter's build script enforces executable bits for the resulting
archives.
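
The idea can be sketched as follows: instead of copying whatever permission bits the checkout has, the packaging step derives the mode from a rule. The rule below (only *.sh is executable) and the helper names are assumptions for the illustration, not JMeter's actual build logic.

```python
import io
import stat
import zipfile


def normalized_mode(name):
    """Hypothetical rule: only shell scripts carry the executable bit."""
    return 0o755 if name.endswith(".sh") else 0o644


def add_entry(archive, name, data):
    """Add an entry whose permissions come from the rule, not the filesystem."""
    info = zipfile.ZipInfo(name, date_time=(1980, 1, 1, 0, 0, 0))
    info.create_system = 3  # Unix: the high 16 bits of external_attr hold the mode
    info.external_attr = (stat.S_IFREG | normalized_mode(name)) << 16
    archive.writestr(info, data)
```

With this, an "executable readme.txt" in a Windows checkout no longer changes the resulting archive.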

One more variable is "the state of the Maven local repository". In other words,
if I `mvn install` a library (e.g. while hacking on commons-io), then the Maven
build would pick that library from my .m2 folder, and it would be included
in the Maven build, which is bad for reproducibility.
That could be prevented by pgpverify-maven-plugin, which could verify
that dependencies have the expected PGP signatures.
My own commons-io build would obviously not have the proper PGP signature,
so Maven could warn me like "hey, you're using an unexpected commons-io.jar".

I've tried mvn install -DskipTests on my macOS machine, and the result
differs.

Here's output of mvn clean install -DskipTests -Papache-release:
[INFO] --- checksum-maven-plugin:1.7:files (source-release-checksum) @ apache-maven ---
[INFO] apache-maven-3.6.3-SNAPSHOT-bin.tar.gz - SHA-512 : 4cc59e85d811dc9c900bd4b527db7327f8790b408004d254f6b270bddd4137aa4e9d87733ba2fb14e80fcf31d28ac24874d344ade7746598546bc36d156befa8
[INFO] apache-maven-3.6.3-SNAPSHOT-src.zip - SHA-512 : b7dabdfd8d3b3f4afdea70e46b0be1c33177a487cffeddd325769aa97c121b3983b54f99c35f05ba91d4d6885953b88c681254a833f77c2ab119e71fa56f258e
[INFO] apache-maven-3.6.3-SNAPSHOT-bin.zip - SHA-512 : 5f23cb830991babd610edeeefc3dbb08b01f1d83be19edeb206bb7ea5844bc8d390ecae0c95328cab73c3f957f997ac9cd64cfe35f4ff9097bbe0af17ad567c2
[INFO] apache-maven-3.6.3-SNAPSHOT-src.tar.gz - SHA-512 : a6cea689fefaf07bbeb5dc070b6ebeada7d1590aec7fcebb067ea22dca72cd52580eee99200a87090e8035a9b1b37217e1d4fb556a74312c510a85c3c274e2ee

$ git describe --long --all
heads/reproducible-0-g2d45315a8

$ mvn -X install -DskipTests
Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T21:33:14+03:00)
Maven home: /nix/store/wqsccql2a0v048cd5l95f2hfp0l7rz6i-apache-maven-3.5.4/maven
Java version: 1.8.0_121, vendor: Azul Systems, Inc., runtime: /nix/store/89rm62lkhnnd1a0fadfg1z426wnangy2-zulu1.8.0_121-8.20.0.5/jre
Default locale: ru_RU, platform encoding: UTF-8
OS name: "mac os x", version: "10.14.6", arch: "x86_64", family: "mac"

Just in case, I've attached (see maven-3.6.3-checksums.zip) the contents
and the checksums of the files I get.

Vladimir

Re: Reproducible Builds for Maven

Posted by Hervé BOUTEMY <he...@free.fr>.
Le mardi 24 septembre 2019, 02:28:15 CEST Mark Derricutt a écrit :
> Tomo Suzuki wrote on 23/09/19 3:56 PM:
> > Does your approach use such file to record library versions?
> 
> I don't know about what Hervé is doing,
I added an "out of scope" paragraph: managing version ranges in a stable way is not in the scope
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74682318#Reproducible/VerifiableBuilds-Outofscope


> but internally we have a tool I
> wrote for handling this, we have a pom.deps file that looks like:
nice separation of concerns: stable versions chosen vs updates controlled with ranges

with such an approach, version ranges could become something I love :)

Regards,

Hervé

> 
>      repository http://nexus.XXXXXX as public;
> 
>      import smx3:smx3.upstream.bill-of-materials:1.1.22;
> 
>      resolve highest org.jetbrains:annotations:[16.0.3,17.0.0) via public;
>      resolve highest
> org.apache.maven.plugins:maven-jar-plugin:[3.1.2,4.0.0) via public;
>      resolve highest org.apache.cxf:cxf-codegen-plugin:[3.3.3,4.0.0) via
> public;
> 
> which when we resolve, will find the highest, snapshot, or lowest
> version in a given range - also allowing filtering out annoying things
> like beta/alpha/CR from central, and rewriting the pom.xml's.
> 
> Our tooling also has an 'import' option shown above that lets us
> standardize the versions we resolving, and breaking it up - so we have
> 'upstream.bill-of-materials' and 'upstream.testing.bill-of-materials`.
> 
> As part of this we also add in <exclusions> to ban all transitive build
> deps, and [] range all version references.
> 
> I keep meaning to push for open sourcing it, but just haven't had the time.
> 
> Mark







Re: Reproducible Builds for Maven

Posted by Tomo Suzuki <su...@google.com.INVALID>.
Cool. Thanks.

Re: Reproducible Builds for Maven

Posted by Mark Derricutt <ma...@talios.com>.
On 24 Sep 2019, at 23:37, Tomo Suzuki wrote:

> versions, rather than ranges. Would you share the background why your tool
> records the ranges?

The full example is at:

  https://github.com/HalBuilder/halbuilder-support-4.x/blob/master/pom.deps

It resolves the locked down versions, but also retains the desired ranges for controlled updates.

We tend to keep ranges between major versions, i.e. [1.0.0,2.0.0) for a semblance of semver.
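
To illustrate what resolving the highest version in a half-open range such as [1.0.0,2.0.0) means, here is a small sketch. It is not the tool described in this thread and not Maven's real version comparison: it assumes purely numeric dotted versions and ignores qualifiers such as -beta or .Final. The function names are hypothetical.

```python
def parse(version):
    """Split a purely numeric dotted version into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def resolve_highest(candidates, lower, upper):
    """Return the highest candidate in [lower, upper), or None if none fits."""
    in_range = [
        version
        for version in candidates
        if parse(lower) <= parse(version) < parse(upper)
    ]
    return max(in_range, key=parse, default=None)
```

Recording the resolved version next to the range gives both a stable build input and a controlled way to move forward later.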

When I re-resolve the bill of materials, I'll often look at the git diff to see which libraries have been updated to new versions, and decide which (and when) we pull in to use - often committing those changes individually.




Re: Reproducible Builds for Maven

Posted by Mark Derricutt <ma...@talios.com>.
Oh right! I missed the second half of our pom.deps file:

    blacklisted org.hibernate:hibernate-ehcache:4.2.19.Final from smx3:smx3.bill-of-materials:2.1.1;
    deprecated org.hibernate:hibernate-search-engine:4.4.0.Final from smx3:smx3.bill-of-materials:2.1.1;
    resolved org.hibernate:hibernate-search-orm:4.4.0.Final from smx3:smx3.bill-of-materials:2.1.1;
    locked org.hibernate:hibernate-validator:4.3.1.Final from smx3:smx3.upstream.bill-of-materials:1.0.16;

I provide 4 types of resolution: blacklisted, deprecated, resolved and
locked. Deprecated/blacklisted means the build will fail unless you
specifically opt in to deprecated/blacklisted dependencies.

On resolve, anything that's changed moves from `locked` to either 
resolved/deprecated/blacklisted, and on release anything resolved goes 
to locked.

In the end user projects, we "import" the bill of materials .deps 
artifact which pulls in any locked version, but I also have support for 
adding:

    allow unlocked /^.*someregex.*$/;

which will trigger a re-resolve, which we use for our own internal 
artifacts, so we pick up the latest internal releases IF we re-resolve.

The whole process seems to work well. It does cause some annoying
round-trip releases or quirks now and then, but for the most part it has
really solved a heap of issues where we've needed to make a quick
backport fix in a production environment.

Since we publish the pom.deps file for each artifact, when doing the 
backports, we simply import the deps used in the distribution version 
we're patching, and manually allow any updated deps that we need to fix.

Tomo Suzuki wrote:
>
> For reproducible builds, I expected the lock file contains specific
> versions, rather than ranges. Would you share the background why your tool
> records the ranges?

Re: Reproducible Builds for Maven

Posted by Tomo Suzuki <su...@google.com.INVALID>.
Hi Mark,

Thank you for your response.

> resolve highest org.jetbrains:annotations:[16.0.3,17.0.0) via public;

For reproducible builds, I expected the lock file to contain specific
versions, rather than ranges. Would you share the background on why your
tool records the ranges?


-- 
Regards,
Tomo

Re: Reproducible Builds for Maven

Posted by Mark Derricutt <ma...@talios.com>.
Tomo Suzuki wrote on 23/09/19 3:56 PM:
> Does your approach use such file to record library versions?
I don't know about what Hervé is doing, but internally we have a tool I 
wrote for handling this, we have a pom.deps file that looks like:

     repository http://nexus.XXXXXX as public;

     import smx3:smx3.upstream.bill-of-materials:1.1.22;

     resolve highest org.jetbrains:annotations:[16.0.3,17.0.0) via public;
     resolve highest org.apache.maven.plugins:maven-jar-plugin:[3.1.2,4.0.0) via public;
     resolve highest org.apache.cxf:cxf-codegen-plugin:[3.3.3,4.0.0) via public;

which when we resolve, will find the highest, snapshot, or lowest 
version in a given range - also allowing filtering out annoying things 
like beta/alpha/CR from central, and rewriting the pom.xml's.

Our tooling also has an 'import' option, shown above, that lets us
standardize the versions we are resolving and break them up, so we have
'upstream.bill-of-materials' and 'upstream.testing.bill-of-materials'.

As part of this we also add in <exclusions> to ban all transitive build 
deps, and [] range all version references.

I keep meaning to push for open sourcing it, but just haven't had the time.

Mark



Re: Reproducible Builds for Maven

Posted by Hervé BOUTEMY <he...@free.fr>.
Le lundi 23 septembre 2019, 05:56:06 CEST Tomo Suzuki a écrit :
> Sounds nice!
don't hesitate to build for yourself, check that you get the same sha512 and 
report: this will help me either confirm "it works", or find little remaining 
issues.

> 
> > The precise result depends only on 2 key facts
> 
> When I hear “reproducible builds”, I think of  lock files that remember
> library versions used.
> Gradle’s approach:
> https://docs.gradle.org/current/userguide/dependency_locking.html
> 
> Does your approach use such file to record library versions?
no, we don't need such a lock file since we don't use version ranges: the 
dependency resolution is already stable

Here, "Reproducible builds are a set of software development practices that 
create an independently-verifiable path from source to binary code."
see https://reproducible-builds.org/

For Java, one key non-reproducible aspect for example is the timestamp of zip 
entries in jar files.

Regards,

Hervé 

> 
> Regards,
> Tomo







Re: Reproducible Builds for Maven

Posted by Tomo Suzuki <su...@google.com.INVALID>.
Sounds nice!

> The precise result depends only on 2 key facts

When I hear “reproducible builds”, I think of lock files that remember
library versions used.
Gradle’s approach:
https://docs.gradle.org/current/userguide/dependency_locking.html

Does your approach use such file to record library versions?

Regards,
Tomo


