Posted to dev@commons.apache.org by Gilles <gi...@harfang.homelinux.org> on 2014/12/28 19:46:54 UTC

Re: [Math] What's in a release

Hi.

On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
> On 28/12/2014 00:22, sebb wrote:
>> On 27 December 2014 at 22:19, Gilles <gi...@harfang.homelinux.org> 
>> wrote:
>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>
>>>> On 24 December 2014 at 15:11, Gilles 
>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>
>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>
>>>>>>
>>>>>> On 24/12/2014 15:04, Gilles wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4 from
>>>>>>>>>> release candidate 3.
>>>>>>>>>>
>>>>>>>>>> Tag name:
>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>   'git tag -v')
>>>>>>>>>>
>>>>>>>>>> Tag URL:
>>>>>>>>>>   <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is there a way to check that the source code referred to above
>>>>>>>>> was the one used to create the JAR of the ".class" files?
>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF file
>>>>>>>> embedded in the jar. The second-to-last entry is called
>>>>>>>> Implementation-Build. It is automatically created by
>>>>>>>> maven-jgit-buildnumber-plugin and contains the SHA1 identifier
>>>>>>>> of the last commit used for the build. Here, it is
>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check it
>>>>>>>> really corresponds to the expected status of the git repository.
>>>>>>>>
>>>>>>>
>>>>>>> Can this be considered "secure", i.e. can't this entry in the
>>>>>>> MANIFEST file be modified to be the checksum of the repository
>>>>>>> but with the .class files being substituted with those coming
>>>>>>> from another compilation?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Modifying anything in the jar (either this entry within the
>>>>>> manifest or any class) will modify the jar signature. So as long
>>>>>> as people do check the global MD5, SHA1 or gpg signature we
>>>>>> provide with our build, they are safe to assume the artifacts are
>>>>>> Apache artifacts.
>>>>>>
>>>>>> This is not different from how releases are done with subversion
>>>>>> as the source code control system, or even in C or C++ as the
>>>>>> language. At some point, the release manager does perform a
>>>>>> compilation and the fellow reviewers check the result. There is
>>>>>> no foolproof process here, as always when security is involved.
>>>>>> Even using an automated build and automatic signing on an Apache
>>>>>> server would involve trust (i.e. one should assume that the
>>>>>> server has not been tampered with, that the build process really
>>>>>> does what it is expected to do, that the artifacts put to review
>>>>>> are really the ones created by the automatic process ...).
>>>>>>
>>>>>> Another point is that what we officially release is the source,
>>>>>> which can be reviewed by external users. The binary parts are
>>>>>> merely a convenience.
>>>>>
>>>>>
>>>>>
>>>>> That's an interesting point to come back to since it looks like
>>>>> the most time-consuming part of a release is not related to the
>>>>> sources!
>>>>>
>>>>> Isn't it conceivable that a release could just be a commit
>>>>> identifier and a checksum of the repository?
>>>>>
>>>>> If the binaries are just a convenience, why put so much effort
>>>>> into them? As a convenience, the artefacts could be produced
>>>>> after the release, accompanied by all the "caveat" notes which
>>>>> you mentioned.
>>>>>
>>>>> That would certainly increase the release rate.
>>>>
>>>>
>>>> Binary releases still need to be reviewed to ensure that the
>>>> correct N & L (NOTICE and LICENSE) files are present, and that the
>>>> archives don't contain material with disallowed licenses.
>>>>
>>>> It's not unknown for automated build processes to include files
>>>> that should not be present.
>>>>
>>>
>>> I fail to see the difference of principle between the "release"
>>> context and, say, the daily snapshot context.
>>
>> Snapshots are not (and should not be) promoted to the general public
>> as releases of the ASF.
>>
>>> What I mean is that there seems to be a contradiction between
>>> saying that a "release" is only about _source_ and the obligation
>>> to check _binaries_.
>>
>> There is no contradiction here.
>> The ASF releases source; it is required in a release.
>> Binaries are optional.
>> That does not mean that the ASF mirror system can be used to
>> distribute arbitrary binaries.
>>
>>> It can occur that disallowed material is, at some point in time,
>>> part of the repository and/or the snapshot binaries.
>>> However, what is forbidden is... forbidden, at all times.
>>
>> As with most things, this is not a strict dichotomy.
>>
>>> If it is indeed a problem to distribute forbidden material,
>>> shouldn't this be corrected in the repository? [That's indeed what
>>> you did with the blocking of the release.]
>>
>> If the repo is discovered to contain disallowed material, it needs
>> to be removed.
>>
>>> Then again, once the repository is "clean", it can be tagged and
>>> that tagged _source_ is the release.
>>
>> Not quite.
>>
>> A release is a source archive that is voted on and distributed via
>> the ASF mirror system.
>> The contents must agree with the source tag, but the source tag is
>> not the release.
>>
>>> Non-compliant binaries would thus only be the result of a "mistake"
>>> (if the build system is flawed, it's another problem, unrelated to
>>> the released contents, which is _source_) to be corrected per se.
>>
>> Not so. There are other failure modes.
>>
>> An automated build obviously reduces the chances of mistakes, but it
>> can still create an archive containing files that should not be
>> there. [Or indeed, omit files that should be present.]
>> For example, the workspace contains spurious files which are
>> implicitly included by the assembly instructions.
>> Or the build process creates spurious files that are incorrectly
>> added to the archive.
>> Or the build incorrectly includes jars that are supposed to be
>> provided by the end user,
>> etc.
>>
>> I have seen all the above in RC votes.
>> There are probably other failure modes.
>>
>>> My proposition is that it's an independent step: once the build
>>> system is adjusted to the expectations, "correct" binaries can be
>>> generated from the same tagged release.
>>
>> It does not matter when the binary is built.
>> If it is distributed by the PMC as a formal release, it must not
>> contain any surprises, e.g. it must be licensed under the AL.
>>
>> It is therefore vital that the contents are as expected from the
>> build.
>>
>> Note also that a formal release becomes an act of the PMC by the
>> voting process.
>> The ASF can then assume responsibility for any legal issues that may
>> arise.
>> Otherwise it is entirely the personal responsibility of the person
>> who releases it.
>
> I think the last two points are really important: binaries must be
> checked and the foundation provides a legal protection for the
> project if something weird occurs.
>
> I also think another point is important: many if not most users do
> really expect binaries and not source. From our internal Apache point
> of view, these are a by-product. For many others it is the important
> thing. This is especially true in maven land as dependencies are
> automatically retrieved in binary form, not source form. So the maven
> central repository as a distribution system is important.
>
> Even if for some security reason it sounds at first thought logical
> to rely on source only and compile oneself, in an industrial context
> project teams do not have enough time to do it for all their
> dependencies, so they use binaries provided by trusted third parties.
> A long time ago, I compiled a lot of free software tools for the
> department I worked for at that time. I do not do this anymore, and
> trust the binaries provided by the packaging team for a distribution
> (typically Debian). They do rely on source and compile themselves.
> Hey, I even think Emmanuel here belongs to the Debian java team ;-) I
> guess such teams that do rely on source are rather the exception than
> the rule. The other examples I can think of are packaging teams,
> development teams that need bleeding edge (and will also directly
> depend on the repository, not even the release), projects that need
> to introduce their own patches and people who have critical needs
> (for example when safety of people is concerned or when they need
> full control for legal or contractual reasons). Many other people
> download binaries directly and would simply not consider using a
> project if it is not readily available: they don't have time for this
> and don't want to learn how to build tens or hundreds of different
> projects they simply use.
>

I do not disagree with anything said on this thread. [In particular, I
did not at all imply that any one committer could take responsibility
for releasing unchecked items.]

I'm simply suggesting that what is called the release process/management
could be made simpler (and _consequently_ could lead to more regularly
releasing the CM code), by separating the concerns.
The concerns are
  1. "code" (the contents), and
  2. "artefacts" (the result of the build system acting on the "code").

Checking one of these is largely independent from checking the other.
[All the more so since, as you said, no foolproof link between the two
can be ensured: from a security POV, checking the former requires a code
review, while using the latter requires trust in the build system.]
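
To make the limits of that link concrete, here is a minimal sketch (in
Python, purely illustrative and not part of the CM build) of the manifest
check Luc described. It assumes the jar carries an Implementation-Build
entry whose value contains the commit SHA1; the function names are
hypothetical. A match only shows what the manifest claims, not that the
.class files were really compiled from that commit.

```python
import zipfile


def read_implementation_build(jar_path):
    """Return the Implementation-Build value from a jar's manifest, or None."""
    with zipfile.ZipFile(jar_path) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    # Long manifest lines are continued on the next line, which then
    # starts with a single space; undo that folding before parsing.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        name, sep, value = line.partition(": ")
        if sep and name == "Implementation-Build":
            return value.strip()
    return None


def built_from_commit(jar_path, expected_sha1):
    """True if the manifest claims the jar was built from expected_sha1."""
    value = read_implementation_build(jar_path)
    # Assumption: the entry may hold more than the bare SHA1, so test
    # for containment rather than equality.
    return value is not None and expected_sha1 in value
```

For the RC under vote, a reviewer would call built_from_commit() with the
path where the jar was saved and the identifier
befd8ebd96b8ef5a06b59dccb22bd55064e31c34.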

Thus we could release the "code", after checking and voting on the
concerned elements (i.e. the repository state corresponding to a
specific tag + the web site).

Then we could release the "binaries", as a convenience, after checking
and voting on the concerned elements (i.e. the files about to be
distributed).
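
Part of that second check could be automated. The following is a rough
sketch in Python, under assumptions of mine: the required file names
(LICENSE.txt, NOTICE.txt) and the list of allowed extensions are
placeholders to be adapted, and review_archive is a hypothetical helper,
not an existing tool. It reports required files that are missing and
entries that look out of place, i.e. some of the failure modes sebb
listed.

```python
import posixpath
import zipfile

# Assumed names and extensions; adjust to what the project really ships.
REQUIRED_FILES = {"LICENSE.txt", "NOTICE.txt"}
ALLOWED_EXTENSIONS = (".class", ".MF", ".txt", ".xml", ".properties")


def review_archive(archive_path):
    """Return (missing_required, unexpected_entries) for a jar or zip."""
    with zipfile.ZipFile(archive_path) as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    present = {posixpath.basename(n) for n in names}
    missing = sorted(REQUIRED_FILES - present)
    unexpected = sorted(n for n in names if not n.endswith(ALLOWED_EXTENSIONS))
    return missing, unexpected
```

Two empty lists mean that nothing was flagged; that does not replace a
human look at the contents, in particular for licensing questions.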

I think that it's an added flexibility that would, for example, allow
us to tag the repository without necessarily releasing binaries (i.e.
not involving that part of the work), and to release binaries (say, at
regular intervals) based on the latest tagged code (i.e. not involving
the work about solving/evaluating/postponing issues).

[I completely admit that, at first, it might look a little more
confusing for the plain user, but (IIUC) it would be a better
representation of the reality covered by stating that the ASF
releases source code.]


Best regards,
Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Mon, 29 Dec 2014 16:21:05 +0100, Thomas Neidhart wrote:
> On 12/29/2014 04:21 AM, Phil Steitz wrote:
>> On 12/28/14 11:46 AM, Gilles wrote:
>>> I'm simply suggesting that what is called the release
>>> process/management could be made simpler (and _consequently_ could
>>> lead to more regularly releasing the CM code), by separating the
>>> concerns.
>>> The concerns are
>>>  1. "code" (the contents), and
>>>  2. "artefacts" (the result of the build system acting on the
>>>     "code").
>>>
>>> Checking of one of these is largely independent from checking the
>>> other.
>>
>> Unfortunately, not really.  One principle that we have (maybe not
>> crystal clear in the release doco) is that when we do distribute
>> binaries, they should really be "convenience binaries" which means
>> that everything needed to create them is in the source or its
>> documented dependencies.  What that means is that what we tag as the
>> source release needs to be able to generate any binaries that we
>> subsequently release.  The only way to really test that is to
>> generate the binaries and inspect them as part of verifying the
>> release.
>>
>> As others have pointed out, anything we release has to be verified
>> and voted on.  As RM and reviewer, I think it is actually easier to
>> roll and verify source and binaries together.
>
> Personally, I do not think that the RM tasks are that much work or
> cumbersome, once you have done them a few times.
>
> The bigger problem I see is related to the voting process, as there
> are many people looking at a release from very different POVs and
> finding problems that an RM (or single developer of a component) may
> not be aware of or able to test himself, thus delaying the release
> process a lot.

My proposal is an attempt to relieve that precise problem a little.

> A more automated way of creating and especially testing the
> correctness of releases would help here.

Checking the signed tag, as advertised by Luc, is a step in that
direction. If we allow source releases, then it's done: a reviewer can
be sure that the code on his machine is the one provided by the RM.
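
For illustration, a minimal sketch of that check in Python, assuming that
git is installed and that the RM's public key is in the reviewer's
keyring. The helper names are hypothetical; only "git tag -v" and
"git rev-parse" are actual git commands.

```python
import subprocess


def tag_verify_command(tag):
    """Build the command that checks the GPG signature of a tag."""
    return ["git", "tag", "-v", tag]


def tag_is_verified(tag, repo_dir="."):
    """True if git reports a good signature for the tag in repo_dir."""
    result = subprocess.run(
        tag_verify_command(tag), cwd=repo_dir, capture_output=True, text=True
    )
    return result.returncode == 0


def tag_commit(tag, repo_dir="."):
    """Return the SHA1 of the commit that the tag points to."""
    result = subprocess.run(
        ["git", "rev-parse", tag + "^{commit}"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

In a clone of the repository, tag_is_verified("MATH_3_4_RC3") followed by
comparing tag_commit("MATH_3_4_RC3") with the identifier given in the
[VOTE] message would cover the source side of the review.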

Gilles

>
> Thomas
>
>> Phil




Re: [Math] What's in a release

Posted by Thomas Neidhart <th...@gmail.com>.
On 12/29/2014 04:21 AM, Phil Steitz wrote:
> On 12/28/14 11:46 AM, Gilles wrote:
>> Hi.
>>
>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>> On 28/12/2014 00:22, sebb wrote:
>>>> On 27 December 2014 at 22:19, Gilles
>>>> <gi...@harfang.homelinux.org> wrote:
>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>>
>>>>>> On 24 December 2014 at 15:11, Gilles
>>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>>
>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>>>>>>>>>>> from release
>>>>>>>>>>>> candidate 3.
>>>>>>>>>>>>
>>>>>>>>>>>> Tag name:
>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>>> 'git tag
>>>>>>>>>>>> -v')
>>>>>>>>>>>>
>>>>>>>>>>>> Tag URL:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Is there a way to check that the source code referred to
>>>>>>>>>>> above
>>>>>>>>>>> was the one used to create the JAR of the ".class" files.
>>>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MS
>>>>>>>>>> file embedded
>>>>>>>>>> in the jar. The second-to-last entry is called
>>>>>>>>>> Implementation-Build.
>>>>>>>>>> It
>>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
>>>>>>>>>> and contains
>>>>>>>>>> the SHA1 identifier of the last commit used for the build.
>>>>>>>>>> Here, is is
>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>>> it really
>>>>>>>>>> corresponds to the expected status of the git repository.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>>>>>>>> the MANIFEST
>>>>>>>>> file be modified to be the checksum of the repository but
>>>>>>>>> with the
>>>>>>>>> .class
>>>>>>>>> files being substitued with those coming from another
>>>>>>>>> compilation?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Modifying anything in the jar (either this entry within the
>>>>>>>> manifest or
>>>>>>>> any class) will modify the jar signature. So as long as
>>>>>>>> people do check
>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>>>>>>> build, they
>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>>
>>>>>>>> This is not different from how releases are done with
>>>>>>>> subversion as the
>>>>>>>> source code control system, or even in C or C++ as the
>>>>>>>> language. At one
>>>>>>>> time, the release manager does perform a compilation and the
>>>>>>>> fellow
>>>>>>>> reviewers check the result. There is no fullproof process
>>>>>>>> here, as
>>>>>>>> always when security is involved. Even using an automated
>>>>>>>> build and
>>>>>>>> automatic signing on an Apache server would involve trust
>>>>>>>> (i.e. one
>>>>>>>> should assume that the server has not been tampered with,
>>>>>>>> that the build
>>>>>>>> process really does what it is expected to do, that the
>>>>>>>> artifacts put to
>>>>>>>> review are really the one created by the automatic process
>>>>>>>> ...).
>>>>>>>>
>>>>>>>> Another point is that what we officially release is the
>>>>>>>> source, which
>>>>>>>> can be reviewed by external users. The binary parts are
>>>>>>>> merely a
>>>>>>>> convenience.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That's an interesting point to come back to since it looks
>>>>>>> like the
>>>>>>> most time-consuming part of a release is not related to the
>>>>>>> sources!
>>>>>>>
>>>>>>> Isn't it conceivable that a release could just be a commit
>>>>>>> identifier
>>>>>>> and a checksum of the repository?
>>>>>>>
>>>>>>> If the binaries are a just a convenience, why put so much
>>>>>>> effort in it?
>>>>>>> As a convenience, the artefacts could be produced after the
>>>>>>> release,
>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>>
>>>>>>> That would certainly increase the release rate.
>>>>>>
>>>>>>
>>>>>> Binary releases still need to be reviewed to ensure that the
>>>>>> correct N
>>>>>> & L files are present, and that the archives don't contain
>>>>>> material
>>>>>> with disallowed licenses.
>>>>>>
>>>>>> It's not unknown for automated build processes to include
>>>>>> files that
>>>>>> should not be present.
>>>>>>
>>>>>
>>>>> I fail to see the difference of principle between the "release"
>>>>> context
>>>>> and, say, the daily snapshot context.
>>>>
>>>> Snapshots are not (should not) be promoted to the general public as
>>>> releases of the ASF.
>>>>
>>>>> What I mean is that there seem to be a contradiction between
>>>>> saying that
>>>>> a "release" is only about _source_ and the obligation to check
>>>>> _binaries_.
>>>>
>>>> There is no contradiction here.
>>>> The ASF releases source, they are required in a release.
>>>> Binaries are optional.
>>>> That does not mean that the ASF mirror system can be used to
>>>> distribute arbitrary binaries.
>>>>
>>>>> It can occur that disallowed material is, at some point in
>>>>> time, part of
>>>>> the repository and/or the snapshot binaries.
>>>>> However, what is forbidden is... forbidden, at all times.
>>>>
>>>> As with most things, this is not a strict dichotomy.
>>>>
>>>>> If it is indeed a problem to distribute forbidden material,
>>>>> shouldn't
>>>>> this be corrected in the repository? [That's indeed what you
>>>>> did with
>>>>> the blocking of the release.]
>>>>
>>>> If the repo is discovered to contain disallowed material, it
>>>> needs to
>>>> be removed.
>>>>
>>>>> Then again, once the repository is "clean", it can be tagged
>>>>> and that
>>>>> tagged _source_ is the release.
>>>>
>>>> Not quite.
>>>>
>>>> A release is a source archive that is voted on and distributed
>>>> via the
>>>> ASF mirror system.
>>>> The contents must agree with the source tag, but the source tag
>>>> is not
>>>> the release.
>>>>
>>>>> Non-compliant binaries would thus only be the result of a
>>>>> "mistake"
>>>>> (if the build system is flawed, it's another problem, unrelated to
>>>>> the released contents, which is _source_) to be corrected per se.
>>>>
>>>> Not so. There are other failure modes.
>>>>
>>>> An automated build obviously reduces the chances of mistakes,
>>>> but it
>>>> can still create an archive containing files that should not be
>>>> there.
>>>> [Or, indeed, omit files that should be present.]
>>>> For example, the workspace contains spurious files which are
>>>> implicitly included by the assembly instructions.
>>>> Or the build process creates spurious files that are incorrectly
>>>> added
>>>> to the archive.
>>>> Or the build incorrectly includes jars that are supposed to be
>>>> provided by the end user
>>>> etc.
>>>>
>>>> I have seen all the above in RC votes.
>>>> There are probably other failure modes.
>>>>
>>>>> My proposition is that it's an independent step: once the build
>>>>> system is adjusted to the expectations, "correct" binaries can be
>>>>> generated from the same tagged release.
>>>>
>>>> It does not matter when the binary is built.
>>>> If it is distributed by the PMC as a formal release, it must not
>>>> contain any surprises, e.g. it must be licensed under the AL.
>>>>
>>>> It is therefore vital that the contents are as expected from the
>>>> build.
>>>>
>>>> Note also that a formal release becomes an act of the PMC by the
>>>> voting process.
>>>> The ASF can then assume responsibility for any legal issues that
>>>> may arise.
>>>> Otherwise it is entirely the personal responsibility of the
>>>> person who
>>>> releases it.
>>>
>>> I think the last two points are really important: binaries must be
>>> checked and the foundation provides a legal protection for the
>>> project
>>> if something weird occurs.
>>>
>>> I also think another point is important: many if not most users
>>> really expect binaries and not source. From our internal Apache
>>> point of view, these are a by-product. For many others, they are
>>> the important thing. This is mostly true in Maven land, as
>>> dependencies are automatically retrieved in binary form, not
>>> source form. So the Maven Central repository is important as a
>>> distribution system.
>>>
>>> Even if for some security reason it sounds at first thought
>>> logical to
>>> rely on source only and compile oneself, in an industrial context
>>> project teams do not have enough time to do it for all their
>>> dependencies, so they use binaries provided by trusted third
>>> parties. A
>>> long time ago, I compiled a lot of free software tools for the
>>> department I worked for at that time. I do not do this anymore, and
>>> trust the binaries provided by the packaging team for a distribution
>>> (typically Debian). They do rely on source and compile
>>> themselves. Hey,
>>> I even think Emmanuel here belongs to the Debian java team ;-) I
>>> guess
>>> such teams that do rely on source are rather the exception than the
>>> rule. The other examples I can think of are packaging teams,
>>> development teams that need bleeding edge (and will also directly
>>> depend on the repository, not even the release), projects that
>>> need to
>>> introduce their own patches and people who have critical needs (for
>>> example when safety of people is concerned or when they need full
>>> control for legal or contractual reasons). Many other people
>>> download
>>> binaries directly and would simply not consider using a project
>>> if it
>>> is not readily available: they don't have time for this and don't
>>> want to learn how to build tens or hundreds of different projects
>>> they simply use.
>>>
>>
>> I do not disagree with anything said on this thread. [In
>> particular, I
>> did not at all imply that any one committer could take responsibility
>> for releasing unchecked items.]
>>
>> I'm simply suggesting that what is called the release
>> process/management
>> could be made simpler (and _consequently_ could lead to more
>> regularly
>> releasing the CM code), by separating the concerns.
>> The concerns are
>>  1. "code" (the contents), and
>>  2. "artefacts" (the result of the build system acting on the
>> "code").
>>
>> Checking of one of these is largely independent from checking the
>> other.
> 
> Unfortunately, not really.  One principle that we have (maybe not
> crystal clear in the release documentation) is that when we do distribute
> binaries, they should really be "convenience binaries" which means
> that everything needed to create them is in the source or its
> documented dependencies.  What that means is that what we tag as the
> source release needs to be able to generate any binaries that we
> subsequently release.  The only way to really test that is to
> generate the binaries and inspect them as part of verifying the release.
> 
> As others have pointed out, anything we release has to be verified
> and voted on.  As RM and reviewer, I think it is actually easier to
> roll and verify source and binaries together. 

Personally, I do not think that the RM tasks are that much work or
cumbersome, once you have done it a few times.

The bigger problem I see is related to the voting process, as there are
many people looking at a release from very different POVs and finding
problems that an RM (or single developer of a component) may not be aware
of or able to test for himself, thus delaying the release process a lot.

A more automated way of creating and especially testing the correctness
of releases would help here.
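
As a rough illustration of what such automated testing could look like,
here is a minimal Python sketch that inspects a jar or zip archive for
required files and for files that should not be shipped. The required
names and the forbidden suffixes are assumptions made up for this
example, not an agreed Commons policy, and check_archive is a
hypothetical helper, not an existing tool.

```python
import zipfile

# Hypothetical policy, for illustration only: file names every archive must
# contain, and suffixes that should never be shipped.
REQUIRED_NAMES = {"LICENSE.txt", "NOTICE.txt"}
FORBIDDEN_SUFFIXES = (".orig", ".rej", "~", ".DS_Store")


def check_archive(archive):
    """Return a list of human-readable problems found in a zip/jar archive.

    `archive` may be a path or a file-like object, as accepted by zipfile.
    """
    problems = []
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    # Compare on the base name so that files under META-INF/ or a top-level
    # directory are found as well.
    base_names = {name.rsplit("/", 1)[-1] for name in names}
    for required in sorted(REQUIRED_NAMES):
        if required not in base_names:
            problems.append("missing required file: " + required)
    for name in names:
        if name.endswith(FORBIDDEN_SUFFIXES):
            problems.append("unexpected file: " + name)
    return problems
```

An empty list means that none of these particular checks failed; it says
nothing about licensing of the contents, which still needs a human
review.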

Thomas

> Phil
> 
> 
>> [The more so that, as you said, no fool-proof link between the two
>> can
>> be ensured: From a security POV, checking the former requires a code
>> review, while using the latter requires trust in the build system.]
>>
>> Thus we could release the "code", after checking and voting on the
>> concerned elements (i.e. the repository state corresponding to a
>> specific tag + the web site).
>>
>> Then we could release the "binaries", as a convenience, after
>> checking
>> and voting on the concerned elements (i.e. the files about to be
>> distributed).
>>
>> I think that it's an added flexibility that would, for example, allow
>> the tagging of the repository without necessarily releasing binaries
>> (i.e. not involving that part of the work), and the release of
>> binaries (say, at regular intervals) based on the latest tagged code
>> (i.e. not involving the work of solving/evaluating/postponing issues).
>>
>> [I completely admit that, at first, it might look a little more
>> confusing for the plain user, but (IIUC) it would be a better
>> representation of the reality covered by stating that the ASF
>> releases source code.]
>>
>>
>> Best regards,
>> Gilles
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 03:05:57 +0000, sebb wrote:
> On 30 December 2014 at 02:40, Gilles <gi...@harfang.homelinux.org> 
> wrote:
>> On Tue, 30 Dec 2014 01:38:12 +0000, sebb wrote:
>>>
>>> On 30 December 2014 at 01:29, Gilles <gi...@harfang.homelinux.org> 
>>> wrote:
>>>>
>>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>>>
>>>>>
>>>>> That thread gets deep. :)
>>>>>
>>>>> I just wanted to comment on "releasing only
>>>>> source is faster because of less checks". I disagree with that, 
>>>>> most
>>>>> release delay/time is due to preparation work. Failed (binary) 
>>>>> checks
>>>>> are typically for a reason which would also be present in the 
>>>>> source
>>>>> (especially the POM), so it does not really reduce the amount of
>>>>> rework.
>>>>
>>>>
>>>>
>>>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>>>> than 15, it will objectively take less time, and this is 
>>>> compounded
>>>> by the additional tests which should (ideally) be performed by the
>>>> reviewers. [Thus delaying the release.]
>>>>
>>>>> (At least not in most cases, so two votes will actually make us
>>>>> more work not less).
>>>>
>>>>
>>>>
>>>> The additional work exactly amounts to sending _one_ additional 
>>>> mail.
>>>
>>>
>>> No.
>>>
>>> Both source and binary release need to be checked and voted on.
>>> And the votes need to be tallied, and successful releases have to 
>>> be
>>> published, and unsuccessful ones dropped.
>>
>>
>> Yes, so?
>
> I was just pointing out that separate source and binary releases are
> more than just an additional e-mail.
>
> The total effort is bigger.
> Also it's usually less efficient to split the same jobs over two
> sessions because of the extra prep.
> Which is easier - committing the same change to two files as one
> commit or two at different times?
>
>> If a certain RC would be vetoed only because of a problem with the
>> binaries?  The source could have otherwise been released.
>
> Yes, but how frequent is that?
> Usually there is an issue with the source that has caused the issue.
>
> Besides, how many Commons users want just the source?
>
>>>
>>> Checking the source release requires (for the reviewer) downloading
>>> all the artifacts and tags.
>>> If the releases are separated in time some of this may have to be
>>> repeated.
>>
>>
>> What "may have to be repeated" exactly?
>
> Downloading the tag is one step that may have to be repeated.
> I also check the sigs against the current KEYS file by loading that
> into a temp GPG keyring.
>
>> You wouldn't have to repeat whatever has been successfully voted on.
>> If source was released, you'd only have to check the binaries 
>> (signature),
>> not the repository.
>>
>>>
>>> Even for the RM role, it is more work overall.
>>>
>>>> Then, as I noted,
>>>>  * some releases will be done as before (same work)
>>>>  * some releases will be "source only" (less work)
>>>>  * some releases will be two-steps, possibly performed by two 
>>>> different
>>>>    people (i.e. less work for each RM)
>>>>
>>>> Of course, each release means some work has to be done; then IIUC 
>>>> your
>>>> point, the fewer releases the better. :-}
>>>
>>>
>>> I'm sorry, but I think you are glossing over several stages in your
>>> presentation of the process.
>>>
>>> If you really think your process is going to save work, please 
>>> detail
>>> the exact stages necessary in both cases.
>>
>>
>> Why do you see this in black or white?
>
> I'm not; I'm trying to understand the two approaches in order to
> compare them.
>
>> I never (and I repeated that several times already) intended to ask
>> that all RM perform a two-step procedure: Anyone willing to RM as 
>> usual
>> will obviously do it as he pleases.
>>
>> Every time the issue of "we should release more often" comes up, 
>> almost
>> everyone agrees.  Every time a discussion occurs on the RM issue, 
>> several
>> people complain about the complexity of the procedure.
>> I then propose something to _try_ and improve that situation 
>> (sometimes)
>> and suddenly, the current procedure is found more efficient than 
>> ever.
>
> That is not an accurate summary of what I wrote.

It is an interpretation of the consequence of what you write: no gain
whatsoever, under any circumstances.
You cannot prove that, yet you ask me to prove that there could be a
gain; it's not fair.

> I just don't see how performing the release in separate stages is
> going to help reduce the total work done.

This is black and white. My position is that, in some cases (however
rare maybe-but-we-don't-know-since-you-don't-even-want-to-try), it
might be useful to vote on a source-only release.

You gave one example (Linux distributions), I gave one (urgent
fix/feature).
Why should we argue about the overall time saved, or not, rather than
agree on the principle (even if it proves useless _most_ of the time)?

>>> This information will be needed anyway as documentation if it is 
>>> ever
>>> agreed upon.
>>
>>
>> For source-only release, the information is the same as compiled by 
>> Luc
>> (leaving out the Nexus-related steps and possibly replacing the 
>> bunch of
>> files copied to "https://dist.apache.org/repos/dist" with the 
>> tarball
>> referred to previously).
>>
>> IMO, the contradiction is obvious between talk of releasing
>> source-only and nit-picking that amounts to actually refusing to
>> consider source-only releases.
>>
>
> No, I am not nitpicking for the sake of it; I just don't understand
> what you hope to gain, given that most end users will want the
> convenience of a binary release.

Cf. above.
If some reviewers are afraid of the supposed added work, they simply
don't vote for a source release, and wait until an RM provides the
binaries so they can do their overall checks as usual.  Same work,
_exactly_.
[A flaw discovered in the "binaries" vote would prevent the release
of those artefacts and, for "convenience" users, the version would
never have existed in practice. The veto would eventually lead to a
new (bumped version) source release. Same procedure as the current
one, _exactly_.]

My proposal does not remove anything from anyone, it only adds more
possibilities.  What's wrong with that, please?


Gilles

>> Good night,
>>
>> Gilles




Re: [Math] What's in a release

Posted by sebb <se...@gmail.com>.
On 30 December 2014 at 02:40, Gilles <gi...@harfang.homelinux.org> wrote:
> On Tue, 30 Dec 2014 01:38:12 +0000, sebb wrote:
>>
>> On 30 December 2014 at 01:29, Gilles <gi...@harfang.homelinux.org> wrote:
>>>
>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>>
>>>>
>>>> That thread gets deep. :)
>>>>
>>>> I just wanted to comment on "releasing only
>>>> source is faster because of less checks". I disagree with that, most
>>>> release delay/time is due to preparation work. Failed (binary) checks
>>>> are typically for a reason which would also be present in the source
>>>> (especially the POM), so it does not really reduce the amount of
>>>> rework.
>>>
>>>
>>>
>>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>>> than 15, it will objectively take less time, and this is compounded
>>> by the additional tests which should (ideally) be performed by the
>>> reviewers. [Thus delaying the release.]
>>>
>>>> (At least not in most cases, so two votes will actually make us
>>>> more work not less).
>>>
>>>
>>>
>>> The additional work exactly amounts to sending _one_ additional mail.
>>
>>
>> No.
>>
>> Both source and binary release need to be checked and voted on.
>> And the votes need to be tallied, and successful releases have to be
>> published, and unsuccessful ones dropped.
>
>
> Yes, so?

I was just pointing out that separate source and binary releases are
more than just an additional e-mail.

The total effort is bigger.
Also it's usually less efficient to split the same jobs over two
sessions because of the extra prep.
Which is easier - committing the same change to two files as one
commit or two at different times?

> If a certain RC would be vetoed only because of a problem with the
> binaries?  The source could have otherwise been released.

Yes, but how frequent is that?
Usually there is an issue with the source that has caused the issue.

Besides, how many Commons users want just the source?

>>
>> Checking the source release requires (for the reviewer) downloading
>> all the artifacts and tags.
>> If the releases are separated in time some of this may have to be
>> repeated.
>
>
> What "may have to be repeated" exactly?

Downloading the tag is one step that may have to be repeated.
I also check the sigs against the current KEYS file by loading that
into a temp GPG keyring.
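
That keyring check could be scripted roughly as follows. This is only a
sketch under assumptions: it presumes a gpg executable on the PATH, the
function names are made up for the example, and the file names passed in
(KEYS, the artifact and its .asc signature) are whatever the reviewer
has downloaded.

```python
import subprocess
import tempfile


def gpg_commands(homedir, keys_file, signature, artifact):
    """Build the two gpg invocations: import KEYS, then verify a signature."""
    base = ["gpg", "--batch", "--homedir", homedir]
    return [
        base + ["--import", keys_file],
        base + ["--verify", signature, artifact],
    ]


def verify_with_temp_keyring(keys_file, signature, artifact):
    """Return True if gpg accepts the signature using only keys from KEYS."""
    # The temporary directory (and the keyring in it) is deleted on exit,
    # so the user's own keyring is never touched.
    with tempfile.TemporaryDirectory() as homedir:
        for command in gpg_commands(homedir, keys_file, signature, artifact):
            if subprocess.run(command).returncode != 0:
                return False
    return True
```

A successful return only shows that the signature was made by some key
listed in KEYS; whether that key is the RM's still has to be judged
separately.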

> You wouldn't have to repeat whatever has been successfully voted on.
> If source was released, you'd only have to check the binaries (signature),
> not the repository.
>
>>
>> Even for the RM role, it is more work overall.
>>
>>> Then, as I noted,
>>>  * some releases will be done as before (same work)
>>>  * some releases will be "source only" (less work)
>>>  * some releases will be two-steps, possibly performed by two different
>>>    people (i.e. less work for each RM)
>>>
>>> Of course, each release means some work has to be done; then IIUC your
>>> point, the fewer releases the better. :-}
>>
>>
>> I'm sorry, but I think you are glossing over several stages in your
>> presentation of the process.
>>
>> If you really think your process is going to save work, please detail
>> the exact stages necessary in both cases.
>
>
> Why do you see this in black or white?

I'm not; I'm trying to understand the two approaches in order to compare them.

> I never (and I repeated that several times already) intended to ask
> that all RM perform a two-step procedure: Anyone willing to RM as usual
> will obviously do it as he pleases.
>
> Every time the issue of "we should release more often" comes up, almost
> everyone agrees.  Every time a discussion occurs on the RM issue, several
> people complain about the complexity of the procedure.
> I then propose something to _try_ and improve that situation (sometimes)
> and suddenly, the current procedure is found more efficient than ever.

That is not an accurate summary of what I wrote.

I just don't see how performing the release in separate stages is
going to help reduce the total work done.

>> This information will be needed anyway as documentation if it is ever
>> agreed upon.
>
>
> For source-only release, the information is the same as compiled by Luc
> (leaving out the Nexus-related steps and possibly replacing the bunch of
> files copied to "https://dist.apache.org/repos/dist" with the tarball
> referred to previously).
>
> IMO, the contradiction is obvious between talk of releasing source-only
> and nit-picking that amounts to actually refusing to consider source-only
> releases.
>

No, I am not nitpicking for the sake of it; I just don't understand
what you hope to gain, given that most end users will want the
convenience of a binary release.

> Good night,
>
> Gilles


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 01:38:12 +0000, sebb wrote:
> On 30 December 2014 at 01:29, Gilles <gi...@harfang.homelinux.org> 
> wrote:
>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>
>>> That thread gets deep. :)
>>>
>>> I just wanted to comment on "releasing only
>>> source is faster because of less checks". I disagree with that, 
>>> most
>>> release delay/time is due to preparation work. Failed (binary) 
>>> checks
>>> are typically for a reason which would also be present in the 
>>> source
>>> (especially the POM), so it does not really reduce the amount of
>>> rework.
>>
>>
>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>> than 15, it will objectively take less time, and this is compounded
>> by the additional tests which should (ideally) be performed by the
>> reviewers. [Thus delaying the release.]
>>
>>> (At least not in most cases, so two votes will actually make us
>>> more work not less).
>>
>>
>> The additional work exactly amounts to sending _one_ additional 
>> mail.
>
> No.
>
> Both source and binary release need to be checked and voted on.
> And the votes need to be tallied, and successful releases have to be
> published, and unsuccessful ones dropped.

Yes, so?
If a certain RC would be vetoed only because of a problem with the
binaries?  The source could have otherwise been released.

>
> Checking the source release requires (for the reviewer) downloading
> all the artifacts and tags.
> If the releases are separated in time some of this may have to be 
> repeated.

What "may have to be repeated" exactly?
You wouldn't have to repeat whatever has been successfully voted on.
If source was released, you'd only have to check the binaries 
(signature),
not the repository.

>
> Even for the RM role, it is more work overall.
>
>> Then, as I noted,
>>  * some releases will be done as before (same work)
>>  * some releases will be "source only" (less work)
>>  * some releases will be two-steps, possibly performed by two 
>> different
>>    people (i.e. less work for each RM)
>>
>> Of course, each release means some work has to be done; then IIUC 
>> your
>> point, the fewer releases the better. :-}
>
> I'm sorry, but I think you are glossing over several stages in your
> presentation of the process.
>
> If you really think your process is going to save work, please detail
> the exact stages necessary in both cases.

Why do you see this in black or white?
I never (and I repeated that several times already) intended to ask
that all RM perform a two-step procedure: Anyone willing to RM as usual
will obviously do it as he pleases.

Every time the issue of "we should release more often" comes up, almost
everyone agrees.  Every time a discussion occurs on the RM issue, 
several
people complain about the complexity of the procedure.
I then propose something to _try_ and improve that situation 
(sometimes)
and suddenly, the current procedure is found more efficient than ever.

> This information will be needed anyway as documentation if it is ever
> agreed upon.

For source-only release, the information is the same as compiled by Luc
(leaving out the Nexus-related steps and possibly replacing the bunch 
of
files copied to "https://dist.apache.org/repos/dist" with the tarball
referred to previously).

IMO, the contradiction is obvious between talk of releasing source-only
and nit-picking that amounts to actually refusing to consider source-only
releases.


Good night,
Gilles




Re: [Math] What's in a release

Posted by sebb <se...@gmail.com>.
On 30 December 2014 at 01:29, Gilles <gi...@harfang.homelinux.org> wrote:
> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>
>> That thread gets deep. :)
>>
>> I just wanted to comment on "releasing only
>> source is faster because of less checks". I disagree with that, most
>> release delay/time is due to preparation work. Failed (binary) checks
>> are typically for a reason which would also be present in the source
>> (especially the POM), so it does not really reduce the amount of
>> rework.
>
>
> RM is a streamlined procedure: so, if you do (say) 10 steps rather
> than 15, it will objectively take less time, and this is compounded
> by the additional tests which should (ideally) be performed by the
> reviewers. [Thus delaying the release.]
>
>> (At least not in most cases, so two votes will actually make us
>> more work not less).
>
>
> The additional work exactly amounts to sending _one_ additional mail.

No.

Both source and binary release need to be checked and voted on.
And the votes need to be tallied, and successful releases have to be
published, and unsuccessful ones dropped.

Checking the source release requires (for the reviewer) downloading
all the artifacts and tags.
If the releases are separated in time some of this may have to be repeated.

Even for the RM role, it is more work overall.

> Then, as I noted,
>  * some releases will be done as before (same work)
>  * some releases will be "source only" (less work)
>  * some releases will be two-steps, possibly performed by two different
>    people (i.e. less work for each RM)
>
> Of course, each release means some work has to be done; then IIUC your
> point, the fewer releases the better. :-}

I'm sorry, but I think you are glossing over several stages in your
presentation of the process.

If you really think your process is going to save work, please detail
the exact stages necessary in both cases.
This information will be needed anyway as documentation if it is ever
agreed upon.

>
> Regards,
> Gilles
>
>
>>
>> Regards,
>> Bernd
>>
>>
>>
>> On Tue, 30 Dec 2014 02:05:29 +0100, Gilles <gi...@harfang.homelinux.org> wrote:
>>
>>> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
>>> > On 29 December 2014 at 10:36, Gilles <gi...@harfang.homelinux.org>
>>> > wrote:
>>> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>>> >>>
>>> >>> On 12/28/14 11:46 AM, Gilles wrote:
>>> >>>>
>>> >>>> Hi.
>>> >>>>
>>> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>> >>>>>
>>> >>>>> On 28/12/2014 00:22, sebb wrote:
>>> >>>>>>
>>> >>>>>> On 27 December 2014 at 22:19, Gilles
>>> >>>>>> <gi...@harfang.homelinux.org> wrote:
>>> >>>>>>>
>>> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On 24 December 2014 at 15:11, Gilles
>>> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>> >>>>>>>>>>>>>> from release
>>> >>>>>>>>>>>>>> candidate 3.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Tag name:
>>> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>> >>>>>>>>>>>>>> 'git tag
>>> >>>>>>>>>>>>>> -v')
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Tag URL:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Is there a way to check that the source code referred to
>>> >>>>>>>>>>>>> above
>>> >>>>>>>>>>>>> was the one used to create the JAR of the ".class"
>>> >>>>>>>>>>>>> files. [Out of curiosity, not suspicion, of course...]
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>>> >>>>>>>>>>>> file embedded in the jar. The second-to-last entry is
>>> >>>>>>>>>>>> called Implementation-Build. It is automatically created
>>> >>>>>>>>>>>> by maven-jgit-buildnumber-plugin and contains the SHA1
>>> >>>>>>>>>>>> identifier of the last commit used for the build. Here, it
>>> >>>>>>>>>>>> is befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can
>>> >>>>>>>>>>>> check it really corresponds to the expected status of the
>>> >>>>>>>>>>>> git repository.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>> >>>>>>>>>>> the MANIFEST file be modified to be the checksum of the
>>> >>>>>>>>>>> repository but with the .class files being substituted
>>> >>>>>>>>>>> with those coming from another compilation?
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> Modifying anything in the jar (either this entry within the
>>> >>>>>>>>>> manifest or
>>> >>>>>>>>>> any class) will modify the jar signature. So as long as
>>> >>>>>>>>>> people do check
>>> >>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>> >>>>>>>>>> build, they
>>> >>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>> >>>>>>>>>>
>>> >>>>>>>>>> This is not different from how releases are done with
>>> >>>>>>>>>> subversion as the
>>> >>>>>>>>>> source code control system, or even in C or C++ as the
>>> >>>>>>>>>> language. At one
>>> >>>>>>>>>> time, the release manager does perform a compilation and
>>> >>>>>>>>>> the fellow reviewers check the result. There is no
>>> >>>>>>>>>> foolproof process here, as always when security is
>>> >>>>>>>>>> involved. Even using an automated build and automatic
>>> >>>>>>>>>> signing on an Apache server would involve trust (i.e. one
>>> >>>>>>>>>> should assume that the server has not been tampered with,
>>> >>>>>>>>>> that the build process really does what it is expected to
>>> >>>>>>>>>> do, that the artifacts put to review are really the ones
>>> >>>>>>>>>> created by the automatic process...).
>>> >>>>>>>>>>
>>> >>>>>>>>>> Another point is that what we officially release is the
>>> >>>>>>>>>> source, which
>>> >>>>>>>>>> can be reviewed by external users. The binary parts are
>>> >>>>>>>>>> merely a
>>> >>>>>>>>>> convenience.
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> That's an interesting point to come back to since it looks
>>> >>>>>>>>> like the
>>> >>>>>>>>> most time-consuming part of a release is not related to the
>>> >>>>>>>>> sources!
>>> >>>>>>>>>
>>> >>>>>>>>> Isn't it conceivable that a release could just be a commit
>>> >>>>>>>>> identifier
>>> >>>>>>>>> and a checksum of the repository?
>>> >>>>>>>>>
>>> >>>>>>>>> If the binaries are just a convenience, why put so much
>>> >>>>>>>>> effort into them?
>>> >>>>>>>>> As a convenience, the artefacts could be produced after the
>>> >>>>>>>>> release,
>>> >>>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>> >>>>>>>>>
>>> >>>>>>>>> That would certainly increase the release rate.
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> Binary releases still need to be reviewed to ensure that the
>>> >>>>>>>> correct NOTICE and LICENSE (N & L) files are present, and
>>> >>>>>>>> that the archives don't contain material with disallowed
>>> >>>>>>>> licenses.
>>> >>>>>>>>
>>> >>>>>>>> It's not unknown for automated build processes to include
>>> >>>>>>>> files that
>>> >>>>>>>> should not be present.
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>> I fail to see the difference of principle between the
>>> >>>>>>> "release" context
>>> >>>>>>> and, say, the daily snapshot context.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> Snapshots are not (and should not be) promoted to the general
>>> >>>>>> public as releases of the ASF.
>>> >>>>>>
>>> >>>>>>> What I mean is that there seems to be a contradiction between
>>> >>>>>>> saying that a "release" is only about _source_ and the
>>> >>>>>>> obligation to check _binaries_.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> There is no contradiction here.
>>> >>>>>> The ASF releases source; it is required in a release.
>>> >>>>>> Binaries are optional.
>>> >>>>>> That does not mean that the ASF mirror system can be used to
>>> >>>>>> distribute arbitrary binaries.
>>> >>>>>>
>>> >>>>>>> It can occur that disallowed material is, at some point in
>>> >>>>>>> time, part of
>>> >>>>>>> the repository and/or the snapshot binaries.
>>> >>>>>>> However, what is forbidden is... forbidden, at all times.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> As with most things, this is not a strict dichotomy.
>>> >>>>>>
>>> >>>>>>> If it is indeed a problem to distribute forbidden material,
>>> >>>>>>> shouldn't
>>> >>>>>>> this be corrected in the repository? [That's indeed what you
>>> >>>>>>> did with
>>> >>>>>>> the blocking of the release.]
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> If the repo is discovered to contain disallowed material, it
>>> >>>>>> needs to
>>> >>>>>> be removed.
>>> >>>>>>
>>> >>>>>>> Then again, once the repository is "clean", it can be tagged
>>> >>>>>>> and that
>>> >>>>>>> tagged _source_ is the release.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> Not quite.
>>> >>>>>>
>>> >>>>>> A release is a source archive that is voted on and distributed
>>> >>>>>> via the
>>> >>>>>> ASF mirror system.
>>> >>>>>> The contents must agree with the source tag, but the source tag
>>> >>>>>> is not
>>> >>>>>> the release.
>>> >>>>>>
>>> >>>>>>> Non-compliant binaries would thus only be the result of a
>>> >>>>>>> "mistake"
>>> >>>>>>> (if the build system is flawed, it's another problem,
>>> >>>>>>> unrelated to
>>> >>>>>>> the released contents, which is _source_) to be corrected per
>>> >>>>>>> se.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> Not so. There are other failure modes.
>>> >>>>>>
>>> >>>>>> An automated build obviously reduces the chances of mistakes, but
>>> >>>>>> it can still create an archive containing files that should not
>>> >>>>>> be there [or indeed, omit files that should be present].
>>> >>>>>> For example, the workspace contains spurious files which are
>>> >>>>>> implicitly included by the assembly instructions.
>>> >>>>>> Or the build process creates spurious files that are incorrectly
>>> >>>>>> added to the archive.
>>> >>>>>> Or the build incorrectly includes jars that are supposed to be
>>> >>>>>> provided by the end user, etc.
>>> >>>>>>
>>> >>>>>> I have seen all the above in RC votes.
>>> >>>>>> There are probably other failure modes.
>>> >>>>>>
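The failure modes listed above can be partly screened mechanically before a vote. The following is only a minimal sketch in Java, under stated assumptions: the archive is a zip or jar, the N & L files are expected at META-INF/LICENSE.txt and META-INF/NOTICE.txt, and any bundled *.jar entry is treated as suspicious. The class and method names are hypothetical and not part of any Commons tooling.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper: flags missing N & L files and bundled jars. */
class ArchiveContentCheck {

    /** Assumed locations of the N & L files (adjust to the real layout). */
    static final List<String> REQUIRED =
        List.of("META-INF/LICENSE.txt", "META-INF/NOTICE.txt");

    /** Pure check on entry names; returns one message per problem found. */
    static List<String> findProblems(Collection<String> entryNames) {
        List<String> problems = new ArrayList<>();
        for (String required : REQUIRED) {
            if (!entryNames.contains(required)) {
                problems.add("missing: " + required);
            }
        }
        for (String name : entryNames) {
            // A jar inside the archive may be a dependency that the end
            // user was supposed to provide.
            if (name.endsWith(".jar")) {
                problems.add("bundled jar: " + name);
            }
        }
        return problems;
    }

    /** Reads the entry names of a zip/jar file from disk. */
    static List<String> entryNames(String archivePath) throws IOException {
        try (ZipFile zip = new ZipFile(archivePath)) {
            List<String> names = new ArrayList<>();
            for (ZipEntry e : Collections.list(zip.entries())) {
                names.add(e.getName());
            }
            return names;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the archive under review
        for (String problem : findProblems(entryNames(args[0]))) {
            System.out.println(problem);
        }
    }
}
```

Such a check does not replace a human review of licenses; it only catches the mechanical cases (a missing file, an unexpected jar).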
>>> >>>>>>> My proposition is that it's an independent step: once the
>>> >>>>>>> build system is adjusted to the expectations, "correct"
>>> >>>>>>> binaries can be
>>> >>>>>>> generated from the same tagged release.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> It does not matter when the binary is built.
>>> >>>>>> If it is distributed by the PMC as a formal release, it must
>>> >>>>>> not contain any surprises, e.g. it must be licensed under the
>>> >>>>>> AL.
>>> >>>>>>
>>> >>>>>> It is therefore vital that the contents are as expected from
>>> >>>>>> the build.
>>> >>>>>>
>>> >>>>>> Note also that a formal release becomes an act of the PMC by
>>> >>>>>> the voting process.
>>> >>>>>> The ASF can then assume responsibility for any legal issues
>>> >>>>>> that may arise.
>>> >>>>>> Otherwise it is entirely the personal responsibility of the
>>> >>>>>> person who
>>> >>>>>> releases it.
>>> >>>>>
>>> >>>>>
>>> >>>>> I think the last two points are really important: binaries must
>>> >>>>> be
>>> >>>>> checked and the foundation provides a legal protection for the
>>> >>>>> project
>>> >>>>> if something weird occurs.
>>> >>>>>
>>> >>>>> I also think another point is important: many if not most users
>>> >>>>> do really expect binaries and not source. From our internal Apache
>>> >>>>> point of view, these are a by-product. For many others it is the
>>> >>>>> important thing. It is mostly true in maven land as dependencies
>>> >>>>> are automatically retrieved in binary form, not source form. So
>>> >>>>> the maven central repository as a distribution system is
>>> >>>>> important.
>>> >>>>>
>>> >>>>> Even if for some security reason it sounds at first thought
>>> >>>>> logical to
>>> >>>>> rely on source only and compile oneself, in an industrial
>>> >>>>> context project teams do not have enough time to do it for all
>>> >>>>> their dependencies, so they use binaries provided by trusted
>>> >>>>> third parties. A
>>> >>>>> long time ago, I compiled a lot of free software tools for the
>>> >>>>> department I worked for at that time. I do not do this anymore,
>>> >>>>> and
>>> >>>>> trust the binaries provided by the packaging team for a
>>> >>>>> distribution
>>> >>>>> (typically Debian). They do rely on source and compile
>>> >>>>> themselves. Hey,
>>> >>>>> I even think Emmanuel here belongs to the Debian java team ;-) I
>>> >>>>> guess
>>> >>>>> such teams that do rely on source are rather the exception than
>>> >>>>> the
>>> >>>>> rule. The other examples I can think of are packaging teams,
>>> >>>>> development teams that need bleeding edge (and will also
>>> >>>>> directly depend on the repository, not even the release),
>>> >>>>> projects that need to
>>> >>>>> introduce their own patches and people who have critical needs
>>> >>>>> (for
>>> >>>>> example when safety of people is concerned or when they need
>>> >>>>> full control for legal or contractual reasons). Many other
>>> >>>>> people download binaries directly and would simply not consider
>>> >>>>> using a project if it is not readily available: they don't have
>>> >>>>> time for this and don't want to learn how to build tens or
>>> >>>>> hundreds of different projects they simply use.
>>> >>>>>
>>> >>>>
>>> >>>> I do not disagree with anything said on this thread. [In
>>> >>>> particular, I
>>> >>>> did not at all imply that any one committer could take
>>> >>>> responsibility
>>> >>>> for releasing unchecked items.]
>>> >>>>
>>> >>>> I'm simply suggesting that what is called the release
>>> >>>> process/management
>>> >>>> could be made simpler (and _consequently_ could lead to more
>>> >>>> regularly
>>> >>>> releasing the CM code), by separating the concerns.
>>> >>>> The concerns are
>>> >>>>  1. "code" (the contents), and
>>> >>>>  2. "artefacts" (the result of the build system acting on the
>>> >>>> "code").
>>> >>>>
>>> >>>> Checking of one of these is largely independent from checking the
>>> >>>> other.
>>> >>>
>>> >>>
>>> >>> Unfortunately, not really.  One principle that we have (maybe not
>>> >>> crystal clear in the release doco) is that when we do distribute
>>> >>> binaries, they should really be "convenience binaries" which means
>>> >>> that everything needed to create them is in the source or its
>>> >>> documented dependencies.  What that means is that what we tag as
>>> >>> the
>>> >>> source release needs to be able to generate any binaries that we
>>> >>> subsequently release.  The only way to really test that is to
>>> >>> generate the binaries and inspect them as part of verifying the
>>> >>> release.
>>> >>
>>> >>
>>> >> Only way?  That's certainly not obvious to me: Since a tag/branch
>>> >> uniquely identifies a set of files, that is, the "source release
>>> >> [that is] able to generate any binaries that we subsequently
>>> >> release", if an RM can do it at (source) release time, he (or
>>> >> someone else!) can do it later, too (by running the build from a
>>> >> clone of the repository in its tagged state).
>>> >>
>>> >>> As others have pointed out, anything we release has to be verified
>>> >>> and voted on.  As RM and reviewer, I think it is actually easier
>>> >>> to roll and verify source and binaries together.
>>> >>
>>> >
>>> > +1
>>> >
>>> >>
>>> >> It's precisely my main point.
>>> >> I won't dispute that you can prefer doing both (and nobody would
>>> >> forbid an RM to do just that) but the point is about the possibility
>>> >> to release source-only code (as the first step of a two-step
>>> >> procedure which I described earlier).
>>> >> [IMHO, the two-step one seems easier (both for the RM and the
>>> >> reviewer), (mileage does vary).]
>>> >
>>> > What is easier?
>>> > It seems to me there will be at least one other step in your
>>> > proposed process, i.e. a second VOTE e-mail
>>>
>>> Yes, that's obviously what I meant:
>>> Two steps == two votes
>>>
>>> [But: source releases need not necessarily be accompanied with
>>> "binaries", which, I imagine, could lead to official releases
>>> occurring more often (due to the reduced number of checks).]
>>>
>>> > These will both contain most of the same information.
>>>
>>> No.
>>> The first step is about the source, i.e. the code which humans create.
>>> The second step is about the files which a build system creates.
>>>
>>> As I indicated previously, the first vote will be about a set of
>>> reviewers being satisfied with the state of the source code, while
>>> the second vote will be about another set of reviewers being satisfied
>>> with the results of the build system ("no glitch", as you described
>>> in an earlier message).
>>>
>>> > Is the intention to announce the source release separately from the
>>> > binary release?
>>> > If so, there will need to be 2 announce mails, and 2 updates to the
>>> > download page.
>>>
>>> Is there a problem with that?
>>> There are actually several possible cases (depending on the will of
>>> the RM):
>>>   * one-step release (only source code)
>>>   * two-steps (source, then binaries based on that source)
>>>   * combined (as is done up to now)
>>>   * binaries (based on any previously released source)
>>>
>>> >> In short is it forbidden (by the official/legal rules of ASF) to
>>> >> proceed
>>> >> as I propose?
>>> >
>>> > Dunno, depends on what exactly you are proposing.
>>>
>>> Cf. above (and previous mails).
>>>
>>> In practice the release could (IIUC) be like the link provided
>>> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
>>> repository).
>>>
>>>
>>> >> Is it technically impossible?
>>> >
>>> > Currently the Maven build process creates:
>>> > - Maven source and binary jars
>>> > - ASF source and binary bundles
>>>
>>> AFAIU, the JARs (source and binary) are "binaries", the binary
>>> bundles are "binaries". Only the ASF source is "source".
>>>
>>> > It's not clear to me what exactly you propose to release in stage
>>> > one,
>>>
>>> The ASF source (e.g. in the form of a tarball, or the appropriate
>>> "git clone" command).
>>>
>>> > but there will need to be some changes to the process in order to
>>> > release just the ASF source.
>>>
>>> I don't see which.
>>> A "source RM" would just stop the process after resolving/postponing
>>> the pending issues, and checking the various reports about the source
>>> code. [Then create the tag, and request a vote.]
>>>
>>> A "binary RM" would take on from that point (a tagged repository),
>>> i.e. create all the binaries, sign them, etc.
>>>
>>> > There is no point releasing the Maven source jars separately from
>>> > the binary jars; they are not complete as they only contain java
>>> > files for use with IDEs.
>>>
>>> I don't understand that.
>>> In principle, a JAR with the Java sources is indeed the necessary and
>>> sufficient condition for users to create the executable bytecode, with
>>> whatever build system they wish.
>>> But I agree that it's not useful to not release all the files needed
>>> to easily run maven. [And, for convenience, a source release would be
>>> accompanied with instructions on how to build a JAR of the compiled
>>> classes, using maven.]
>>>
>>> > But in any case, AFAIK it is very tricky to release new files into
>>> > an existing Maven folder, and it may cause problems for end users.
>>>
>>> I don't understand what you mean by "release new files into an
>>> existing Maven folder"...
>>>
>>> Gilles
>>>
>>> >>
>>> >>
>>> >>> Phil
>>> >>>
>>> >>>
>>> >>>> [The more so that, as you said, no fool-proof link between the
>>> >>>> two can
>>> >>>> be ensured: From a security POV, checking the former requires a
>>> >>>> code
>>> >>>> review, while using the latter requires trust in the build
>>> >>>> system.]
>>> >>>>
>>> >>>> Thus we could release the "code", after checking and voting on
>>> >>>> the concerned elements (i.e. the repository state corresponding
>>> >>>> to a specific tag + the web site).
>>> >>>>
>>> >>>> Then we could release the "binaries", as a convenience, after
>>> >>>> checking
>>> >>>> and voting on the concerned elements (i.e. the files about to be
>>> >>>> distributed).
>>> >>>>
>>> >>>> I think that it's an added flexibility that would, for example,
>>> >>>> allow the tagging of the repository without necessarily releasing
>>> >>>> binaries (i.e. not involving that part of the work); and to
>>> >>>> release binaries (say, at regular intervals) based on the latest
>>> >>>> tagged code (i.e. not involving the work about
>>> >>>> solving/evaluating/postponing issues).
>>> >>>>
>>> >>>> [I completely admit that, at first, it might look a little more
>>> >>>> confusing for the plain user, but (IIUC) it would be a better
>>> >>>> representation of the reality covered by stating that the ASF
>>> >>>> releases source code.]
>>> >>>>
>>> >>>>
>>> >>>> Best regards,
>>> >>>> Gilles
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 02:36:24 +0100, Bernd Eckenfels wrote:
> Hello,
>
> Am Tue, 30 Dec 2014 02:29:38 +0100
> schrieb Gilles <gi...@harfang.homelinux.org>:
>
>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>> > That thread gets deep. :)
>> >
>> > I just wanted to comment on "releasing only source is faster
>> > because of fewer checks". I disagree with that: most release
>> > delay/time is due to preparation work. Failed (binary) checks are
>> > typically for a reason which would also be present in the source
>> > (especially the POM), so it does not really reduce the amount of
>> > rework.
>>
>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>> than 15, it will objectively take less time, and this is compounded
>> by the additional tests which should (ideally) be performed by the
>> reviewers. [Thus delaying the release.]
>
> The problem is not the small additional time for the last 5 steps but
> the large time for redoing all steps (on veto).

That's not my experience. [I particularly hated to have to delete
manually some files inside Nexus: Wrong click, and up for another
round... :-/]
Moreover, most of the initial tasks are shared between the active
contributors (committing pending patches, cleaning up code, evaluating
pending issues).  And the collective effort is hopefully triggered by
the prospect of the release.

>> > (At least not in most cases, so two votes will actually make us
>> > more work not less).
>>
>> The additional work exactly amounts to sending _one_ additional mail.
>
> The actual work is not the vote mail but the people doing the
> preparation and the review.

Yes, and that is _exactly_ the same work because the total work for
the two steps combined is the sum of each of the steps! ;-)

>>
>> Then, as I noted,
>>   * some releases will be done as before (same work)
>>   * some releases will be "source only" (less work)
>
> Not much, you still have to check if the source actually works and
> can be built, produces sane archives and so on.

Well, no. Source-only is source-only; sane compilation is always
implicitly checked by the "test" target, which is the minimum
required to ensure that the source is OK.

>
>>   * some releases will be two-steps, possibly performed by two
>> different people (i.e. less work for each RM)
>
> And more work in sum, not only for the RMs but also the reviewers
> (and the users who want to use the source release with maven like
> anybody these days).

I'd expect that most source releases would come with "convenience"
binaries. The main point is that some "official" release can be provided
more quickly if circumstances require it (e.g. urgent bug fix or new
feature for a user who would be satisfied with a source release).

> But I don't mind, if a project wants to do a source release only,
> that's fine with me, I just don't see the advantage.

In one word: Flexibility.

Gilles

>
> Gruss
> Bernd
>
>>
>> Of course, each release means some work has to be done; then IIUC
>> your point, the fewer releases the better. :-}
>>
>>
>
>> >  Am Tue, 30 Dec 2014 02:05:29
>> > +0100 schrieb Gilles <gi...@harfang.homelinux.org>:
>> >
>> >> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
>> >> > On 29 December 2014 at 10:36, Gilles
>> >> <gi...@harfang.homelinux.org>
>> >> > wrote:
>> >> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>> >> >>>
>> >> >>> On 12/28/14 11:46 AM, Gilles wrote:
>> >> >>>>
>> >> >>>> Hi.
>> >> >>>>
>> >> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>> >> >>>>>
>> >> >>>>> Le 28/12/2014 00:22, sebb a écrit :
>> >> >>>>>>
>> >> >>>>>> On 27 December 2014 at 22:19, Gilles
>> >> >>>>>> <gi...@harfang.homelinux.org> wrote:
>> >> >>>>>>>
>> >> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> On 24 December 2014 at 15:11, Gilles
>> >> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe 
>> wrote:
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe
>> >> >>>>>>>>>>> wrote:
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>> >> >>>>>>>>>>>>>> from release candidate 3.
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>> Tag name:
>> >> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>> >> >>>>>>>>>>>>>>   'git tag -v')
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>> Tag URL:
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> 
>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>> Is there a way to check that the source code referred to
>> >> >>>>>>>>>>>>> above was the one used to create the JAR of the ".class"
>> >> >>>>>>>>>>>>> files? [Out of curiosity, not suspicion, of course...]
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>> >> >>>>>>>>>>>> file embedded in the jar. The second-to-last entry is called
>> >> >>>>>>>>>>>> Implementation-Build. It is automatically created by
>> >> >>>>>>>>>>>> maven-jgit-buildnumber-plugin and contains the SHA1
>> >> >>>>>>>>>>>> identifier of the last commit used for the build. Here, it is
>> >> >>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>> >> >>>>>>>>>>>> it really corresponds to the expected status of the git
>> >> >>>>>>>>>>>> repository.
>> >> >>>>>>>>>>>>
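The manifest check described above can be automated. This is a minimal sketch, assuming the Implementation-Build attribute sits in the main section of the manifest and holds exactly the commit identifier, as stated in the quoted message; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper: compares a jar's Implementation-Build to a commit. */
class ManifestBuildId {

    /** Returns the Implementation-Build main attribute, or null if absent. */
    static String implementationBuild(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Implementation-Build");
    }

    /** True if the manifest records exactly the expected commit id. */
    static boolean matchesCommit(Manifest manifest, String expectedSha1) {
        String recorded = implementationBuild(manifest);
        return recorded != null
            && recorded.trim().equalsIgnoreCase(expectedSha1);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar, args[1]: commit id taken from the tag
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null if the jar has none
            boolean ok = manifest != null && matchesCommit(manifest, args[1]);
            System.out.println(ok ? "match" : "MISMATCH");
        }
    }
}
```

A match only shows what the build recorded about itself; as the next question in the thread points out, it does not prove that the ".class" files were compiled from that commit.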
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>> >> >>>>>>>>>>> the MANIFEST file be modified to be the checksum of the
>> >> >>>>>>>>>>> repository but with the .class files being substituted with
>> >> >>>>>>>>>>> those coming from another compilation?
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Modifying anything in the jar (either this entry within the
>> >> >>>>>>>>>> manifest or any class) will modify the jar signature. So as
>> >> >>>>>>>>>> long as people do check the global MD5, SHA1 or gpg signature
>> >> >>>>>>>>>> we provide with our build, they are safe to assume the
>> >> >>>>>>>>>> artifacts are Apache artifacts.
>> >> >>>>>>>>>>
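The checksum part of that verification can be sketched as follows, assuming the published value is a hexadecimal SHA-1 string such as the one found in a ".sha1" file next to the artifact. The class and method names are hypothetical. Verifying the gpg signature is a separate step that this sketch does not cover.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: checks data against a published SHA-1 checksum. */
class ChecksumCheck {

    /** Returns the lowercase hexadecimal SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** True if the data hashes to the published checksum (case-insensitive). */
    static boolean matches(byte[] data, String publishedSha1) {
        return sha1Hex(data).equalsIgnoreCase(publishedSha1.trim());
    }

    public static void main(String[] args) throws Exception {
        // args[0]: artifact path, args[1]: checksum from the .sha1 file
        byte[] data = java.nio.file.Files.readAllBytes(
            java.nio.file.Paths.get(args[0]));
        System.out.println(
            matches(data, args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

A matching checksum shows that the file is the one the checksum was computed for; it says nothing about who computed it, which is why the signature matters as well.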
>> >> >>>>>>>>>> This is not different from how releases are done with
>> >> >>>>>>>>>> subversion as the source code control system, or even in C or
>> >> >>>>>>>>>> C++ as the language. At one time, the release manager does
>> >> >>>>>>>>>> perform a compilation and the fellow reviewers check the
>> >> >>>>>>>>>> result. There is no foolproof process here, as always when
>> >> >>>>>>>>>> security is involved. Even using an automated build and
>> >> >>>>>>>>>> automatic signing on an Apache server would involve trust
>> >> >>>>>>>>>> (i.e. one should assume that the server has not been tampered
>> >> >>>>>>>>>> with, that the build process really does what it is expected
>> >> >>>>>>>>>> to do, that the artifacts put to review are really the ones
>> >> >>>>>>>>>> created by the automatic process ...).
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Another point is that what we officially release is the
>> >> >>>>>>>>>> source, which can be reviewed by external users. The binary
>> >> >>>>>>>>>> parts are merely a convenience.
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> That's an interesting point to come back to since it looks
>> >> >>>>>>>>> like the most time-consuming part of a release is not related
>> >> >>>>>>>>> to the sources!
>> >> >>>>>>>>>
>> >> >>>>>>>>> Isn't it conceivable that a release could just be a commit
>> >> >>>>>>>>> identifier and a checksum of the repository?
>> >> >>>>>>>>>
>> >> >>>>>>>>> If the binaries are just a convenience, why put so much
>> >> >>>>>>>>> effort into them?
>> >> >>>>>>>>> As a convenience, the artefacts could be produced after the
>> >> >>>>>>>>> release, accompanied with all the "caveat" notes which you
>> >> >>>>>>>>> mentioned.
>> >> >>>>>>>>>
>> >> >>>>>>>>> That would certainly increase the release rate.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Binary releases still need to be reviewed to ensure that the
>> >> >>>>>>>> correct NOTICE and LICENSE (N & L) files are present, and that
>> >> >>>>>>>> the archives don't contain material with disallowed licenses.
>> >> >>>>>>>>
>> >> >>>>>>>> It's not unknown for automated build processes to include
>> >> >>>>>>>> files that should not be present.
>> >> >>>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> I fail to see the difference of principle between the
>> >> >>>>>>> "release" context
>> >> >>>>>>> and, say, the daily snapshot context.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Snapshots are not (and should not be) promoted to the general
>> >> >>>>>> public as releases of the ASF.
>> >> >>>>>>
>> >> >>>>>>> What I mean is that there seems to be a contradiction between
>> >> >>>>>>> saying that a "release" is only about _source_ and the
>> >> >>>>>>> obligation to check _binaries_.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> There is no contradiction here.
>> >> >>>>>> The ASF releases source; it is required in a release.
>> >> >>>>>> Binaries are optional.
>> >> >>>>>> That does not mean that the ASF mirror system can be used to
>> >> >>>>>> distribute arbitrary binaries.
>> >> >>>>>>
>> >> >>>>>>> It can occur that disallowed material is, at some point 
>> in
>> >> >>>>>>> time, part of
>> >> >>>>>>> the repository and/or the snapshot binaries.
>> >> >>>>>>> However, what is forbidden is... forbidden, at all times.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> As with most things, this is not a strict dichotomy.
>> >> >>>>>>
>> >> >>>>>>> If it is indeed a problem to distribute forbidden 
>> material,
>> >> >>>>>>> shouldn't
>> >> >>>>>>> this be corrected in the repository? [That's indeed what
>> >> >>>>>>> you did with
>> >> >>>>>>> the blocking of the release.]
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> If the repo is discovered to contain disallowed material, 
>> it
>> >> >>>>>> needs to
>> >> >>>>>> be removed.
>> >> >>>>>>
>> >> >>>>>>> Then again, once the repository is "clean", it can be
>> >> >>>>>>> tagged and that
>> >> >>>>>>> tagged _source_ is the release.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Not quite.
>> >> >>>>>>
>> >> >>>>>> A release is a source archive that is voted on and
>> >> distributed
>> >> >>>>>> via the
>> >> >>>>>> ASF mirror system.
>> >> >>>>>> The contents must agree with the source tag, but the 
>> source
>> >> tag
>> >> >>>>>> is not
>> >> >>>>>> the release.
>> >> >>>>>>
>> >> >>>>>>> Non-compliant binaries would thus only be the result of a
>> >> >>>>>>> "mistake"
>> >> >>>>>>> (if the build system is flawed, it's another problem,
>> >> >>>>>>> unrelated to
>> >> >>>>>>> the released contents, which is _source_) to be corrected
>> >> per
>> >> >>>>>>> se.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Not so. There are other failure modes.
>> >> >>>>>>
>> >> >>>>>> An automated build obviously reduces the chances of mistakes,
>> >> >>>>>> but it can still create an archive containing files that should
>> >> >>>>>> not be there [or indeed, omit files that should be present].
>> >> >>>>>> For example, the workspace contains spurious files which are
>> >> >>>>>> implicitly included by the assembly instructions.
>> >> >>>>>> Or the build process creates spurious files that are
>> >> >>>>>> incorrectly added to the archive.
>> >> >>>>>> Or the build incorrectly includes jars that are supposed to be
>> >> >>>>>> provided by the end user, etc.
>> >> >>>>>>
>> >> >>>>>> I have seen all the above in RC votes.
>> >> >>>>>> There are probably other failure modes.
>> >> >>>>>>
>> >> >>>>>>> My proposition is that it's an independent step: once the
>> >> >>>>>>> build system is adjusted to the expectations, "correct"
>> >> >>>>>>> binaries can be
>> >> >>>>>>> generated from the same tagged release.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> It does not matter when the binary is built.
>> >> >>>>>> If it is distributed by the PMC as a formal release, it 
>> must
>> >> >>>>>> not contain any surprises, e.g. it must be licensed under
>> >> >>>>>> the AL.
>> >> >>>>>>
>> >> >>>>>> It is therefore vital that the contents are as expected 
>> from
>> >> >>>>>> the build.
>> >> >>>>>>
>> >> >>>>>> Note also that a formal release becomes an act of the PMC 
>> by
>> >> >>>>>> the voting process.
>> >> >>>>>> The ASF can then assume responsibility for any legal 
>> issues
>> >> >>>>>> that may arise.
>> >> >>>>>> Otherwise it is entirely the personal responsibility of 
>> the
>> >> >>>>>> person who
>> >> >>>>>> releases it.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> I think the last two points are really important: binaries
>> >> must
>> >> >>>>> be
>> >> >>>>> checked and the foundation provides a legal protection for
>> >> >>>>> the project
>> >> >>>>> if something weird occurs.
>> >> >>>>>
>> >> >>>>> I also think another point is important: many if not most users
>> >> >>>>> do really expect binaries and not source. From our internal
>> >> >>>>> Apache point of view, these are a by-product. For many others it
>> >> >>>>> is the important thing. It is mostly true in maven land as
>> >> >>>>> dependencies are automatically retrieved in binary form, not
>> >> >>>>> source form. So the maven central repository as a distribution
>> >> >>>>> system is important.
>> >> >>>>>
>> >> >>>>> Even if for some security reason it sounds at first thought
>> >> >>>>> logical to
>> >> >>>>> rely on source only and compile oneself, in an industrial
>> >> >>>>> context project teams do not have enough time to do it for
>> >> >>>>> all their dependencies, so they use binaries provided by
>> >> >>>>> trusted third parties. A
>> >> >>>>> long time ago, I compiled a lot of free software tools for
>> >> >>>>> the department I worked for at that time. I do not do this
>> >> anymore,
>> >> >>>>> and
>> >> >>>>> trust the binaries provided by the packaging team for a
>> >> >>>>> distribution
>> >> >>>>> (typically Debian). They do rely on source and compile
>> >> >>>>> themselves. Hey,
>> >> >>>>> I even think Emmanuel here belongs to the Debian java
>> >> >>>>> team ;-)
>> >> I
>> >> >>>>> guess
>> >> >>>>> such teams that do rely on source are rather the exception
>> >> than
>> >> >>>>> the
>> >> >>>>> rule. The other examples I can think of are packaging 
>> teams,
>> >> >>>>> development teams that need bleeding edge (and will also
>> >> >>>>> directly depend on the repository, not even the release),
>> >> >>>>> projects that need to
>> >> >>>>> introduce their own patches and people who have critical
>> >> >>>>> needs (for
>> >> >>>>> example when safety of people is concerned or when they 
>> need
>> >> >>>>> full control for legal or contractual reasons). Many other
>> >> >>>>> people download binaries directly and would simply not consider
>> >> >>>>> using a project if it is not readily available: they don't have
>> >> >>>>> time for this and don't want to learn how to build tens or
>> >> >>>>> hundreds of different projects they simply use.
>> >> >>>>>
>> >> >>>>
>> >> >>>> I do not disagree with anything said on this thread. [In
>> >> >>>> particular, I did not at all imply that any one committer
>> >> >>>> could take responsibility for releasing unchecked items.]
>> >> >>>>
>> >> >>>> I'm simply suggesting that what is called the release
>> >> >>>> process/management could be made simpler (and _consequently_
>> >> >>>> could lead to more regularly releasing the CM code), by
>> >> >>>> separating the concerns.
>> >> >>>> The concerns are
>> >> >>>>  1. "code" (the contents), and
>> >> >>>>  2. "artefacts" (the result of the build system acting on the
>> >> >>>>     "code").
>> >> >>>>
>> >> >>>> Checking of one of these is largely independent from checking
>> >> >>>> the other.
>> >> >>>
>> >> >>>
>> >> >>> Unfortunately, not really.  One principle that we have (maybe
>> >> >>> not crystal clear in the release doco) is that when we do
>> >> >>> distribute binaries, they should really be "convenience
>> >> >>> binaries" which means that everything needed to create them
>> >> >>> is in the source or its documented dependencies.  What that
>> >> >>> means is that what we tag as the source release needs to be
>> >> >>> able to generate any binaries that we subsequently release.
>> >> >>> The only way to really test that is to generate the binaries
>> >> >>> and inspect them as part of verifying the release.
>> >> >>
>> >> >>
>> >> >> Only way?  That's certainly not obvious to me: Since a
>> >> >> tag/branch uniquely identifies a set of files, that is, the
>> >> >> "source release [that is] able to generate any binaries that we
>> >> >> subsequently release", if a RM can do it at (source) release
>> >> >> time, he (or someone else!) can do it later, too (by running
>> >> >> the build from a clone of the repository in its tagged state).
>> >> >>
>> >> >>> As others have pointed out, anything we release has to be
>> >> >>> verified and voted on.  As RM and reviewer, I think it is
>> >> >>> actually easier to roll and verify source and binaries
>> >> >>> together.
>> >> >>
>> >> >
>> >> > +1
>> >> >
>> >> >>
>> >> >> It's precisely my main point.
>> >> >> I won't dispute that you can prefer doing both (and nobody
>> >> >> would forbid a RM to do just that) but the point is about the
>> >> >> possibility to release source-only code (as the first step of
>> >> >> a two-step procedure which I described earlier).
>> >> >> [IMHO, the two-step one seems easier (both for the RM and the
>> >> >> reviewer), though mileage does vary.]
>> >> >
>> >> > What is easier?
>> >> > It seems to me there will be at least one other step in your
>> >> > proposed process, i.e. a second VOTE e-mail
>> >>
>> >> Yes, that's obviously what I meant:
>> >> Two steps == two votes
>> >>
>> >> [But: source releases need not necessarily be accompanied with
>> >> "binaries", which, I imagine, could lead to official releases
>> >> occurring more often (due to the reduced number of checks).]
>> >>
>> >> > These will both contain most of the same information.
>> >>
>> >> No.
>> >> The first step is about the source, i.e. the code which humans
>> >> create.
>> >> The second step is about the files which a build system creates.
>> >>
>> >> As I indicated previously, the first vote will be about a set of
>> >> reviewers being satisfied with the state of the source code, while
>> >> the second vote will be about another set of reviewers being
>> >> satisfied with the results of the build system ("no glitch", as
>> >> you described in an earlier message).
>> >>
>> >> > Is the intention to announce the source release separately from
>> >> > the binary release?
>> >> > If so, there will need to be 2 announce mails, and 2 updates to
>> >> > the download page.
>> >>
>> >> Is there a problem with that?
>> >> There are actually several possible cases (depending on the will
>> >> of the RM):
>> >>   * one-step release (only source code)
>> >>   * two-steps (source, then binaries based on that source)
>> >>   * combined (as is done up to now)
>> >>   * binaries (based on any previously released source)
>> >>
>> >> >> In short, is it forbidden (by the official/legal rules of ASF)
>> >> >> to proceed as I propose?
>> >> >
>> >> > Dunno, depends on what exactly you are proposing.
>> >>
>> >> Cf. above (and previous mails).
>> >>
>> >> In practice the release could (IIUC) be like the link provided
>> >> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
>> >> repository).
>> >>
>> >>
>> >> >> Is it impossible technically?
>> >> >
>> >> > Currently the Maven build process creates:
>> >> > - Maven source and binary jars
>> >> > - ASF source and binary bundles
>> >>
>> >> AFAIU, the JARs (source and binary) are "binaries", the binary
>> >> bundles are "binaries". Only the ASF source is "source".
>> >>
>> >> > It's not clear to me what exactly you propose to release in
>> >> > stage one,
>> >>
>> >> The ASF source (e.g. in the form of a tarball, or the appropriate
>> >> "git clone" command).
>> >>
>> >> > but there will need to be some changes to the process in order
>> >> > to release just the ASF source.
>> >>
>> >> I don't see which.
>> >> A "source RM" would just stop the process after
>> >> resolving/postponing the pending issues, and checking the various
>> >> reports about the source code. [Then create the tag, and request
>> >> a vote.]
>> >>
>> >> A "binary RM" would take over from that point (a tagged
>> >> repository), i.e. create all the binaries, sign them, etc.
>> >>
>> >> > There is no point releasing the Maven source jars separately
>> >> > from the binary jars; they are not complete as they only
>> >> > contain java files for use with IDEs.
>> >>
>> >> I don't understand that.
>> >> In principle, a JAR with the Java sources is indeed the necessary
>> >> and sufficient condition for users to create the executable
>> >> bytecode, with whatever build system they wish.
>> >> But I agree that it's not useful to not release all the files
>> >> needed to easily run maven. [And, for convenience, a source
>> >> release would be accompanied with instructions on how to build a
>> >> JAR of the compiled classes, using maven.]
>> >>
>> >> > But in any case, AFAIK it is very tricky to release new files
>> >> > into an existing Maven folder, and it may cause problems for
>> >> > end users.
>> >>
>> >> I don't understand what you mean by "release new files into an
>> >> existing Maven folder"...
>> >>
>> >> Gilles
>> >>
>> >> >>
>> >> >>
>> >> >>> Phil
>> >> >>>
>> >> >>>
>> >> >>>> [The more so that, as you said, no fool-proof link between
>> >> >>>> the two can be ensured: From a security POV, checking the
>> >> >>>> former requires a code review, while using the latter
>> >> >>>> requires trust in the build system.]
>> >> >>>>
>> >> >>>> Thus we could release the "code", after checking and voting
>> >> >>>> on the concerned elements (i.e. the repository state
>> >> >>>> corresponding to a specific tag + the web site).
>> >> >>>>
>> >> >>>> Then we could release the "binaries", as a convenience, after
>> >> >>>> checking and voting on the concerned elements (i.e. the files
>> >> >>>> about to be distributed).
>> >> >>>>
>> >> >>>> I think that it's an added flexibility that would, for
>> >> >>>> example, allow the tagging of the repository without
>> >> >>>> necessarily releasing binaries (i.e. not involving that part
>> >> >>>> of the work); and to release binaries (say, at regular
>> >> >>>> intervals) based on the latest tagged code (i.e. not
>> >> >>>> involving the work about solving/evaluating/postponing
>> >> >>>> issues).
>> >> >>>>
>> >> >>>> [I completely admit that, at first, it might look a little
>> >> >>>> more confusing for the plain user, but (IIUC) it would be a
>> >> >>>> better representation of the reality covered by stating that
>> >> >>>> the ASF releases source code.]
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 03:22:32 +0000, sebb wrote:
> On 30 December 2014 at 03:05, Gilles <gi...@harfang.homelinux.org> 
> wrote:
>> On Tue, 30 Dec 2014 02:12:51 +0000, sebb wrote:
>>>
>>> On 30 December 2014 at 02:06, Gilles <gi...@harfang.homelinux.org> 
>>> wrote:
>>>>
>>>> On Tue, 30 Dec 2014 01:48:20 +0000, sebb wrote:
>>>>>
>>>>>
>>>>> On 30 December 2014 at 01:36, Bernd Eckenfels 
>>>>> <ec...@zusammenkunft.net>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> On Tue, 30 Dec 2014 02:29:38 +0100,
>>>>>> Gilles <gi...@harfang.homelinux.org> wrote:
>>>>>>
>>>>>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>>>>> > That thread gets deep. :)
>>>>>>> >
>>>>>>> > I just wanted to comment on "releasing only source is faster
>>>>>>> > because of less checks". I disagree with that, most release
>>>>>>> > delay/time is due to preparation work. Failed (binary) checks
>>>>>>> > are typically for a reason which would also be present in
>>>>>>> > the source (especially the POM), so it does not really reduce
>>>>>>> > the number of rework.
>>>>>>>
>>>>>>> RM is a streamlined procedure: so, if you do (say) 10 steps
>>>>>>> rather than 15, it will objectively take less time, and this is
>>>>>>> compounded by the additional tests which should (ideally) be
>>>>>>> performed by the reviewers. [Thus delaying the release.]
>>>>>>
>>>>>>
>>>>>>
>>>>>> The problem is not the small additional time for the last 5
>>>>>> steps but the large time for redoing all steps (on veto).
>>>>>>
>>>>>>
>>>>>>> > (At least not in most cases, so two votes will actually make
>>>>>>> > us more work not less).
>>>>>>>
>>>>>>> The additional work exactly amounts to sending _one_ additional
>>>>>>> mail.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The actual work is not the vote mail but the people doing the
>>>>>> preparation and the review.
>>>>>>
>>>>>>>
>>>>>>> Then, as I noted,
>>>>>>>   * some releases will be done as before (same work)
>>>>>>>   * some releases will be "source only" (less work)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Not much, you still have to check if the source actually works
>>>>>> and can be built, produces sane archives and so on.
>>>>>>
>>>>>>>   * some releases will be two-steps, possibly performed by two
>>>>>>> different people (i.e. less work for each RM)
>>>>>>
>>>>>>
>>>>>>
>>>>>> And more work in sum, not only for the RMs but also the
>>>>>> reviewers. (and the users which want to use the source release
>>>>>> with maven like anybody these days)
>>>>>>
>>>>>> But I don't mind, if a project wants to do a source release
>>>>>> only, that's fine with me, I just don't see the advantage.
>>>>>
>>>>>
>>>>>
>>>>> How many end users just want a source release anyway?
>>>>>
>>>>> I would expect most users to use the Maven jars, some will use
>>>>> the ASF binaries, and a few will use the ASF source (AIUI Linux
>>>>> distros often build from source).
>>>>
>>>>
>>>>
>>>> So, you answered your own question.
>>>>
>>>>>
>>>>> Even if only the source is released, it's still necessary for
>>>>> the RM and reviewers to build and test it.
>>>>
>>>>
>>>>
>>>> Never said otherwise.
>>>> [Testing the sources is one git command and one maven command.
>>>
>>>
>>> Not so.
>>>
>>> The source archive has to be downloaded, and its sigs and hashes
>>> checked.
>>> It also has to be compared against the SCM tag, and the N&L files
>>> checked.
>>
>>
>> (1)
>> download == git clone tag_url
>> --> No download of a signed archive.
>
> But the signed archive is what is released.
> The ASF releases open source which is distributed from the ASF
> mirror system.
>
> So the signed archive is a fundamental part of the RC vote.

So it's either that or point (2).
[Both check the signature of the source code.]

>> (2)
>> git tag -v tag_name
>>
>
> No idea what that does.

Cf. previous paragraph.

>
>> (3)
>> build == maven test site
>>
>> [Sorry: that was 3 commands.]
>>
>> Then maybe people in the know can examine the license issues, like
>> you did.
>> But I would hardly count on every reviewer doing it. [Besides, it
>> should have been done at the time the code was introduced. And, as
>> I said in the other thread, we might seriously need to consider
>> requesting an actual legal review if the matter is so sensitive:
>> Submit to a lawyer when the contents is changed; no need to check
>> when the contents is left untouched.]
>
> The contents is potentially changed with every commit.
> Yes, the N&L files should be kept up to date as each commit is added.
>
> However, this is not always done, so it's important to check them
> before release.

I contend that there should be a big fat warning that those files
should not be modified lightly. And if they are, an issue _must_ be
opened on the bug-tracking system with the rationale for the new
contents, or a request that knowledgeable people examine the
situation.


>>>> Testing
>>>> the binaries requires downloading each of them and checking the
>>>> signatures and/or checksums, each a separate command.]
>>>
>>>
>>> The files can be downloaded as a single bunch, especially if one
>>> uses the SVN dist/dev staging area.
>>>
>>> It's easy enough to write shell scripts to check all hashes and
>>> sigs in a single directory.
>>
>>
>> When I asked at this thread's start (under subject "Git question"),
>> the answer was that this does not strictly prove the link between
>> source code and binaries.
>
>> Hence the attempt to segregate what can be proved from what cannot.
>> Back to square one.
>
> Provable provenance is only part of what the vote should be about.
>
> It's not possible in general to prove that a binary is derived from
> a source.
> However, it is possible to document the source tag and release
> artifacts in the vote such that a release artifact downloaded from
> the ASF mirror system can be proved to have been voted on.

How is this not true with a source-only release?


Gilles




Re: [Math] What's in a release

Posted by sebb <se...@gmail.com>.
On 30 December 2014 at 03:05, Gilles <gi...@harfang.homelinux.org> wrote:
> On Tue, 30 Dec 2014 02:12:51 +0000, sebb wrote:
>>
>> On 30 December 2014 at 02:06, Gilles <gi...@harfang.homelinux.org> wrote:
>>>
>>> On Tue, 30 Dec 2014 01:48:20 +0000, sebb wrote:
>>>>
>>>>
>>>> On 30 December 2014 at 01:36, Bernd Eckenfels <ec...@zusammenkunft.net>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> On Tue, 30 Dec 2014 02:29:38 +0100,
>>>>> Gilles <gi...@harfang.homelinux.org> wrote:
>>>>>
>>>>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>>>> > That thread gets deep. :)
>>>>>> >
>>>>>> > I just wanted to comment on "releasing only
>>>>>> > source is faster because of less checks". I disagree with that, most
>>>>>> > release delay/time is due to preparation work. Failed (binary)
>>>>>> > checks are typically for a reason which would also be present in
>>>>>> > the source (especially the POM), so it does not really reduce the
>>>>>> > number of rework.
>>>>>>
>>>>>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>>>>>> than 15, it will objectively take less time, and this is compounded
>>>>>> by the additional tests which should (ideally) be performed by the
>>>>>> reviewers. [Thus delaying the release.]
>>>>>
>>>>>
>>>>>
>>>>> The problem is not the small additional time for the last 5 steps but
>>>>> the large time for redoing all steps (on veto).
>>>>>
>>>>>
>>>>>> > (At least not in most cases, so two votes will actually make us
>>>>>> > more work not less).
>>>>>>
>>>>>> The additional work exactly amounts to sending _one_ additional mail.
>>>>>
>>>>>
>>>>>
>>>>> The actual work is not the vote mail but the people doing the
>>>>> preparation and the review.
>>>>>
>>>>>>
>>>>>> Then, as I noted,
>>>>>>   * some releases will be done as before (same work)
>>>>>>   * some releases will be "source only" (less work)
>>>>>
>>>>>
>>>>>
>>>>> Not much, you still have to check if the source actually works and can
>>>>> be built, produces sane archives and so on.
>>>>>
>>>>>>   * some releases will be two-steps, possibly performed by two
>>>>>> different people (i.e. less work for each RM)
>>>>>
>>>>>
>>>>>
>>>>> And more work in sum, not only for the RMs but also the reviewers. (and
>>>>> the users which want to use the source release with maven like anybody
>>>>> these days)
>>>>>
>>>>> But I don't mind, if a project wants to do a source release only, that's
>>>>> fine with me, I just don't see the advantage.
>>>>
>>>>
>>>>
>>>> How many end users just want a source release anyway?
>>>>
>>>> I would expect most users to use the Maven jars, some will use the ASF
>>>> binaries, and a few will use the ASF source (AIUI Linux distros often
>>>> build from source).
>>>
>>>
>>>
>>> So, you answered your own question.
>>>
>>>>
>>>> Even if only the source is released, it's still necessary for the RM
>>>> and reviewers to build and test it.
>>>
>>>
>>>
>>> Never said otherwise.
>>> [Testing the sources is one git command and one maven command.
>>
>>
>> Not so.
>>
>> The source archive has to be downloaded, and its sigs and hashes checked.
>> It also has to be compared against the SCM tag, and the N&L files checked.
>
>
> (1)
> download == git clone tag_url
> --> No download of a signed archive.

But the signed archive is what is released.
The ASF releases open source which is distributed from the ASF mirror system.

So the signed archive is a fundamental part of the RC vote.

> (2)
> git tag -v tag_name
>

No idea what that does.

> (3)
> build == maven test site
>
> [Sorry: that was 3 commands.]
>
> Then maybe people in the know can examine the license issues, like you did.
> But I would hardly count on every reviewer doing it. [Besides, it should
> have been done at the time the code was introduced. And, as I said in the
> other thread, we might seriously need to consider requesting an actual legal
> review if the matter is so sensitive: Submit to a lawyer when the contents
> is changed; no need to check when the contents is left untouched.]

The contents is potentially changed with every commit.
Yes, the N&L files should be kept up to date as each commit is added.

However, this is not always done, so it's important to check them
before release.

>>> Testing
>>> the binaries requires downloading each of them and checking the signatures
>>> and/or checksums, each a separate command.]
>>
>>
>> The files can be downloaded as a single bunch, especially if one uses
>> the SVN dist/dev staging area.
>>
>> It's easy enough to write shell scripts to check all hashes and sigs
>> in a single directory.
>
>
> When I asked at this thread's start (under subject "Git question"),
> the answer was that this does not strictly prove the link between source
> code and binaries.

> Hence the attempt to segregate what can be proved from what cannot.
> Back to square one.

Provable provenance is only part of what the vote should be about.

It's not possible in general to prove that a binary is derived from a source.
However, it is possible to document the source tag and release
artifacts in the vote such that a release artifact downloaded from the
ASF mirror system can be proved to have been voted on.
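
As a rough illustration of that last point (a sketch only, not an ASF
tool): if the vote mail records a digest for each artifact, anyone can
later recompute the digest of a downloaded file and compare the two.
The `voted_digests` mapping and both function names below are
hypothetical, and SHA-1 is used only because it is one of the hashes
mentioned in this thread. A matching digest shows that the file is the
one listed in the vote; checking the GPG signature remains a separate
step.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def was_voted_on(path, voted_digests):
    """True if the file's digest matches the one recorded for its name.

    `voted_digests` is a hypothetical mapping {file name: hex digest},
    transcribed by hand from the vote mail.
    """
    expected = voted_digests.get(Path(path).name)
    return expected is not None and sha1_of(path) == expected.lower()
```

A file whose name is absent from the mapping is reported as not voted
on, which seems the safer default.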

>
> Gilles
>
>
>


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 02:12:51 +0000, sebb wrote:
> On 30 December 2014 at 02:06, Gilles <gi...@harfang.homelinux.org> 
> wrote:
>> On Tue, 30 Dec 2014 01:48:20 +0000, sebb wrote:
>>>
>>> On 30 December 2014 at 01:36, Bernd Eckenfels 
>>> <ec...@zusammenkunft.net>
>>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> On Tue, 30 Dec 2014 02:29:38 +0100,
>>>> Gilles <gi...@harfang.homelinux.org> wrote:
>>>>
>>>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>>> > That thread gets deep. :)
>>>>> >
>>>>> > I just wanted to comment on "releasing only source is faster
>>>>> > because of less checks". I disagree with that, most release
>>>>> > delay/time is due to preparation work. Failed (binary) checks
>>>>> > are typically for a reason which would also be present in
>>>>> > the source (especially the POM), so it does not really reduce
>>>>> > the number of rework.
>>>>>
>>>>> RM is a streamlined procedure: so, if you do (say) 10 steps
>>>>> rather than 15, it will objectively take less time, and this is
>>>>> compounded by the additional tests which should (ideally) be
>>>>> performed by the reviewers. [Thus delaying the release.]
>>>>
>>>>
>>>> The problem is not the small additional time for the last 5 steps
>>>> but the large time for redoing all steps (on veto).
>>>>
>>>>
>>>>> > (At least not in most cases, so two votes will actually make us
>>>>> > more work not less).
>>>>>
>>>>> The additional work exactly amounts to sending _one_ additional
>>>>> mail.
>>>>
>>>>
>>>> The actual work is not the vote mail but the people doing the
>>>> preparation and the review.
>>>>
>>>>>
>>>>> Then, as I noted,
>>>>>   * some releases will be done as before (same work)
>>>>>   * some releases will be "source only" (less work)
>>>>
>>>>
>>>> Not much, you still have to check if the source actually works and
>>>> can be built, produces sane archives and so on.
>>>>
>>>>>   * some releases will be two-steps, possibly performed by two
>>>>> different people (i.e. less work for each RM)
>>>>
>>>>
>>>> And more work in sum, not only for the RMs but also the reviewers.
>>>> (and the users which want to use the source release with maven like
>>>> anybody these days)
>>>>
>>>> But I don't mind, if a project wants to do a source release only,
>>>> that's fine with me, I just don't see the advantage.
>>>
>>>
>>> How many end users just want a source release anyway?
>>>
>>> I would expect most users to use the Maven jars, some will use the
>>> ASF binaries, and a few will use the ASF source (AIUI Linux distros
>>> often build from source).
>>
>>
>> So, you answered your own question.
>>
>>>
>>> Even if only the source is released, it's still necessary for the
>>> RM and reviewers to build and test it.
>>
>>
>> Never said otherwise.
>> [Testing the sources is one git command and one maven command.
>
> Not so.
>
> The source archive has to be downloaded, and its sigs and hashes
> checked.
> It also has to be compared against the SCM tag, and the N&L files
> checked.

(1)
download == git clone tag_url
--> No download of a signed archive.

(2)
git tag -v tag_name

(3)
build == maven test site

[Sorry: that was 3 commands.]
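
For concreteness, the three steps could be scripted roughly as follows.
This is only a sketch under stated assumptions: the function names and
the default working directory are made up for illustration, the
repository URL is left to the caller, `mvn` is assumed to be the name of
the Maven executable, and `git tag -v` can only succeed if the signer's
public key is already in the local GPG keyring.

```python
import subprocess


def review_commands(repo_url, tag, workdir="commons-math-review"):
    """Build the argv lists for the three review steps, in order."""
    return [
        # (1) fetch the repository, checked out at the tag
        ["git", "clone", "--branch", tag, repo_url, workdir],
        # (2) verify the GPG signature of the tag
        ["git", "-C", workdir, "tag", "-v", tag],
        # (3) run the unit tests and generate the site reports
        ["mvn", "-f", workdir + "/pom.xml", "test", "site"],
    ]


def run_review(repo_url, tag, workdir="commons-math-review"):
    """Run each step; raise CalledProcessError at the first failure."""
    for argv in review_commands(repo_url, tag, workdir):
        subprocess.run(argv, check=True)
```

Keeping the command list separate from its execution makes it possible
to print and inspect the commands before running anything.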

Then maybe people in the know can examine the license issues, like you
did.
But I would hardly count on every reviewer doing it. [Besides, it should
have been done at the time the code was introduced. And, as I said in
the other thread, we might seriously need to consider requesting an
actual legal review if the matter is so sensitive: Submit to a lawyer
when the contents is changed; no need to check when the contents is
left untouched.]

>> Testing
>> the binaries requires downloading each of them and checking the
>> signatures and/or checksums, each a separate command.]
>
> The files can be downloaded as a single bunch, especially if one uses
> the SVN dist/dev staging area.
>
> It's easy enough to write shell scripts to check all hashes and sigs
> in a single directory.

When I asked at this thread's start (under subject "Git question"),
the answer was that this does not strictly prove the link between
source code and binaries.
Hence the attempt to segregate what can be proved from what cannot.
Back to square one.


Gilles




Re: [Math] What's in a release

Posted by sebb <se...@gmail.com>.
On 30 December 2014 at 02:06, Gilles <gi...@harfang.homelinux.org> wrote:
> On Tue, 30 Dec 2014 01:48:20 +0000, sebb wrote:
>>
>> On 30 December 2014 at 01:36, Bernd Eckenfels <ec...@zusammenkunft.net>
>> wrote:
>>>
>>> Hello,
>>>
>>> On Tue, 30 Dec 2014 02:29:38 +0100,
>>> Gilles <gi...@harfang.homelinux.org> wrote:
>>>
>>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>>> > That thread gets deep. :)
>>>> >
>>>> > I just wanted to comment on "releasing only
>>>> > source is faster because of less checks". I disagree with that, most
>>>> > release delay/time is due to preparation work. Failed (binary)
>>>> > checks are typically for a reason which would also be present in
>>>> > the source (especially the POM), so it does not really reduce the
>>>> > number of rework.
>>>>
>>>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>>>> than 15, it will objectively take less time, and this is compounded
>>>> by the additional tests which should (ideally) be performed by the
>>>> reviewers. [Thus delaying the release.]
>>>
>>>
>>> The problem is not the small additional time for the last 5 steps but
>>> the large time for redoing all steps (on veto).
>>>
>>>
>>>> > (At least not in most cases, so two votes will actually make us
>>>> > more work not less).
>>>>
>>>> The additional work exactly amounts to sending _one_ additional mail.
>>>
>>>
>>> The actual work is not the vote mail but the people doing the
>>> preparation and the review.
>>>
>>>>
>>>> Then, as I noted,
>>>>   * some releases will be done as before (same work)
>>>>   * some releases will be "source only" (less work)
>>>
>>>
>>> Not much, you still have to check if the source actually works and can
>>> be built, produces sane archives and so on.
>>>
>>>>   * some releases will be two-steps, possibly performed by two
>>>> different people (i.e. less work for each RM)
>>>
>>>
>>> And more work in sum, not only for the RMs but also the reviewers. (and
>>> the users which want to use the source release with maven like anybody
>>> these days)
>>>
>>> But I don't mind, if a project wants to do a source release only, that's
>>> fine with me, I just don't see the advantage.
>>
>>
>> How many end users just want a source release anyway?
>>
>> I would expect most users to use the Maven jars, some will use the ASF
>> binaries, and a few will use the ASF source (AIUI Linux distros often
>> build from source).
>
>
> So, you answered your own question.
>
>>
>> Even if only the source is released, it's still necessary for the RM
>> and reviewers to build and test it.
>
>
> Never said otherwise.
> [Testing the sources is one git command and one maven command.

Not so.

The source archive has to be downloaded, and its sigs and hashes checked.
It also has to be compared against the SCM tag, and the NOTICE and
LICENSE (N&L) files checked.

> Testing
> the binaries requires downloading each of them and checking the signatures
> and/or checksums, each a separate command.]

The files can be downloaded as a single bunch, especially if one uses
the SVN dist/dev staging area.

It's easy enough to write shell scripts to check all hashes and sigs
in a single directory.
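
A minimal sketch of such a check, written here in Python rather than
shell. It assumes that each artifact sits next to sidecar files named
`<artifact>.md5` and/or `<artifact>.sha1` whose first
whitespace-separated token is the hex digest; that layout and the
function name are assumptions for illustration. It covers the hashes
only; the signatures would still need `gpg --verify` on each `.asc`
file.

```python
import hashlib
from pathlib import Path

# Sidecar file suffix -> hashlib algorithm name
ALGORITHMS = {".md5": "md5", ".sha1": "sha1"}


def check_hashes(directory):
    """Return {sidecar file name: True/False} for every .md5/.sha1 file."""
    results = {}
    for sidecar in sorted(Path(directory).iterdir()):
        algorithm = ALGORITHMS.get(sidecar.suffix)
        if algorithm is None:
            continue  # not a checksum file
        artifact = sidecar.with_suffix("")  # strip ".md5" / ".sha1"
        if not artifact.is_file():
            results[sidecar.name] = False  # checksum without its artifact
            continue
        tokens = sidecar.read_text().split()
        expected = tokens[0].lower() if tokens else ""
        actual = hashlib.new(algorithm, artifact.read_bytes()).hexdigest()
        results[sidecar.name] = (actual == expected)
    return results
```

Any `False` value in the result points at a checksum file that does not
match, or that has no artifact next to it.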

>
> Gilles
>
>
>>
>>> Regards
>>> Bernd
>>>
>>>>
>>>> Of course, each release means some work has to be done; then IIUC your
>>>> point, the fewer releases the better. :-}
>>>>
>>>>
>>>
>>>> > On Tue, 30 Dec 2014 02:05:29 +0100,
>>>> > Gilles <gi...@harfang.homelinux.org> wrote:
>>>> >
>>>> >> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
>>>> >> > On 29 December 2014 at 10:36, Gilles
>>>> >> > <gi...@harfang.homelinux.org> wrote:
>>>> >> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>>>> >> >>>
>>>> >> >>> On 12/28/14 11:46 AM, Gilles wrote:
>>>> >> >>>>
>>>> >> >>>> Hi.
>>>> >> >>>>
>>>> >> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>>> >> >>>>>
>>>> >> >>>>> On 28/12/2014 00:22, sebb wrote:
>>>> >> >>>>>>
>>>> >> >>>>>> On 27 December 2014 at 22:19, Gilles
>>>> >> >>>>>> <gi...@harfang.homelinux.org> wrote:
>>>> >> >>>>>>>
>>>> >> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>> >> >>>>>>>>
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> On 24 December 2014 at 15:11, Gilles
>>>> >> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
>>>> >> >>>>>>>>>>>
>>>> >> >>>>>>>>>>>
>>>> >> >>>>>>>>>>>
>>>> >> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe
>>>> >> >>>>>>>>>>> wrote:
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>>>> >> >>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math
>>>> >> >>>>>>>>>>>>>> 3.4 from release candidate 3.
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>> Tag name:
>>>> >> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git
>>>> >> >>>>>>>>>>>>>>   using 'git tag -v')
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>> Tag URL:
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>> >> >>>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>> Is there a way to check that the source code
>>>> >> >>>>>>>>>>>>> referred to above was the one used to create the
>>>> >> >>>>>>>>>>>>> JAR of the ".class" files? [Out of curiosity, not
>>>> >> >>>>>>>>>>>>> suspicion, of course...]
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>> Yes, you can look at the end of the
>>>> >> >>>>>>>>>>>> META-INF/MANIFEST.MF file embedded in the jar. The
>>>> >> >>>>>>>>>>>> second-to-last entry is called Implementation-Build.
>>>> >> >>>>>>>>>>>> It is automatically created by
>>>> >> >>>>>>>>>>>> maven-jgit-buildnumber-plugin and contains the SHA1
>>>> >> >>>>>>>>>>>> identifier of the last commit used for the build.
>>>> >> >>>>>>>>>>>> Here, it is befd8ebd96b8ef5a06b59dccb22bd55064e31c34,
>>>> >> >>>>>>>>>>>> so we can check it really corresponds to the expected
>>>> >> >>>>>>>>>>>> status of the git repository.
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>>
>>>> >> >>>>>>>>>>>
>>>> >> >>>>>>>>>>> Can this be considered "secure", i.e. can't this entry
>>>> >> >>>>>>>>>>> in the MANIFEST file be modified to be the checksum of
>>>> >> >>>>>>>>>>> the repository but with the .class files being
>>>> >> >>>>>>>>>>> substituted with those coming from another
>>>> >> >>>>>>>>>>> compilation?
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>> Modifying anything in the jar (either this entry within
>>>> >> the
>>>> >> >>>>>>>>>> manifest or
>>>> >> >>>>>>>>>> any class) will modify the jar signature. So as long as
>>>> >> >>>>>>>>>> people do check
>>>> >> >>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with
>>>> >> >>>>>>>>>> our build, they
>>>> >> >>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>> This is not different from how releases are done with
>>>> >> >>>>>>>>>> subversion as the
>>>> >> >>>>>>>>>> source code control system, or even in C or C++ as the
>>>> >> >>>>>>>>>> language. At one
>>>> >> >>>>>>>>>> time, the release manager does perform a compilation and
>>>> >> >>>>>>>>>> the fellow
>>>> >> >>>>>>>>>> reviewers check the result. There is no foolproof
>>>> >> >>>>>>>>>> process here, as
>>>> >> >>>>>>>>>> always when security is involved. Even using an
>>>> >> >>>>>>>>>> automated build and
>>>> >> >>>>>>>>>> automatic signing on an Apache server would involve
>>>> >> >>>>>>>>>> trust (i.e. one
>>>> >> >>>>>>>>>> should assume that the server has not been tampered
>>>> >> >>>>>>>>>> with, that the build
>>>> >> >>>>>>>>>> process really does what it is expected to do, that the
>>>> >> >>>>>>>>>> artifacts put to
>>>> >> >>>>>>>>>> review are really the ones created by the automatic
>>>> >> process
>>>> >> >>>>>>>>>> ...).
>>>> >> >>>>>>>>>>
>>>> >> >>>>>>>>>> Another point is that what we officially release is the
>>>> >> >>>>>>>>>> source, which
>>>> >> >>>>>>>>>> can be reviewed by external users. The binary parts are
>>>> >> >>>>>>>>>> merely a
>>>> >> >>>>>>>>>> convenience.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> That's an interesting point to come back to since it
>>>> >> >>>>>>>>> looks like the
>>>> >> >>>>>>>>> most time-consuming part of a release is not related to
>>>> >> the
>>>> >> >>>>>>>>> sources!
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Isn't it conceivable that a release could just be a
>>>> >> >>>>>>>>> commit identifier
>>>> >> >>>>>>>>> and a checksum of the repository?
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> If the binaries are just a convenience, why put so much
>>>> >> >>>>>>>>> effort in it?
>>>> >> >>>>>>>>> As a convenience, the artefacts could be produced after
>>>> >> the
>>>> >> >>>>>>>>> release,
>>>> >> >>>>>>>>> accompanied with all the "caveat" notes which you
>>>> >> mentioned.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> That would certainly increase the release rate.
>>>> >> >>>>>>>>
>>>> >> >>>>>>>>
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> Binary releases still need to be reviewed to ensure that
>>>> >> the
>>>> >> >>>>>>>> correct N
>>>> >> >>>>>>>> & L files are present, and that the archives don't contain
>>>> >> >>>>>>>> material
>>>> >> >>>>>>>> with disallowed licenses.
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> It's not unknown for automated build processes to include
>>>> >> >>>>>>>> files that
>>>> >> >>>>>>>> should not be present.
>>>> >> >>>>>>>>
>>>> >> >>>>>>>
>>>> >> >>>>>>> I fail to see the difference of principle between the
>>>> >> >>>>>>> "release" context
>>>> >> >>>>>>> and, say, the daily snapshot context.
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> Snapshots are not (and should not be) promoted to the general
>>>> >> >>>>>> public as
>>>> >> >>>>>> releases of the ASF.
>>>> >> >>>>>>
>>>> >> >>>>>>> What I mean is that there seems to be a contradiction
>>>> >> >>>>>>> between saying that
>>>> >> >>>>>>> a "release" is only about _source_ and the obligation to
>>>> >> check
>>>> >> >>>>>>> _binaries_.
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> There is no contradiction here.
>>>> >> >>>>>> The ASF releases source; it is required in a release.
>>>> >> >>>>>> Binaries are optional.
>>>> >> >>>>>> That does not mean that the ASF mirror system can be used to
>>>> >> >>>>>> distribute arbitrary binaries.
>>>> >> >>>>>>
>>>> >> >>>>>>> It can occur that disallowed material is, at some point in
>>>> >> >>>>>>> time, part of
>>>> >> >>>>>>> the repository and/or the snapshot binaries.
>>>> >> >>>>>>> However, what is forbidden is... forbidden, at all times.
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> As with most things, this is not a strict dichotomy.
>>>> >> >>>>>>
>>>> >> >>>>>>> If it is indeed a problem to distribute forbidden material,
>>>> >> >>>>>>> shouldn't
>>>> >> >>>>>>> this be corrected in the repository? [That's indeed what
>>>> >> >>>>>>> you did with
>>>> >> >>>>>>> the blocking of the release.]
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> If the repo is discovered to contain disallowed material, it
>>>> >> >>>>>> needs to
>>>> >> >>>>>> be removed.
>>>> >> >>>>>>
>>>> >> >>>>>>> Then again, once the repository is "clean", it can be
>>>> >> >>>>>>> tagged and that
>>>> >> >>>>>>> tagged _source_ is the release.
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> Not quite.
>>>> >> >>>>>>
>>>> >> >>>>>> A release is a source archive that is voted on and
>>>> >> distributed
>>>> >> >>>>>> via the
>>>> >> >>>>>> ASF mirror system.
>>>> >> >>>>>> The contents must agree with the source tag, but the source
>>>> >> tag
>>>> >> >>>>>> is not
>>>> >> >>>>>> the release.
>>>> >> >>>>>>
>>>> >> >>>>>>> Non-compliant binaries would thus only be the result of a
>>>> >> >>>>>>> "mistake"
>>>> >> >>>>>>> (if the build system is flawed, it's another problem,
>>>> >> >>>>>>> unrelated to
>>>> >> >>>>>>> the released contents, which is _source_) to be corrected
>>>> >> per
>>>> >> >>>>>>> se.
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> Not so. There are other failure modes.
>>>> >> >>>>>>
>>>> >> >>>>>> An automated build obviously reduces the chances of
>>>> >> >>>>>> mistakes, but it
>>>> >> >>>>>> can still create an archive containing files that should
>>>> >> >>>>>> not
>>>> >> be
>>>> >> >>>>>> there.
>>>> >> >>>>>> [Or indeed, omit files that should be present.]
>>>> >> >>>>>> For example, the workspace contains spurious files which are
>>>> >> >>>>>> implicitly included by the assembly instructions.
>>>> >> >>>>>> Or the build process creates spurious files that are
>>>> >> >>>>>> incorrectly added
>>>> >> >>>>>> to the archive.
>>>> >> >>>>>> Or the build incorrectly includes jars that are supposed to
>>>> >> be
>>>> >> >>>>>> provided by the end user
>>>> >> >>>>>> etc.
>>>> >> >>>>>>
>>>> >> >>>>>> I have seen all the above in RC votes.
>>>> >> >>>>>> There are probably other failure modes.
>>>> >> >>>>>>
>>>> >> >>>>>>> My proposition is that it's an independent step: once the
>>>> >> >>>>>>> build system is adjusted to the expectations, "correct"
>>>> >> >>>>>>> binaries can be
>>>> >> >>>>>>> generated from the same tagged release.
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> It does not matter when the binary is built.
>>>> >> >>>>>> If it is distributed by the PMC as a formal release, it must
>>>> >> >>>>>> not contain any surprises, e.g. it must be licensed under
>>>> >> >>>>>> the AL.
>>>> >> >>>>>>
>>>> >> >>>>>> It is therefore vital that the contents are as expected from
>>>> >> >>>>>> the build.
>>>> >> >>>>>>
>>>> >> >>>>>> Note also that a formal release becomes an act of the PMC by
>>>> >> >>>>>> the voting process.
>>>> >> >>>>>> The ASF can then assume responsibility for any legal issues
>>>> >> >>>>>> that may arise.
>>>> >> >>>>>> Otherwise it is entirely the personal responsibility of the
>>>> >> >>>>>> person who
>>>> >> >>>>>> releases it.
>>>> >> >>>>>
>>>> >> >>>>>
>>>> >> >>>>> I think the last two points are really important: binaries
>>>> >> must
>>>> >> >>>>> be
>>>> >> >>>>> checked and the foundation provides a legal protection for
>>>> >> >>>>> the project
>>>> >> >>>>> if something weird occurs.
>>>> >> >>>>>
>>>> >> >>>>> I also think another point is important: many if not most
>>>> >> users
>>>> >> >>>>> do
>>>> >> >>>>> really expect binaries and not source. From our internal
>>>> >> Apache
>>>> >> >>>>> point
>>>> >> >>>>> of view, these are a by-product. For many others it is the
>>>> >> >>>>> important
>>>> >> >>>>> thing. It is mostly true in maven land as dependencies are
>>>> >> >>>>> automatically retrieved in binary form, not source form. So
>>>> >> the
>>>> >> >>>>> maven
>>>> >> >>>>> central repository as a distribution system is important.
>>>> >> >>>>>
>>>> >> >>>>> Even if for some security reason it sounds at first thought
>>>> >> >>>>> logical to
>>>> >> >>>>> rely on source only and compile oneself, in an industrial
>>>> >> >>>>> context project teams do not have enough time to do it for
>>>> >> >>>>> all their dependencies, so they use binaries provided by
>>>> >> >>>>> trusted third parties. A
>>>> >> >>>>> long time ago, I compiled a lot of free software tools for
>>>> >> >>>>> the department I worked for at that time. I do not do this
>>>> >> anymore,
>>>> >> >>>>> and
>>>> >> >>>>> trust the binaries provided by the packaging team for a
>>>> >> >>>>> distribution
>>>> >> >>>>> (typically Debian). They do rely on source and compile
>>>> >> >>>>> themselves. Hey,
>>>> >> >>>>> I even think Emmanuel here belongs to the Debian java
>>>> >> >>>>> team ;-)
>>>> >> I
>>>> >> >>>>> guess
>>>> >> >>>>> such teams that do rely on source are rather the exception
>>>> >> than
>>>> >> >>>>> the
>>>> >> >>>>> rule. The other examples I can think of are packaging teams,
>>>> >> >>>>> development teams that need bleeding edge (and will also
>>>> >> >>>>> directly depend on the repository, not even the release),
>>>> >> >>>>> projects that need to
>>>> >> >>>>> introduce their own patches and people who have critical
>>>> >> >>>>> needs (for
>>>> >> >>>>> example when safety of people is concerned or when they need
>>>> >> >>>>> full control for legal or contractual reasons). Many other
>>>> >> >>>>> people download
>>>> >> >>>>> binaries directly and would simply not consider using a
>>>> >> project
>>>> >> >>>>> if it
>>>> >> >>>>> is not readily available: they don't have time for this and
>>>> >> >>>>> don't want
>>>> >> >>>>> to learn how to build tens or hundreds of different projects
>>>> >> they
>>>> >> >>>>> simply
>>>> >> >>>>> use.
>>>> >> >>>>>
>>>> >> >>>>
>>>> >> >>>> I do not disagree with anything said on this thread. [In
>>>> >> >>>> particular, I
>>>> >> >>>> did not at all imply that any one committer could take
>>>> >> >>>> responsibility
>>>> >> >>>> for releasing unchecked items.]
>>>> >> >>>>
>>>> >> >>>> I'm simply suggesting that what is called the release
>>>> >> >>>> process/management
>>>> >> >>>> could be made simpler (and _consequently_ could lead to more
>>>> >> >>>> regularly
>>>> >> >>>> releasing the CM code), by separating the concerns.
>>>> >> >>>> The concerns are
>>>> >> >>>>  1. "code" (the contents), and
>>>> >> >>>>  2. "artefacts" (the result of the build system acting on the
>>>> >> >>>> "code").
>>>> >> >>>>
>>>> >> >>>> Checking of one of these is largely independent from checking
>>>> >> the
>>>> >> >>>> other.
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> Unfortunately, not really.  One principle that we have (maybe
>>>> >> not
>>>> >> >>> crystal clear in the release doco) is that when we do
>>>> >> >>> distribute binaries, they should really be "convenience
>>>> >> >>> binaries" which
>>>> >> means
>>>> >> >>> that everything needed to create them is in the source or its
>>>> >> >>> documented dependencies.  What that means is that what we tag
>>>> >> >>> as the
>>>> >> >>> source release needs to be able to generate any binaries that
>>>> >> >>> we subsequently release.  The only way to really test that is
>>>> >> >>> to generate the binaries and inspect them as part of verifying
>>>> >> >>> the release.
>>>> >> >>
>>>> >> >>
>>>> >> >> Only way?  That's certainly not obvious to me: Since a
>>>> >> >> tag/branch uniquely identifies a set of files, that is, the
>>>> >> >> "source release [that
>>>> >> >> is] able to generate any binaries that we subsequently release",
>>>> >> >> if a
>>>> >> >> RM can do it at (source) release time, he (or someone else!) can
>>>> >> >> do it
>>>> >> >> later, too (by running the build from a clone of the repository
>>>> >> in
>>>> >> >> its
>>>> >> >> tagged state).
>>>> >> >>
>>>> >> >>> As others have pointed out, anything we release has to be
>>>> >> verified
>>>> >> >>> and voted on.  As RM and reviewer, I think it is actually
>>>> >> >>> easier to roll and verify source and binaries together.
>>>> >> >>
>>>> >> >
>>>> >> > +1
>>>> >> >
>>>> >> >>
>>>> >> >> It's precisely my main point.
>>>> >> >> I won't dispute that you can prefer doing both (and nobody would
>>>> >> >> forbid
>>>> >> >> a RM to do just that) but the point is about the possibility to
>>>> >> >> release
>>>> >> >> source-only code (as the first step of a two-step procedure
>>>> >> >> which
>>>> >> I
>>>> >> >> described earlier).
>>>> >> >> [IMHO, the two-step one seems easier (both for the RM and the
>>>> >> >> reviewer),
>>>> >> >> (mileage does vary).]
>>>> >> >
>>>> >> > What is easier?
>>>> >> > It seems to me there will be at least one other step in your
>>>> >> > proposed process, i.e. a second VOTE e-mail
>>>> >>
>>>> >> Yes, that's obviously what I meant:
>>>> >> Two steps == two votes
>>>> >>
>>>> >> [But: source releases need not necessarily be accompanied with
>>>> >> "binaries", which, I imagine, could lead to official releases
>>>> >> occurring more often (due to the reduced number of checks).]
>>>> >>
>>>> >> > These will both contain most of the same information.
>>>> >>
>>>> >> No.
>>>> >> The first step is about the source, i.e. the code which humans
>>>> >> create.
>>>> >> The second step is about the files which a build system creates.
>>>> >>
>>>> >> As I indicated previously, the first vote will be about a set of
>>>> >> reviewers being satisfied with the state of the source code, while
>>>> >> the second vote will be about another set of reviewers being
>>>> >> satisfied
>>>> >> with the results of the build system ("no glitch", as you described
>>>> >> in an earlier message).
>>>> >>
>>>> >> > Is the intention to announce the source release separately from
>>>> >> the
>>>> >> > binary release?
>>>> >> > If so, there will need to be 2 announce mails, and 2 updates to
>>>> >> the
>>>> >> > download page.
>>>> >>
>>>> >> Is there a problem with that?
>>>> >> There are actually several possible cases (depending on the will of
>>>> >> the RM):
>>>> >>   * one-step release (only source code)
>>>> >>   * two-steps (source, then binaries based on that source)
>>>> >>   * combined (as is done up to now)
>>>> >>   * binaries (based on any previously released source)
>>>> >>
>>>> >> >> In short is it forbidden (by the official/legal rules of ASF) to
>>>> >> >> proceed
>>>> >> >> as I propose?
>>>> >> >
>>>> >> > Dunno, depends on what exactly you are proposing.
>>>> >>
>>>> >> Cf. above (and previous mails).
>>>> >>
>>>> >> In practice the release could (IIUC) be like the link provided
>>>> >> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
>>>> >> repository).
>>>> >>
>>>> >>
>>>> >> >> Is it impossible technically?
>>>> >> >
>>>> >> > Currently the Maven build process creates:
>>>> >> > - Maven source and binary jars
>>>> >> > - ASF source and binary bundles
>>>> >>
>>>> >> AFAIU, the JARs (source and binary) are "binaries", the binary
>>>> >> bundles are "binaries". Only the ASF source is "source".
>>>> >>
>>>> >> > It's not clear to me what exactly you propose to release in stage
>>>> >> > one,
>>>> >>
>>>> >> The ASF source (e.g. in the form of a tarball, or the appropriate
>>>> >> "git clone" command).
>>>> >>
>>>> >> > but there will need to be some changes to the process in order to
>>>> >> > release just the ASF source.
>>>> >>
>>>> >> I don't see which.
>>>> >> A "source RM" would just stop the process after
>>>> >> resolving/postponing the pending issues, and checking the various
>>>> >> reports about the source
>>>> >> code. [Then create the tag, and request a vote.]
>>>> >>
>>>> >> A "binary RM" would take on from that point (a tagged repository),
>>>> >> i.e. create all the binaries, sign them, etc.
>>>> >>
>>>> >> > There is no point releasing the Maven source jars separately from
>>>> >> > the binary jars; they are not complete as they only contain java
>>>> >> > files for
>>>> >> > use with IDEs.
>>>> >>
>>>> >> I don't understand that.
>>>> >> In principle, a JAR with the Java sources is indeed the necessary
>>>> >> and
>>>> >> sufficient condition for users to create the executable bytecode,
>>>> >> with
>>>> >> whatever build system they wish.
>>>> >> But I agree that it's not useful to not release all the files
>>>> >> needed to easily run maven. [And, for convenience, a source
>>>> >> release would be
>>>> >> accompanied with instructions on how to build a JAR of the compiled
>>>> >> classes, using maven.]
>>>> >>
>>>> >> > But in any case, AFAIK it is very tricky to release new files
>>>> >> > into an existing Maven folder, and it may cause problems for end
>>>> >> > users.
>>>> >>
>>>> >> I don't understand what you mean by "release new files into an
>>>> >> existing Maven folder"...
>>>> >>
>>>> >> Gilles
>>>> >>
>>>> >> >>
>>>> >> >>
>>>> >> >>> Phil
>>>> >> >>>
>>>> >> >>>
>>>> >> >>>> [The more so that, as you said, no fool-proof link between the
>>>> >> >>>> two can
>>>> >> >>>> be ensured: From a security POV, checking the former requires
>>>> >> >>>> a code
>>>> >> >>>> review, while using the latter requires trust in the build
>>>> >> >>>> system.]
>>>> >> >>>>
>>>> >> >>>> Thus we could release the "code", after checking and voting on
>>>> >> >>>> the concerned elements (i.e. the repository state
>>>> >> >>>> corresponding to a specific tag + the web site).
>>>> >> >>>>
>>>> >> >>>> Then we could release the "binaries", as a convenience, after
>>>> >> >>>> checking
>>>> >> >>>> and voting on the concerned elements (i.e. the files about to
>>>> >> be
>>>> >> >>>> distributed).
>>>> >> >>>>
>>>> >> >>>> I think that it's an added flexibility that would, for
>>>> >> >>>> example, allow
>>>> >> >>>> the tagging of the repository without necessarily releasing
>>>> >> >>>> binaries (i.e.
>>>> >> >>>> not involving that part of the work); and to release binaries
>>>> >> >>>> (say, at
>>>> >> >>>> regular intervals) based on the latest tagged code (i.e. not
>>>> >> >>>> involving
>>>> >> >>>> the work about solving/evaluating/postponing issues).
>>>> >> >>>>
>>>> >> >>>> [I completely admit that, at first, it might look a little
>>>> >> >>>> more confusing for the plain user, but (IIUC) it would be a
>>>> >> >>>> better representation of the reality covered by stating that
>>>> >> >>>> the ASF releases source code.]
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 01:48:20 +0000, sebb wrote:
> On 30 December 2014 at 01:36, Bernd Eckenfels 
> <ec...@zusammenkunft.net> wrote:
>> Hello,
>>
>> On Tue, 30 Dec 2014 02:29:38 +0100,
>> Gilles <gi...@harfang.homelinux.org> wrote:
>>
>>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>>> > That thread gets deep. :)
>>> >
>>> > I just wanted to comment on "releasing only
>>> > source is faster because of fewer checks". I disagree with that:
>>> > most release delay/time is due to preparation work. Failed (binary)
>>> > checks are typically for a reason which would also be present in
>>> > the source (especially the POM), so it does not really reduce the
>>> > amount of rework.
>>>
>>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>>> than 15, it will objectively take less time, and this is compounded
>>> by the additional tests which should (ideally) be performed by the
>>> reviewers. [Thus delaying the release.]
>>
>> The problem is not the small additional time for the last 5 steps but
>> the large time for redoing all steps (on veto).
>>
>>
>>> > (At least not in most cases, so two votes will actually make us
>>> > more work not less).
>>>
>>> The additional work exactly amounts to sending _one_ additional mail.
>>
>> The actual work is not the vote mail but the people doing the
>> preparation and the review.
>>
>>>
>>> Then, as I noted,
>>>   * some releases will be done as before (same work)
>>>   * some releases will be "source only" (less work)
>>
>> Not much: you still have to check that the source actually works, can
>> be built, produces sane archives and so on.
>>
>>>   * some releases will be two-steps, possibly performed by two
>>> different people (i.e. less work for each RM)
>>
>> And more work in sum, not only for the RMs but also for the reviewers
>> (and for the users who want to use the source release with Maven, like
>> anybody these days).
>>
>> But I don't mind: if a project wants to do a source-only release,
>> that's fine with me. I just don't see the advantage.
>
> How many end users just want a source release anyway?
>
> I would expect most users to use the Maven jars, some will use the ASF
> binaries, and a few will use the ASF source (AIUI Linux distros often
> build from source).

So, you answered your own question.

>
> Even if only the source is released, it's still necessary for the RM
> and reviewers to build and test it.

Never said otherwise.
[Testing the sources is one git command and one maven command. Testing
the binaries requires downloading each of them and checking the
signatures and/or checksums, each a separate command.]


Gilles

>
>> Regards
>> Bernd
>>
>>>
>>> Of course, each release means some work has to be done; then IIUC your
>>> point, the fewer releases the better. :-}
>>>
>>>
>>
>>> > On Tue, 30 Dec 2014 02:05:29 +0100,
>>> > Gilles <gi...@harfang.homelinux.org> wrote:
>>> >
>>> >> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
>>> >> > On 29 December 2014 at 10:36, Gilles
>>> >> <gi...@harfang.homelinux.org>
>>> >> > wrote:
>>> >> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>>> >> >>>
>>> >> >>> On 12/28/14 11:46 AM, Gilles wrote:
>>> >> >>>>
>>> >> >>>> Hi.
>>> >> >>>>
>>> >> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>> >> >>>>>
>>> >> >>>>> On 28/12/2014 00:22, sebb wrote:
>>> >> >>>>>>
>>> >> >>>>>> On 27 December 2014 at 22:19, Gilles
>>> >> >>>>>> <gi...@harfang.homelinux.org> wrote:
>>> >> >>>>>>>
>>> >> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>> >> >>>>>>>>
>>> >> >>>>>>>>
>>> >> >>>>>>>> On 24 December 2014 at 15:11, Gilles
>>> >> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe 
>>> wrote:
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe
>>> >> >>>>>>>>>>> wrote:
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>>> >> >>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons 
>>> Math
>>> >> 3.4
>>> >> >>>>>>>>>>>>>> from release
>>> >> >>>>>>>>>>>>>> candidate 3.
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>> Tag name:
>>> >> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git
>>> >> using
>>> >> >>>>>>>>>>>>>> 'git tag
>>> >> >>>>>>>>>>>>>> -v')
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>> Tag URL:
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> 
>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>>
>>> >> >>>>>>>>>>>>> Is there a way to check that the source code
>>> >> >>>>>>>>>>>>> referred
>>> >> to
>>> >> >>>>>>>>>>>>> above
>>> >> >>>>>>>>>>>>> was the one used to create the JAR of the ".class"
>>> >> >>>>>>>>>>>>> files. [Out of curiosity, not suspicion, of
>>> >> >>>>>>>>>>>>> course...]
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>> Yes, you can look at the end of the
>>> >> >>>>>>>>>>>> META-INF/MANIFEST.MF file embedded
>>> >> >>>>>>>>>>>> in the jar. The second-to-last entry is called
>>> >> >>>>>>>>>>>> Implementation-Build.
>>> >> >>>>>>>>>>>> It
>>> >> >>>>>>>>>>>> is automatically created by
>>> >> maven-jgit-buildnumber-plugin
>>> >> >>>>>>>>>>>> and contains
>>> >> >>>>>>>>>>>> the SHA1 identifier of the last commit used for the
>>> >> >>>>>>>>>>>> build. Here, it is
>>> >> >>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can
>>> >> check
>>> >> >>>>>>>>>>>> it really
>>> >> >>>>>>>>>>>> corresponds to the expected status of the git
>>> >> repository.
>>> >> >>>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Can this be considered "secure", i.e. can't this 
>>> entry
>>> >> in
>>> >> >>>>>>>>>>> the MANIFEST
>>> >> >>>>>>>>>>> file be modified to be the checksum of the 
>>> repository
>>> >> but
>>> >> >>>>>>>>>>> with the
>>> >> >>>>>>>>>>> .class
>>> >> >>>>>>>>>>> files being substituted with those coming from another
>>> >> >>>>>>>>>>> compilation?
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> Modifying anything in the jar (either this entry 
>>> within
>>> >> the
>>> >> >>>>>>>>>> manifest or
>>> >> >>>>>>>>>> any class) will modify the jar signature. So as long 
>>> as
>>> >> >>>>>>>>>> people do check
>>> >> >>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with
>>> >> >>>>>>>>>> our build, they
>>> >> >>>>>>>>>> are safe to assume the artifacts are Apache 
>>> artifacts.
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> This is not different from how releases are done with
>>> >> >>>>>>>>>> subversion as the
>>> >> >>>>>>>>>> source code control system, or even in C or C++ as 
>>> the
>>> >> >>>>>>>>>> language. At one
>>> >> >>>>>>>>>> time, the release manager does perform a compilation 
>>> and
>>> >> >>>>>>>>>> the fellow
>>> >> >>>>>>>>>> reviewers check the result. There is no foolproof
>>> >> >>>>>>>>>> process here, as
>>> >> >>>>>>>>>> always when security is involved. Even using an
>>> >> >>>>>>>>>> automated build and
>>> >> >>>>>>>>>> automatic signing on an Apache server would involve
>>> >> >>>>>>>>>> trust (i.e. one
>>> >> >>>>>>>>>> should assume that the server has not been tampered
>>> >> >>>>>>>>>> with, that the build
>>> >> >>>>>>>>>> process really does what it is expected to do, that 
>>> the
>>> >> >>>>>>>>>> artifacts put to
>>> >> >>>>>>>>>> review are really the ones created by the automatic
>>> >> process
>>> >> >>>>>>>>>> ...).
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> Another point is that what we officially release is 
>>> the
>>> >> >>>>>>>>>> source, which
>>> >> >>>>>>>>>> can be reviewed by external users. The binary parts 
>>> are
>>> >> >>>>>>>>>> merely a
>>> >> >>>>>>>>>> convenience.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> That's an interesting point to come back to since it
>>> >> >>>>>>>>> looks like the
>>> >> >>>>>>>>> most time-consuming part of a release is not related 
>>> to
>>> >> the
>>> >> >>>>>>>>> sources!
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Isn't it conceivable that a release could just be a
>>> >> >>>>>>>>> commit identifier
>>> >> >>>>>>>>> and a checksum of the repository?
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> If the binaries are just a convenience, why put so much
>>> >> >>>>>>>>> effort in it?
>>> >> >>>>>>>>> As a convenience, the artefacts could be produced 
>>> after
>>> >> the
>>> >> >>>>>>>>> release,
>>> >> >>>>>>>>> accompanied with all the "caveat" notes which you
>>> >> mentioned.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> That would certainly increase the release rate.
>>> >> >>>>>>>>
>>> >> >>>>>>>>
>>> >> >>>>>>>>
>>> >> >>>>>>>> Binary releases still need to be reviewed to ensure 
>>> that
>>> >> the
>>> >> >>>>>>>> correct N
>>> >> >>>>>>>> & L files are present, and that the archives don't 
>>> contain
>>> >> >>>>>>>> material
>>> >> >>>>>>>> with disallowed licenses.
>>> >> >>>>>>>>
>>> >> >>>>>>>> It's not unknown for automated build processes to 
>>> include
>>> >> >>>>>>>> files that
>>> >> >>>>>>>> should not be present.
>>> >> >>>>>>>>
>>> >> >>>>>>>
>>> >> >>>>>>> I fail to see the difference of principle between the
>>> >> >>>>>>> "release" context
>>> >> >>>>>>> and, say, the daily snapshot context.
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> Snapshots are not (and should not be) promoted to the general
>>> >> >>>>>> public as
>>> >> >>>>>> releases of the ASF.
>>> >> >>>>>>
>>> >> >>>>>>> What I mean is that there seems to be a contradiction
>>> >> >>>>>>> between saying that
>>> >> >>>>>>> a "release" is only about _source_ and the obligation to
>>> >> check
>>> >> >>>>>>> _binaries_.
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> There is no contradiction here.
>>> >> >>>>>> The ASF releases source; it is required in a release.
>>> >> >>>>>> Binaries are optional.
>>> >> >>>>>> That does not mean that the ASF mirror system can be used 
>>> to
>>> >> >>>>>> distribute arbitrary binaries.
>>> >> >>>>>>
>>> >> >>>>>>> It can occur that disallowed material is, at some point 
>>> in
>>> >> >>>>>>> time, part of
>>> >> >>>>>>> the repository and/or the snapshot binaries.
>>> >> >>>>>>> However, what is forbidden is... forbidden, at all 
>>> times.
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> As with most things, this is not a strict dichotomy.
>>> >> >>>>>>
>>> >> >>>>>>> If it is indeed a problem to distribute forbidden 
>>> material,
>>> >> >>>>>>> shouldn't
>>> >> >>>>>>> this be corrected in the repository? [That's indeed what
>>> >> >>>>>>> you did with
>>> >> >>>>>>> the blocking of the release.]
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> If the repo is discovered to contain disallowed material, 
>>> it
>>> >> >>>>>> needs to
>>> >> >>>>>> be removed.
>>> >> >>>>>>
>>> >> >>>>>>> Then again, once the repository is "clean", it can be
>>> >> >>>>>>> tagged and that
>>> >> >>>>>>> tagged _source_ is the release.
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> Not quite.
>>> >> >>>>>>
>>> >> >>>>>> A release is a source archive that is voted on and
>>> >> distributed
>>> >> >>>>>> via the
>>> >> >>>>>> ASF mirror system.
>>> >> >>>>>> The contents must agree with the source tag, but the 
>>> source
>>> >> tag
>>> >> >>>>>> is not
>>> >> >>>>>> the release.
>>> >> >>>>>>
>>> >> >>>>>>> Non-compliant binaries would thus only be the result of 
>>> a
>>> >> >>>>>>> "mistake"
>>> >> >>>>>>> (if the build system is flawed, it's another problem,
>>> >> >>>>>>> unrelated to
>>> >> >>>>>>> the released contents, which is _source_) to be 
>>> corrected
>>> >> per
>>> >> >>>>>>> se.
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> Not so. There are other failure modes.
>>> >> >>>>>>
>>> >> >>>>>> An automated build obviously reduces the chances of
>>> >> >>>>>> mistakes, but it can still create an archive containing files
>>> >> >>>>>> that should not be there.
>>> >> >>>>>> [Or indeed, omits files that should be present]
>>> >> >>>>>> For example, the workspace contains spurious files which are
>>> >> >>>>>> implicitly included by the assembly instructions.
>>> >> >>>>>> Or the build process creates spurious files that are
>>> >> >>>>>> incorrectly added to the archive.
>>> >> >>>>>> Or the build incorrectly includes jars that are supposed to be
>>> >> >>>>>> provided by the end user
>>> >> >>>>>> etc.
>>> >> >>>>>>
>>> >> >>>>>> I have seen all the above in RC votes.
>>> >> >>>>>> There are probably other failure modes.
>>> >> >>>>>>
>>> >> >>>>>>> My proposition is that it's an independent step: once 
>>> the
>>> >> >>>>>>> build system is adjusted to the expectations, "correct"
>>> >> >>>>>>> binaries can be
>>> >> >>>>>>> generated from the same tagged release.
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> It does not matter when the binary is built.
>>> >> >>>>>> If it is distributed by the PMC as a formal release, it 
>>> must
>>> >> >>>>>> not contain any surprises, e.g. it must be licensed under
>>> >> >>>>>> the AL.
>>> >> >>>>>>
>>> >> >>>>>> It is therefore vital that the contents are as expected 
>>> from
>>> >> >>>>>> the build.
>>> >> >>>>>>
>>> >> >>>>>> Note also that a formal release becomes an act of the PMC 
>>> by
>>> >> >>>>>> the voting process.
>>> >> >>>>>> The ASF can then assume responsibility for any legal 
>>> issues
>>> >> >>>>>> that may arise.
>>> >> >>>>>> Otherwise it is entirely the personal responsibility of 
>>> the
>>> >> >>>>>> person who
>>> >> >>>>>> releases it.
>>> >> >>>>>
>>> >> >>>>>
>>> >> >>>>> I think the last two points are really important: binaries
>>> >> must
>>> >> >>>>> be
>>> >> >>>>> checked and the foundation provides a legal protection for
>>> >> >>>>> the project
>>> >> >>>>> if something weird occurs.
>>> >> >>>>>
>>> >> >>>>> I also think another point is important: many if not most users
>>> >> >>>>> do really expect binaries and not source. From our internal
>>> >> >>>>> Apache point of view, these are a by-product. For many others
>>> >> >>>>> it is the important thing. It is mostly true in maven land as
>>> >> >>>>> dependencies are automatically retrieved in binary form, not
>>> >> >>>>> source form. So the maven central repository as a distribution
>>> >> >>>>> system is important.
>>> >> >>>>>
>>> >> >>>>> Even if for some security reason it sounds at first 
>>> thought
>>> >> >>>>> logical to
>>> >> >>>>> rely on source only and compile oneself, in an industrial
>>> >> >>>>> context project teams do not have enough time to do it for
>>> >> >>>>> all their dependencies, so they use binaries provided by
>>> >> >>>>> trusted third parties. A
>>> >> >>>>> long time ago, I compiled a lot of free software tools for
>>> >> >>>>> the department I worked for at that time. I do not do this
>>> >> anymore,
>>> >> >>>>> and
>>> >> >>>>> trust the binaries provided by the packaging team for a
>>> >> >>>>> distribution
>>> >> >>>>> (typically Debian). They do rely on source and compile
>>> >> >>>>> themselves. Hey,
>>> >> >>>>> I even think Emmanuel here belongs to the Debian java
>>> >> >>>>> team ;-)
>>> >> I
>>> >> >>>>> guess
>>> >> >>>>> such teams that do rely on source are rather the exception
>>> >> than
>>> >> >>>>> the
>>> >> >>>>> rule. The other examples I can think of are packaging 
>>> teams,
>>> >> >>>>> development teams that need bleeding edge (and will also
>>> >> >>>>> directly depend on the repository, not even the release),
>>> >> >>>>> projects that need to
>>> >> >>>>> introduce their own patches and people who have critical
>>> >> >>>>> needs (for
>>> >> >>>>> example when safety of people is concerned or when they 
>>> need
>>> >> >>>>> full control for legal or contractual reasons). Many other
>>> >> >>>>> people download
>>> >> >>>>> binaries directly and would simply not consider using a
>>> >> project
>>> >> >>>>> if it
>>> >> >>>>> is not readily available: they don't have time for this and
>>> >> >>>>> don't want to learn how to build tens or hundreds of different
>>> >> >>>>> projects they simply use.
>>> >> >>>>>
>>> >> >>>>
>>> >> >>>> I do not disagree with anything said on this thread. [In
>>> >> >>>> particular, I
>>> >> >>>> did not at all imply that any one committer could take
>>> >> >>>> responsibility
>>> >> >>>> for releasing unchecked items.]
>>> >> >>>>
>>> >> >>>> I'm simply suggesting that what is called the release
>>> >> >>>> process/management
>>> >> >>>> could be made simpler (and _consequently_ could lead to 
>>> more
>>> >> >>>> regularly
>>> >> >>>> releasing the CM code), by separating the concerns.
>>> >> >>>> The concerns are
>>> >> >>>>  1. "code" (the contents), and
>>> >> >>>>  2. "artefacts" (the result of the build system acting on 
>>> the
>>> >> >>>> "code").
>>> >> >>>>
>>> >> >>>> Checking of one of these is largely independent from 
>>> checking
>>> >> the
>>> >> >>>> other.
>>> >> >>>
>>> >> >>>
>>> >> >>> Unfortunately, not really.  One principle that we have (maybe
>>> >> >>> not crystal clear in the release doco) is that when we do
>>> >> >>> distribute binaries, they should really be "convenience
>>> >> >>> binaries" which means that everything needed to create them is
>>> >> >>> in the source or its documented dependencies.  What that means
>>> >> >>> is that what we tag as the source release needs to be able to
>>> >> >>> generate any binaries that we subsequently release.  The only
>>> >> >>> way to really test that is to generate the binaries and inspect
>>> >> >>> them as part of verifying the release.
>>> >> >>
>>> >> >>
>>> >> >> Only way?  That's certainly not obvious to me: Since a
>>> >> >> tag/branch uniquely identifies a set of files, that is, the
>>> >> >> "source release [that is] able to generate any binaries that we
>>> >> >> subsequently release", if a RM can do it at (source) release
>>> >> >> time, he (or someone else!) can do it later, too (by running the
>>> >> >> build from a clone of the repository in its tagged state).
>>> >> >>
>>> >> >>> As others have pointed out, anything we release has to be
>>> >> verified
>>> >> >>> and voted on.  As RM and reviewer, I think it is actually
>>> >> >>> easier to roll and verify source and binaries together.
>>> >> >>
>>> >> >
>>> >> > +1
>>> >> >
>>> >> >>
>>> >> >> It's precisely my main point.
>>> >> >> I won't dispute that you can prefer doing both (and nobody 
>>> would
>>> >> >> forbid
>>> >> >> a RM to do just that) but the point is about the possibility 
>>> to
>>> >> >> release
>>> >> >> source-only code (as the first step of a two-step procedure
>>> >> >> which
>>> >> I
>>> >> >> described earlier).
>>> >> >> [IMHO, the two-step one seems easier (both for the RM and the
>>> >> >> reviewer),
>>> >> >> (mileage does vary).]
>>> >> >
>>> >> > What is easier?
>>> >> > It seems to me there will be at least one other step in your
>>> >> > proposed process, i.e. a second VOTE e-mail
>>> >>
>>> >> Yes, that's obviously what I meant:
>>> >> Two steps == two votes
>>> >>
>>> >> [But: source releases need not necessarily be accompanied with
>>> >> "binaries", which, I imagine, could lead to official releases
>>> >> occurring more often (due to the reduced number of checks).]
>>> >>
>>> >> > These will both contain most of the same information.
>>> >>
>>> >> No.
>>> >> The first step is about the source, i.e. the code which humans
>>> >> create.
>>> >> The second step is about the files which a build system creates.
>>> >>
>>> >> As I indicated previously, the first vote will be about a set of
>>> >> reviewers being satisfied with the state of the source code, while
>>> >> the second vote will be about another set of reviewers being
>>> >> satisfied with the results of the build system ("no glitch", as you
>>> >> described in an earlier message).
>>> >>
>>> >> > Is the intention to announce the source release separately 
>>> from
>>> >> the
>>> >> > binary release?
>>> >> > If so, there will need to be 2 announce mails, and 2 updates 
>>> to
>>> >> the
>>> >> > download page.
>>> >>
>>> >> Is there a problem with that?
>>> >> There are actually several possible cases (depending on the will 
>>> of
>>> >> the RM):
>>> >>   * one-step release (only source code)
>>> >>   * two-steps (source, then binaries based on that source)
>>> >>   * combined (as is done up to now)
>>> >>   * binaries (based on any previously released source)
>>> >>
>>> >> >> In short is it forbidden (by the official/legal rules of ASF) 
>>> to
>>> >> >> proceed
>>> >> >> as I propose?
>>> >> >
>>> >> > Dunno, depends on what exactly you are proposing.
>>> >>
>>> >> Cf. above (and previous mails).
>>> >>
>>> >> In practice the release could (IIUC) be like the link provided
>>> >> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
>>> >> repository).
>>> >>
>>> >>
>>> >> >> Is it technically impossible?
>>> >> >
>>> >> > Currently the Maven build process creates:
>>> >> > - Maven source and binary jars
>>> >> > - ASF source and binary bundles
>>> >>
>>> >> AFAIU, the JARs (source and binary) are "binaries", the binary
>>> >> bundles are "binaries". Only the ASF source is "source".
>>> >>
>>> >> > It's not clear to me what exactly you propose to release in 
>>> stage
>>> >> > one,
>>> >>
>>> >> The ASF source (e.g. in the form of a tarball, or the 
>>> appropriate
>>> >> "git clone" command).
>>> >>
>>> >> > but there will need to be some changes to the process in order 
>>> to
>>> >> > release just the ASF source.
>>> >>
>>> >> I don't see which.
>>> >> A "source RM" would just stop the process after
>>> >> resolving/postponing the pending issues, and checking the 
>>> various
>>> >> reports about the source
>>> >> code. [Then create the tag, and request a vote.]
>>> >>
>>> >> A "binary RM" would take on from that point (a tagged 
>>> repository),
>>> >> i.e. create all the binaries, sign them, etc.
>>> >>
>>> >> > There is no point releasing the Maven source jars separately 
>>> from
>>> >> > the binary jars; they are not complete as they only contain 
>>> java
>>> >> > files for
>>> >> > use with IDEs.
>>> >>
>>> >> I don't understand that.
>>> >> In principle, a JAR with the Java sources is indeed the 
>>> necessary
>>> >> and
>>> >> sufficient condition for users to create the executable 
>>> bytecode,
>>> >> with
>>> >> whatever build system they wish.
>>> >> But I agree that it's not useful to not release all the files
>>> >> needed to easily run maven. [And, for convenience, a source
>>> >> release would be
>>> >> accompanied with instructions on how to build a JAR of the 
>>> compiled
>>> >> classes, using maven.]
>>> >>
>>> >> > But in any case, AFAIK it is very tricky to release new files
>>> >> > into an existing Maven folder, and it may cause problems for 
>>> end
>>> >> > users.
>>> >>
>>> >> I don't understand what you mean by "release new files into an
>>> >> existing Maven folder"...
>>> >>
>>> >> Gilles
>>> >>
>>> >> >>
>>> >> >>
>>> >> >>> Phil
>>> >> >>>
>>> >> >>>
>>> >> >>>> [The more so that, as you said, no fool-proof link between 
>>> the
>>> >> >>>> two can
>>> >> >>>> be ensured: From a security POV, checking the former 
>>> requires
>>> >> >>>> a code
>>> >> >>>> review, while using the latter requires trust in the build
>>> >> >>>> system.]
>>> >> >>>>
>>> >> >>>> Thus we could release the "code", after checking and voting 
>>> on
>>> >> >>>> the concerned elements (i.e. the repository state
>>> >> >>>> corresponding to a specific tag + the web site).
>>> >> >>>>
>>> >> >>>> Then we could release the "binaries", as a convenience, 
>>> after
>>> >> >>>> checking
>>> >> >>>> and voting on the concerned elements (i.e. the files about 
>>> to
>>> >> be
>>> >> >>>> distributed).
>>> >> >>>>
>>> >> >>>> I think that it's an added flexibility that would, for
>>> >> >>>> example, allow the tagging of the repository without
>>> >> >>>> necessarily releasing binaries (i.e. not involving that part
>>> >> >>>> of the work); and to release binaries (say, at regular
>>> >> >>>> intervals) based on the latest tagged code (i.e. not involving
>>> >> >>>> the work about solving/evaluating/postponing issues).
>>> >> >>>>
>>> >> >>>> [I completely admit that, at first, it might look a little
>>> >> >>>> more confusing for the plain user, but (IIUC) it would be a
>>> >> >>>> better representation of the reality covered by stating 
>>> that
>>> >> >>>> the ASF releases source code.]


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] What's in a release

Posted by sebb <se...@gmail.com>.
On 30 December 2014 at 01:36, Bernd Eckenfels <ec...@zusammenkunft.net> wrote:
> Hello,
>
> On Tue, 30 Dec 2014 02:29:38 +0100,
> Gilles <gi...@harfang.homelinux.org> wrote:
>
>> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
>> > That thread gets deep. :)
>> >
>> > I just wanted to comment on "releasing only
>> > source is faster because of less checks". I disagree with that, most
>> > release delay/time is due to preparation work. Failed (binary)
>> > checks are typically for a reason which would also be present in
>> > the source (especially the POM), so it does not really reduce the
>> > amount of rework.
>>
>> RM is a streamlined procedure: so, if you do (say) 10 steps rather
>> than 15, it will objectively take less time, and this is compounded
>> by the additional tests which should (ideally) be performed by the
>> reviewers. [Thus delaying the release.]
>
> The problem is not the small additional time for the last 5 steps but
> the large time for redoing all steps (on veto).
>
>
>> > (At least not in most cases, so two votes will actually make us
>> > more work not less).
>>
>> The additional work exactly amounts to sending _one_ additional mail.
>
> The actual work is not the vote mail but the people doing the
> preparation and the review.
>
>>
>> Then, as I noted,
>>   * some releases will be done as before (same work)
>>   * some releases will be "source only" (less work)
>
> Not much, you still have to check if the source actually works and can
> be built, produces sane archives and so on.
>
>>   * some releases will be two-steps, possibly performed by two
>> different people (i.e. less work for each RM)
>
> And more work in sum, not only for the RMs but also the reviewers (and
> the users who want to use the source release with Maven, like anybody
> these days).
>
> But I don't mind; if a project wants to do a source-only release, that's
> fine with me, I just don't see the advantage.

How many end users just want a source release anyway?

I would expect most users to use the Maven jars, some will use the ASF
binaries, and a few will use the ASF source (AIUI Linux distros often
build from source).

Even if only the source is released, it's still necessary for the RM
and reviewers to build and test it.
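
One of the review checks mentioned in this thread (archives containing files that should not be there, or missing the N & L files) can be sketched roughly as follows. This is only an illustration, not part of the Commons release tooling: the function name `audit_archive`, the zip format, and the `LICENSE.txt`/`NOTICE.txt` entry names are assumptions, and the expected file list is supposed to come from the tagged source tree.

```python
import zipfile


def audit_archive(archive_path, expected_names,
                  required=("LICENSE.txt", "NOTICE.txt")):
    """Compare the entries of a zip archive with an expected file list.

    expected_names: file names the reviewer expects (e.g. taken from the
    tagged tree); required: entries that must be present in any case.
    Both are illustrative assumptions, not an official ASF checklist.
    """
    with zipfile.ZipFile(archive_path) as archive:
        # Ignore pure directory entries, keep only files.
        names = {n for n in archive.namelist() if not n.endswith("/")}
    expected = set(expected_names)
    return {
        "unexpected": sorted(names - expected),      # spurious files
        "missing": sorted(expected - names),         # omitted files
        "missing_required": sorted(r for r in required if r not in names),
    }
```

A reviewer would still have to build and test the archive; a listing like this only flags surprises in its contents.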

> Regards
> Bernd
>
>>
>> Of course, each release means some work has to be done; then IIUC your
>> point, the fewer releases the better. :-}
>>
>>
>
>> > On Tue, 30 Dec 2014 02:05:29 +0100,
>> > Gilles <gi...@harfang.homelinux.org> wrote:
>> >
>> >> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
>> >> > On 29 December 2014 at 10:36, Gilles
>> >> <gi...@harfang.homelinux.org>
>> >> > wrote:
>> >> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>> >> >>>
>> >> >>> On 12/28/14 11:46 AM, Gilles wrote:
>> >> >>>>
>> >> >>>> Hi.
>> >> >>>>
>> >> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>> >> >>>>>
>> >> >>>>> On 28/12/2014 00:22, sebb wrote:
>> >> >>>>>>
>> >> >>>>>> On 27 December 2014 at 22:19, Gilles
>> >> >>>>>> <gi...@harfang.homelinux.org> wrote:
>> >> >>>>>>>
>> >> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> On 24 December 2014 at 15:11, Gilles
>> >> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe
>> >> >>>>>>>>>>> wrote:
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>> >> >>>>>>>>>>>>>> from release candidate 3.
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>> Tag name:
>> >> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>> >> >>>>>>>>>>>>>>   'git tag -v')
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>> Tag URL:
>> >> >>>>>>>>>>>>>>   <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>> >> >>>>>>>>>>>>>>
>> >> >>>>>>>>>>>>>
>> >> >>>>>>>>>>>>> Is there a way to check that the source code referred to
>> >> >>>>>>>>>>>>> above was the one used to create the JAR of the ".class"
>> >> >>>>>>>>>>>>> files? [Out of curiosity, not suspicion, of course...]
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>>
>> >> >>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>> >> >>>>>>>>>>>> file embedded in the jar. The second-to-last entry is
>> >> >>>>>>>>>>>> called Implementation-Build. It is automatically created
>> >> >>>>>>>>>>>> by maven-jgit-buildnumber-plugin and contains the SHA1
>> >> >>>>>>>>>>>> identifier of the last commit used for the build. Here, it
>> >> >>>>>>>>>>>> is befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can
>> >> >>>>>>>>>>>> check it really corresponds to the expected status of the
>> >> >>>>>>>>>>>> git repository.
>> >> >>>>>>>>>>>>
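
The manifest check described above could be scripted along these lines. It is a minimal sketch, assuming the jar carries a standard META-INF/MANIFEST.MF and that the Implementation-Build value contains the commit SHA1 somewhere in its text; the helper names `read_manifest_attribute` and `built_from` are hypothetical.

```python
import zipfile


def read_manifest_attribute(jar_path, name):
    """Return a main-section attribute of the jar manifest, or None."""
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    # Manifest lines are folded: a line starting with one space
    # continues the previous line.
    unfolded = []
    for line in text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)
    for line in unfolded:
        if not line:
            break  # a blank line ends the main section
        key, sep, value = line.partition(": ")
        # Attribute names are case-insensitive.
        if sep and key.lower() == name.lower():
            return value
    return None


def built_from(jar_path, expected_commit):
    """True if Implementation-Build mentions the expected commit id."""
    value = read_manifest_attribute(jar_path, "Implementation-Build")
    return value is not None and expected_commit in value
```

For the release candidate discussed here, `expected_commit` would be befd8ebd96b8ef5a06b59dccb22bd55064e31c34. As the replies below point out, a matching entry only shows what the manifest claims, not that the classes were compiled from that commit.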
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>> >> >>>>>>>>>>> the MANIFEST file be modified to be the checksum of the
>> >> >>>>>>>>>>> repository but with the .class files being substituted
>> >> >>>>>>>>>>> with those coming from another compilation?
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Modifying anything in the jar (either this entry within
>> >> the
>> >> >>>>>>>>>> manifest or
>> >> >>>>>>>>>> any class) will modify the jar signature. So as long as
>> >> >>>>>>>>>> people do check
>> >> >>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with
>> >> >>>>>>>>>> our build, they
>> >> >>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>> >> >>>>>>>>>>
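
Checking a published digest, as suggested above, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the digest file is assumed to hold either the bare hex string or the hex string followed by a file name. A digest detects a modified file; tying the artifact to its signer additionally needs the gpg signature check, which is not shown here.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published, algorithm="sha1"):
    """Compare a file with the text of a published digest file."""
    fields = published.split()
    if not fields:
        return False  # empty digest file: nothing to compare against
    return file_digest(path, algorithm) == fields[0].lower()
```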
>> >> >>>>>>>>>> This is not different from how releases are done with
>> >> >>>>>>>>>> subversion as the
>> >> >>>>>>>>>> source code control system, or even in C or C++ as the
>> >> >>>>>>>>>> language. At one
>> >> >>>>>>>>>> time, the release manager does perform a compilation and
>> >> >>>>>>>>>> the fellow
>> >> >>>>>>>>>> reviewers check the result. There is no foolproof
>> >> >>>>>>>>>> process here, as
>> >> >>>>>>>>>> always when security is involved. Even using an
>> >> >>>>>>>>>> automated build and
>> >> >>>>>>>>>> automatic signing on an Apache server would involve
>> >> >>>>>>>>>> trust (i.e. one
>> >> >>>>>>>>>> should assume that the server has not been tampered
>> >> >>>>>>>>>> with, that the build
>> >> >>>>>>>>>> process really does what it is expected to do, that the
>> >> >>>>>>>>>> artifacts put to
>> >> >>>>>>>>>> review are really the one created by the automatic
>> >> process
>> >> >>>>>>>>>> ...).
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Another point is that what we officially release is the
>> >> >>>>>>>>>> source, which
>> >> >>>>>>>>>> can be reviewed by external users. The binary parts are
>> >> >>>>>>>>>> merely a
>> >> >>>>>>>>>> convenience.
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> That's an interesting point to come back to since it
>> >> >>>>>>>>> looks like the
>> >> >>>>>>>>> most time-consuming part of a release is not related to
>> >> the
>> >> >>>>>>>>> sources!
>> >> >>>>>>>>>
>> >> >>>>>>>>> Isn't it conceivable that a release could just be a
>> >> >>>>>>>>> commit identifier
>> >> >>>>>>>>> and a checksum of the repository?
>> >> >>>>>>>>>
>> >> >>>>>>>>> If the binaries are just a convenience, why put so much
>> >> >>>>>>>>> effort in it?
>> >> >>>>>>>>> As a convenience, the artefacts could be produced after
>> >> the
>> >> >>>>>>>>> release,
>> >> >>>>>>>>> accompanied with all the "caveat" notes which you
>> >> mentioned.
>> >> >>>>>>>>>
>> >> >>>>>>>>> That would certainly increase the release rate.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Binary releases still need to be reviewed to ensure that
>> >> the
>> >> >>>>>>>> correct N
>> >> >>>>>>>> & L files are present, and that the archives don't contain
>> >> >>>>>>>> material
>> >> >>>>>>>> with disallowed licenses.
>> >> >>>>>>>>
>> >> >>>>>>>> It's not unknown for automated build processes to include
>> >> >>>>>>>> files that
>> >> >>>>>>>> should not be present.
>> >> >>>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> I fail to see the difference of principle between the
>> >> >>>>>>> "release" context
>> >> >>>>>>> and, say, the daily snapshot context.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Snapshots are not (and should not be) promoted to the general
>> >> >>>>>> public as
>> >> >>>>>> releases of the ASF.
>> >> >>>>>>
>> >> >>>>>>> What I mean is that there seems to be a contradiction
>> >> >>>>>>> between saying that
>> >> >>>>>>> a "release" is only about _source_ and the obligation to
>> >> check
>> >> >>>>>>> _binaries_.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> There is no contradiction here.
>> >> >>>>>> The ASF releases source; it is required in a release.
>> >> >>>>>> Binaries are optional.
>> >> >>>>>> That does not mean that the ASF mirror system can be used to
>> >> >>>>>> distribute arbitrary binaries.
>> >> >>>>>>
>> >> >>>>>>> It can occur that disallowed material is, at some point in
>> >> >>>>>>> time, part of
>> >> >>>>>>> the repository and/or the snapshot binaries.
>> >> >>>>>>> However, what is forbidden is... forbidden, at all times.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> As with most things, this is not a strict dichotomy.
>> >> >>>>>>
>> >> >>>>>>> If it is indeed a problem to distribute forbidden material,
>> >> >>>>>>> shouldn't
>> >> >>>>>>> this be corrected in the repository? [That's indeed what
>> >> >>>>>>> you did with
>> >> >>>>>>> the blocking of the release.]
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> If the repo is discovered to contain disallowed material, it
>> >> >>>>>> needs to
>> >> >>>>>> be removed.
>> >> >>>>>>
>> >> >>>>>>> Then again, once the repository is "clean", it can be
>> >> >>>>>>> tagged and that
>> >> >>>>>>> tagged _source_ is the release.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Not quite.
>> >> >>>>>>
>> >> >>>>>> A release is a source archive that is voted on and
>> >> distributed
>> >> >>>>>> via the
>> >> >>>>>> ASF mirror system.
>> >> >>>>>> The contents must agree with the source tag, but the source
>> >> tag
>> >> >>>>>> is not
>> >> >>>>>> the release.
>> >> >>>>>>
>> >> >>>>>>> Non-compliant binaries would thus only be the result of a
>> >> >>>>>>> "mistake"
>> >> >>>>>>> (if the build system is flawed, it's another problem,
>> >> >>>>>>> unrelated to
>> >> >>>>>>> the released contents, which is _source_) to be corrected
>> >> per
>> >> >>>>>>> se.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Not so. There are other failure modes.
>> >> >>>>>>
>> >> >>>>>> An automated build obviously reduces the chances of
>> >> >>>>>> mistakes, but it
>> >> >>>>>> can still create an archive containing files that should
>> >> >>>>>> not
>> >> be
>> >> >>>>>> there.
>> >> >>>>>> [Or indeed, omits files that should be present]
>> >> >>>>>> For example, the workspace contains spurious files which are
>> >> >>>>>> implicitly included by the assembly instructions.
>> >> >>>>>> Or the build process creates spurious files that are
>> >> >>>>>> incorrectly added
>> >> >>>>>> to the archive.
>> >> >>>>>> Or the build incorrectly includes jars that are supposed to
>> >> be
>> >> >>>>>> provided by the end user
>> >> >>>>>> etc.
>> >> >>>>>>
>> >> >>>>>> I have seen all the above in RC votes.
>> >> >>>>>> There are probably other failure modes.
>> >> >>>>>>
>> >> >>>>>>> My proposition is that it's an independent step: once the
>> >> >>>>>>> build system is adjusted to the expectations, "correct"
>> >> >>>>>>> binaries can be
>> >> >>>>>>> generated from the same tagged release.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> It does not matter when the binary is built.
>> >> >>>>>> If it is distributed by the PMC as a formal release, it must
>> >> >>>>>> not contain any surprises, e.g. it must be licensed under
>> >> >>>>>> the AL.
>> >> >>>>>>
>> >> >>>>>> It is therefore vital that the contents are as expected from
>> >> >>>>>> the build.
>> >> >>>>>>
>> >> >>>>>> Note also that a formal release becomes an act of the PMC by
>> >> >>>>>> the voting process.
>> >> >>>>>> The ASF can then assume responsibility for any legal issues
>> >> >>>>>> that may arise.
>> >> >>>>>> Otherwise it is entirely the personal responsibility of the
>> >> >>>>>> person who
>> >> >>>>>> releases it.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> I think the last two points are really important: binaries
>> >> must
>> >> >>>>> be
>> >> >>>>> checked and the foundation provides a legal protection for
>> >> >>>>> the project
>> >> >>>>> if something weird occurs.
>> >> >>>>>
>> >> >>>>> I also think another point is important: many if not most
>> >> users
>> >> >>>>> do
>> >> >>>>> really expect binaries and not source. From our internal
>> >> Apache
>> >> >>>>> point
>> >> >>>>> of view, these are a by-product. For many others it is the
>> >> >>>>> important
>> >> >>>>> thing. It is mostly true in maven land as dependencies are
>> >> >>>>> automatically retrieved in binary form, not source form. So
>> >> the
>> >> >>>>> maven
>> >> >>>>> central repository as a distribution system is important.
>> >> >>>>>
>> >> >>>>> Even if for some security reason it sounds at first thought
>> >> >>>>> logical to
>> >> >>>>> rely on source only and compile oneself, in an industrial
>> >> >>>>> context project teams do not have enough time to do it for
>> >> >>>>> all their dependencies, so they use binaries provided by
>> >> >>>>> trusted third parties. A
>> >> >>>>> long time ago, I compiled a lot of free software tools for
>> >> >>>>> the department I worked for at that time. I do not do this
>> >> anymore,
>> >> >>>>> and
>> >> >>>>> trust the binaries provided by the packaging team for a
>> >> >>>>> distribution
>> >> >>>>> (typically Debian). They do rely on source and compile
>> >> >>>>> themselves. Hey,
>> >> >>>>> I even think Emmanuel here belongs to the Debian java
>> >> >>>>> team ;-)
>> >> I
>> >> >>>>> guess
>> >> >>>>> such teams that do rely on source are rather the exception
>> >> than
>> >> >>>>> the
>> >> >>>>> rule. The other examples I can think of are packaging teams,
>> >> >>>>> development teams that need bleeding edge (and will also
>> >> >>>>> directly depend on the repository, not even the release),
>> >> >>>>> projects that need to
>> >> >>>>> introduce their own patches and people who have critical
>> >> >>>>> needs (for
>> >> >>>>> example when safety of people is concerned or when they need
>> >> >>>>> full control for legal or contractual reasons). Many other
>> >> >>>>> people download
>> >> >>>>> binaries directly and would simply not consider using a
>> >> project
>> >> >>>>> if it
>> >> >>>>> is not readily available: they don't have time for this and
>> >> >>>>> don't want
>> >> >>>>> to learn how to build tens or hundreds of different projects
>> >> they
>> >> >>>>> simply
>> >> >>>>> use.
>> >> >>>>>
>> >> >>>>
>> >> >>>> I do not disagree with anything said on this thread. [In
>> >> >>>> particular, I
>> >> >>>> did not at all imply that any one committer could take
>> >> >>>> responsibility
>> >> >>>> for releasing unchecked items.]
>> >> >>>>
>> >> >>>> I'm simply suggesting that what is called the release
>> >> >>>> process/management
>> >> >>>> could be made simpler (and _consequently_ could lead to more
>> >> >>>> regularly
>> >> >>>> releasing the CM code), by separating the concerns.
>> >> >>>> The concerns are
>> >> >>>>  1. "code" (the contents), and
>> >> >>>>  2. "artefacts" (the result of the build system acting on the
>> >> >>>> "code").
>> >> >>>>
>> >> >>>> Checking of one of these is largely independent from checking
>> >> the
>> >> >>>> other.
>> >> >>>
>> >> >>>
>> >> >>> Unfortunately, not really.  One principle that we have (maybe
>> >> not
>> >> >>> crystal clear in the release doco) is that when we do
>> >> >>> distribute binaries, they should really be "convenience
>> >> >>> binaries" which
>> >> means
>> >> >>> that everything needed to create them is in the source or its
>> >> >>> documented dependencies.  What that means is that what we tag
>> >> >>> as the
>> >> >>> source release needs to be able to generate any binaries that
>> >> >>> we subsequently release.  The only way to really test that is
>> >> >>> to generate the binaries and inspect them as part of verifying
>> >> >>> the release.
>> >> >>
>> >> >>
>> >> >> Only way?  That's certainly not obvious to me: Since a
>> >> >> tag/branch uniquely identifies a set of files, that is, the
>> >> >> "source release [that
>> >> >> is] able to generate any binaries that we subsequently release",
>> >> >> if a
>> >> >> RM can do it at (source) release time, he (or someone else!) can
>> >> >> do it
>> >> >> later, too (by running the build from a clone of the repository
>> >> in
>> >> >> its
>> >> >> tagged state).
>> >> >>
>> >> >>> As others have pointed out, anything we release has to be
>> >> verified
>> >> >>> and voted on.  As RM and reviewer, I think it is actually
>> >> >>> easier to roll and verify source and binaries together.
>> >> >>
>> >> >
>> >> > +1
>> >> >
>> >> >>
>> >> >> It's precisely my main point.
>> >> >> I won't dispute that you can prefer doing both (and nobody would
>> >> >> forbid
>> >> >> a RM to do just that) but the point is about the possibility to
>> >> >> release
>> >> >> source-only code (as the first step of a two-step procedure
>> >> >> which
>> >> I
>> >> >> described earlier).
>> >> >> [IMHO, the two-step one seems easier (both for the RM and the
>> >> >> reviewer),
>> >> >> (mileage does vary).]
>> >> >
>> >> > What is easier?
>> >> > It seems to me there will be at least one other step in your
>> >> > proposed process, i.e. a second VOTE e-mail
>> >>
>> >> Yes, that's obviously what I meant:
>> >> Two steps == two votes
>> >>
>> >> [But: source releases need not necessarily be accompanied with
>> >> "binaries", which, I imagine, could lead to official releases
>> >> occurring more often (due to the reduced number of checks).]
>> >>
>> >> > These will both contain most of the same information.
>> >>
>> >> No.
>> >> The first step is about the source, i.e. the code which humans
>> >> create.
>> >> The second step is about the files which a build system creates.
>> >>
>> >> As I indicated previously, the first vote will be about a set of
>> >> reviewers being satisfied with the state of the source code, while
>> >> the second vote will be about another set of reviewers being
>> >> satisfied
>> >> with the results of the build system ("no glitch", as you described
>> >> in an earlier message).
>> >>
>> >> > Is the intention to announce the source release separately from
>> >> the
>> >> > binary release?
>> >> > If so, there will need to be 2 announce mails, and 2 updates to
>> >> the
>> >> > download page.
>> >>
>> >> Is there a problem with that?
>> >> There are actually several possible cases (depending on the will of
>> >> the RM):
>> >>   * one-step release (only source code)
>> >>   * two-steps (source, then binaries based on that source)
>> >>   * combined (as is done up to now)
>> >>   * binaries (based on any previously released source)
>> >>
>> >> >> In short is it forbidden (by the official/legal rules of ASF) to
>> >> >> proceed
>> >> >> as I propose?
>> >> >
>> >> > Dunno, depends on what exactly you are proposing.
>> >>
>> >> Cf. above (and previous mails).
>> >>
>> >> In practice the release could (IIUC) be like the link provided
>> >> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
>> >> repository).
>> >>
>> >>
>> >> >> Is it technically impossible?
>> >> >
>> >> > Currently the Maven build process creates:
>> >> > - Maven source and binary jars
>> >> > - ASF source and binary bundles
>> >>
>> >> AFAIU, the JARs (source and binary) are "binaries", the binary
>> >> bundles are "binaries". Only the ASF source is "source".
>> >>
>> >> > It's not clear to me what exactly you propose to release in stage
>> >> > one,
>> >>
>> >> The ASF source (e.g. in the form of a tarball, or the appropriate
>> >> "git clone" command).
>> >>
>> >> > but there will need to be some changes to the process in order to
>> >> > release just the ASF source.
>> >>
>> >> I don't see which.
>> >> A "source RM" would just stop the process after
>> >> resolving/postponing the pending issues, and checking the various
>> >> reports about the source
>> >> code. [Then create the tag, and request a vote.]
>> >>
>> >> A "binary RM" would take on from that point (a tagged repository),
>> >> i.e. create all the binaries, sign them, etc.
>> >>
>> >> > There is no point releasing the Maven source jars separately from
>> >> > the binary jars; they are not complete as they only contain java
>> >> > files for
>> >> > use with IDEs.
>> >>
>> >> I don't understand that.
>> >> In principle, a JAR with the Java sources is indeed the necessary
>> >> and
>> >> sufficient condition for users to create the executable bytecode,
>> >> with
>> >> whatever build system they wish.
>> >> But I agree that it's not useful to not release all the files
>> >> needed to easily run maven. [And, for convenience, a source
>> >> release would be
>> >> accompanied with instructions on how to build a JAR of the compiled
>> >> classes, using maven.]
>> >>
>> >> > But in any case, AFAIK it is very tricky to release new files
>> >> > into an existing Maven folder, and it may cause problems for end
>> >> > users.
>> >>
>> >> I don't understand what you mean by "release new files into an
>> >> existing Maven folder"...
>> >>
>> >> Gilles
>> >>
>> >> >>
>> >> >>
>> >> >>> Phil
>> >> >>>
>> >> >>>
>> >> >>>> [The more so that, as you said, no fool-proof link between the
>> >> >>>> two can
>> >> >>>> be ensured: From a security POV, checking the former requires
>> >> >>>> a code
>> >> >>>> review, while using the latter requires trust in the build
>> >> >>>> system.]
>> >> >>>>
>> >> >>>> Thus we could release the "code", after checking and voting on
>> >> >>>> the concerned elements (i.e. the repository state
>> >> >>>> corresponding to a specific tag + the web site).
>> >> >>>>
>> >> >>>> Then we could release the "binaries", as a convenience, after
>> >> >>>> checking
>> >> >>>> and voting on the concerned elements (i.e. the files about to
>> >> be
>> >> >>>> distributed).
>> >> >>>>
>> >> >>>> I think that it's an added flexibility that would, for example,
>> >> >>>> allow the tagging of the repository without necessarily
>> >> >>>> releasing binaries (i.e. not involving that part of the work);
>> >> >>>> and to release binaries (say, at regular intervals) based on the
>> >> >>>> latest tagged code (i.e. not involving the work about
>> >> >>>> solving/evaluating/postponing issues).
>> >> >>>>
>> >> >>>> [I completely admit that, at first, it might look a little
>> >> >>>> more confusing for the plain user, but (IIUC) it would be a
>> >> >>>> better representation of the reality covered by stating that
>> >> >>>> the ASF releases source code.]


Re: [Math] What's in a release

Posted by Bernd Eckenfels <ec...@zusammenkunft.net>.
Hello,

On Tue, 30 Dec 2014 02:29:38 +0100,
Gilles <gi...@harfang.homelinux.org> wrote:

> On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
> > That thread gets deep. :)
> >
> > I just wanted to comment on "releasing only
> > source is faster because of less checks". I disagree with that, most
> > release delay/time is due to preparation work. Failed (binary)
> > checks are typically for a reason which would also be present in
> > the source (especially the POM), so it does not really reduce the
> > number of rework.
> 
> RM is a streamlined procedure: so, if you do (say) 10 steps rather
> than 15, it will objectively take less time, and this is compounded
> by the additional tests which should (ideally) be performed by the
> reviewers. [Thus delaying the release.]

The problem is not the small additional time for the last 5 steps but
the large time for redoing all steps (on veto).


> > (At least not in most cases, so two votes will actually make us
> > more work not less).
> 
> The additional work exactly amounts to sending _one_ additional mail.

The actual work is not the vote mail but the people doing the
preparation and the review.

> 
> Then, as I noted,
>   * some releases will be done as before (same work)
>   * some releases will be "source only" (less work)

Not much: you still have to check that the source actually works and
can be built, that it produces sane archives, and so on.
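
A minimal sketch of what such a "sane archive" check could look like,
assuming a gzipped source tarball with a single top-level directory.
The required file names and the forbidden suffixes below are
illustrative assumptions, not the project's actual release checklist,
and the helper names are hypothetical.

```python
import io
import tarfile

# Hypothetical policy, for illustration only.
REQUIRED = {"LICENSE.txt", "NOTICE.txt", "pom.xml"}
FORBIDDEN_SUFFIXES = (".class", ".jar")


def check_source_archive(fileobj):
    """Return (missing_required, unexpected_files) for a .tar.gz source archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    # Strip the single top-level directory (e.g. "some-project-src/").
    relative = [n.split("/", 1)[1] if "/" in n else n for n in names]
    missing = sorted(REQUIRED - set(relative))
    unexpected = sorted(n for n in relative if n.endswith(FORBIDDEN_SUFFIXES))
    return missing, unexpected


def build_demo_archive(entries):
    """Build an in-memory .tar.gz of empty files so the sketch runs stand-alone."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in entries:
            info = tarfile.TarInfo(name="demo-src/" + name)
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))
    buf.seek(0)
    return buf
```

Calling check_source_archive() on an open release tarball returns two
lists; both being empty only means that this particular (assumed)
policy found nothing, it does not replace building the source and
reviewing the result.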

>   * some releases will be two-steps, possibly performed by two
> different people (i.e. less work for each RM)

And more work in total, not only for the RMs but also for the reviewers
(and for the users who want to use the source release with Maven, like
anybody these days).

But I don't mind: if a project wants to do a source-only release, that's
fine with me. I just don't see the advantage.

Regards
Bernd

> 
> Of course, each release means some work has to be done; then IIUC your
> point, the fewer releases the better. :-}
> 
> 



Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
> That thread gets deep. :)
>
> I just wanted to comment on "releasing only
> source is faster because of less checks". I disagree with that, most
> release delay/time is due to preparation work. Failed (binary) checks
> are typically for a reason which would also be present in the source
> (especially the POM), so it does not really reduce the number of
> rework.

RM is a streamlined procedure: so, if you do (say) 10 steps rather
than 15, it will objectively take less time, and this is compounded
by the additional tests which should (ideally) be performed by the
reviewers. [Thus delaying the release.]

> (At least not in most cases, so two votes will actually make us
> more work not less).

The additional work exactly amounts to sending _one_ additional mail.

Then, as I noted,
  * some releases will be done as before (same work)
  * some releases will be "source only" (less work)
  * some releases will be two-steps, possibly performed by two different
    people (i.e. less work for each RM)

Of course, each release means some work has to be done; then IIUC your
point, the fewer releases the better. :-}


Regards,
Gilles

>
> Regards
> Bernd
>
> On Tue, 30 Dec 2014 02:05:29 +0100, Gilles
> <gi...@harfang.homelinux.org> wrote:
>
>> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
>> > On 29 December 2014 at 10:36, Gilles <gi...@harfang.homelinux.org> wrote:
>> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>> >>>
>> >>> On 12/28/14 11:46 AM, Gilles wrote:
>> >>>>
>> >>>> Hi.
>> >>>>
>> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>> >>>>>
>> >>>>> Le 28/12/2014 00:22, sebb a écrit :
>> >>>>>>
>> >>>>>> On 27 December 2014 at 22:19, Gilles
>> >>>>>> <gi...@harfang.homelinux.org> wrote:
>> >>>>>>>
>> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 24 December 2014 at 15:11, Gilles
>> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>> >>>>>>>>>>>>>> from release candidate 3.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Tag name:
>> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>> >>>>>>>>>>>>>>   'git tag -v')
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Tag URL:
>> >>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Is there a way to check that the source code referred 
>> to
>> >>>>>>>>>>>>> above
>> >>>>>>>>>>>>> was the one used to create the JAR of the ".class"
>> >>>>>>>>>>>>> files. [Out of curiosity, not suspicion, of course...]
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Yes, you can look at the end of the 
>> META-INF/MANIFEST.MS
>> >>>>>>>>>>>> file embedded
>> >>>>>>>>>>>> in the jar. The second-to-last entry is called
>> >>>>>>>>>>>> Implementation-Build.
>> >>>>>>>>>>>> It
>> >>>>>>>>>>>> is automatically created by 
>> maven-jgit-buildnumber-plugin
>> >>>>>>>>>>>> and contains
>> >>>>>>>>>>>> the SHA1 identifier of the last commit used for the
>> >>>>>>>>>>>> build. Here, is is
>> >>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can 
>> check
>> >>>>>>>>>>>> it really
>> >>>>>>>>>>>> corresponds to the expected status of the git 
>> repository.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> Can this be considered "secure", i.e. can't this entry 
>> in
>> >>>>>>>>>>> the MANIFEST
>> >>>>>>>>>>> file be modified to be the checksum of the repository 
>> but
>> >>>>>>>>>>> with the
>> >>>>>>>>>>> .class
>> >>>>>>>>>>> files being substitued with those coming from another
>> >>>>>>>>>>> compilation?
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Modifying anything in the jar (either this entry within 
>> the
>> >>>>>>>>>> manifest or
>> >>>>>>>>>> any class) will modify the jar signature. So as long as
>> >>>>>>>>>> people do check
>> >>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>> >>>>>>>>>> build, they
>> >>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>> >>>>>>>>>>
>> >>>>>>>>>> This is not different from how releases are done with
>> >>>>>>>>>> subversion as the
>> >>>>>>>>>> source code control system, or even in C or C++ as the
>> >>>>>>>>>> language. At one
>> >>>>>>>>>> time, the release manager does perform a compilation and
>> >>>>>>>>>> the fellow
>> >>>>>>>>>> reviewers check the result. There is no fullproof process
>> >>>>>>>>>> here, as
>> >>>>>>>>>> always when security is involved. Even using an automated
>> >>>>>>>>>> build and
>> >>>>>>>>>> automatic signing on an Apache server would involve trust
>> >>>>>>>>>> (i.e. one
>> >>>>>>>>>> should assume that the server has not been tampered with,
>> >>>>>>>>>> that the build
>> >>>>>>>>>> process really does what it is expected to do, that the
>> >>>>>>>>>> artifacts put to
>> >>>>>>>>>> review are really the one created by the automatic 
>> process
>> >>>>>>>>>> ...).
>> >>>>>>>>>>
>> >>>>>>>>>> Another point is that what we officially release is the
>> >>>>>>>>>> source, which
>> >>>>>>>>>> can be reviewed by external users. The binary parts are
>> >>>>>>>>>> merely a
>> >>>>>>>>>> convenience.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> That's an interesting point to come back to since it looks
>> >>>>>>>>> like the
>> >>>>>>>>> most time-consuming part of a release is not related to 
>> the
>> >>>>>>>>> sources!
>> >>>>>>>>>
>> >>>>>>>>> Isn't it conceivable that a release could just be a commit
>> >>>>>>>>> identifier
>> >>>>>>>>> and a checksum of the repository?
>> >>>>>>>>>
>> >>>>>>>>> If the binaries are a just a convenience, why put so much
>> >>>>>>>>> effort in it?
>> >>>>>>>>> As a convenience, the artefacts could be produced after 
>> the
>> >>>>>>>>> release,
>> >>>>>>>>> accompanied with all the "caveat" notes which you 
>> mentioned.
>> >>>>>>>>>
>> >>>>>>>>> That would certainly increase the release rate.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Binary releases still need to be reviewed to ensure that 
>> the
>> >>>>>>>> correct N
>> >>>>>>>> & L files are present, and that the archives don't contain
>> >>>>>>>> material
>> >>>>>>>> with disallowed licenses.
>> >>>>>>>>
>> >>>>>>>> It's not unknown for automated build processes to include
>> >>>>>>>> files that
>> >>>>>>>> should not be present.
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>> I fail to see the difference of principle between the
>> >>>>>>> "release" context
>> >>>>>>> and, say, the daily snapshot context.
>> >>>>>>
>> >>>>>>
>> >>>>>> Snapshots are not (should not) be promoted to the general
>> >>>>>> public as
>> >>>>>> releases of the ASF.
>> >>>>>>
>> >>>>>>> What I mean is that there seem to be a contradiction between
>> >>>>>>> saying that
>> >>>>>>> a "release" is only about _source_ and the obligation to 
>> check
>> >>>>>>> _binaries_.
>> >>>>>>
>> >>>>>>
>> >>>>>> There is no contradiction here.
>> >>>>>> The ASF releases source, they are required in a release.
>> >>>>>> Binaries are optional.
>> >>>>>> That does not mean that the ASF mirror system can be used to
>> >>>>>> distribute arbitrary binaries.
>> >>>>>>
>> >>>>>>> It can occur that disallowed material is, at some point in
>> >>>>>>> time, part of
>> >>>>>>> the repository and/or the snapshot binaries.
>> >>>>>>> However, what is forbidden is... forbidden, at all times.
>> >>>>>>
>> >>>>>>
>> >>>>>> As with most things, this is not a strict dichotomy.
>> >>>>>>
>> >>>>>>> If it is indeed a problem to distribute forbidden material,
>> >>>>>>> shouldn't
>> >>>>>>> this be corrected in the repository? [That's indeed what you
>> >>>>>>> did with
>> >>>>>>> the blocking of the release.]
>> >>>>>>
>> >>>>>>
>> >>>>>> If the repo is discovered to contain disallowed material, it
>> >>>>>> needs to
>> >>>>>> be removed.
>> >>>>>>
>> >>>>>>> Then again, once the repository is "clean", it can be tagged
>> >>>>>>> and that
>> >>>>>>> tagged _source_ is the release.
>> >>>>>>
>> >>>>>>
>> >>>>>> Not quite.
>> >>>>>>
>> >>>>>> A release is a source archive that is voted on and 
>> distributed
>> >>>>>> via the
>> >>>>>> ASF mirror system.
>> >>>>>> The contents must agree with the source tag, but the source 
>> tag
>> >>>>>> is not
>> >>>>>> the release.
>> >>>>>>
>> >>>>>>> Non-compliant binaries would thus only be the result of a
>> >>>>>>> "mistake"
>> >>>>>>> (if the build system is flawed, it's another problem,
>> >>>>>>> unrelated to
>> >>>>>>> the released contents, which is _source_) to be corrected 
>> per
>> >>>>>>> se.
>> >>>>>>
>> >>>>>>
>> >>>>>> Not so. There are other failure modes.
>> >>>>>>
>> >>>>>> An automated build obviously reduces the chances of mistakes,
>> >>>>>> but it
>> >>>>>> can still create an archive containing files that should not 
>> be
>> >>>>>> there.
>> >>>>>> [Or indeed, omits files that should be present]
>> >>>>>> For example, the workspace contains spurious files which are
>> >>>>>> implicitly included by the assembly instructions.
>> >>>>>> Or the build process creates spurious files that are
>> >>>>>> incorrectly added
>> >>>>>> to the archive.
>> >>>>>> Or the build incorrectly includes jars that are supposed to 
>> be
>> >>>>>> provided by the end user
>> >>>>>> etc.
>> >>>>>>
>> >>>>>> I have seen all the above in RC votes.
>> >>>>>> There are probably other falure modes.
>> >>>>>>
>> >>>>>>> My proposition is that it's an independent step: once the
>> >>>>>>> build system is adjusted to the expectations, "correct"
>> >>>>>>> binaries can be
>> >>>>>>> generated from the same tagged release.
>> >>>>>>
>> >>>>>>
>> >>>>>> It does not matter when the binary is built.
>> >>>>>> If it is distributed by the PMC as a formal release, it must
>> >>>>>> not contain any surprises, e.g. it must be licensed under the
>> >>>>>> AL.
>> >>>>>>
>> >>>>>> It is therefore vital that the contents are as expected from
>> >>>>>> the build.
>> >>>>>>
>> >>>>>> Note also that a formal release becomes an act of the PMC by
>> >>>>>> the voting process.
>> >>>>>> The ASF can then assume responsibility for any legal issues
>> >>>>>> that may arise.
>> >>>>>> Otherwise it is entirely the personal responsibility of the
>> >>>>>> person who
>> >>>>>> releases it.
>> >>>>>
>> >>>>>
>> >>>>> I think the last two points are really important: binaries 
>> must
>> >>>>> be
>> >>>>> checked and the foundation provides a legal protection for the
>> >>>>> project
>> >>>>> if something weird occurs.
>> >>>>>
>> >>>>> I also think another point is important: many if not most 
>> users
>> >>>>> do
>> >>>>> really expect binaries and not source. From our internal 
>> Apache
>> >>>>> point
>> >>>>> of view, these are a by-product,. For many others it is the
>> >>>>> important
>> >>>>> thing. It is mostly true in maven land as dependencies are
>> >>>>> automatically retrieved in binary form, not source form. So 
>> the
>> >>>>> maven
>> >>>>> central repository as a distribution system is important.
>> >>>>>
>> >>>>> Even if for some security reason it sounds at first thought
>> >>>>> logical to
>> >>>>> rely on source only and compile oneself, in an industrial
>> >>>>> context project teams do not have enough time to do it for all
>> >>>>> their dependencies, so they use binaries provided by trusted
>> >>>>> third parties. A
>> >>>>> long time ago, I compiled a lot of free software tools for the
>> >>>>> department I worked for at that time. I do not do this 
>> anymore,
>> >>>>> and
>> >>>>> trust the binaries provided by the packaging team for a
>> >>>>> distribution
>> >>>>> (typically Debian). They do rely on source and compile
>> >>>>> themselves. Hey,
>> >>>>> I even think Emmanuel here belongs to the Debian java team ;-) 
>> I
>> >>>>> guess
>> >>>>> such teams that do rely on source are rather the exception 
>> than
>> >>>>> the
>> >>>>> rule. The other examples I can think of are packaging teams,
>> >>>>> development teams that need bleeding edge (and will also
>> >>>>> directly depend on the repository, not even the release),
>> >>>>> projects that need to
>> >>>>> introduce their own patches and people who have critical needs
>> >>>>> (for
>> >>>>> example when safety of people is concerned or when they need
>> >>>>> full control for legal or contractual reasons). Many other
>> >>>>> people download
>> >>>>> binaries directly and would simply not consider using a 
>> project
>> >>>>> if it
>> >>>>> is not readily available: they don't have time for this and
>> >>>>> don't want
>> >>>>> to learn how to build tens or hundred of different projects 
>> they
>> >>>>> simply
>> >>>>> use.
>> >>>>>
>> >>>>
>> >>>> I do not disagree with anything said on this thread. [In
>> >>>> particular, I
>> >>>> did not at all imply that any one committer could take
>> >>>> responsibility
>> >>>> for releasing unchecked items.]
>> >>>>
>> >>>> I'm simply suggesting that what is called the release
>> >>>> process/management
>> >>>> could be made simpler (and _consequently_ could lead to more
>> >>>> regularly
>> >>>> releasing the CM code), by separating the concerns.
>> >>>> The concerns are
>> >>>>  1. "code" (the contents), and
>> >>>>  2. "artefacts" (the result of the build system acting on the
>> >>>> "code").
>> >>>>
>> >>>> Checking of one of these is largely independent from checking 
>> the
>> >>>> other.
>> >>>
>> >>>
>> >>> Unfortunately, not really.  One principle that we have (maybe 
>> not
>> >>> crystal clear in the release doco) is that when we do distribute
>> >>> binaries, they should really be "convenience binaries" which 
>> means
>> >>> that everything needed to create them is in the source or its
>> >>> documented dependencies.  What that means is that what we tag as
>> >>> the
>> >>> source release needs to be able to generate any binaries that we
>> >>> subsequently release.  The only way to really test that is to
>> >>> generate the binaries and inspect them as part of verifying the
>> >>> release.
>> >>
>> >>
>> >> Only way?  That's certainly not obvious to me: Since a tag/branch
>> >> uniquely identifies a set of files, that is, the "source release
>> >> [that
>> >> is] able to generate any binaries that we subsequently release",
>> >> if a
>> >> RM can do it at (source) release time, he (or someone else!) can
>> >> do it
>> >> later, too (by running the build from a clone of the repository 
>> in
>> >> its
>> >> tagged state).
>> >>
>> >>> As others have pointed out, anything we release has to be 
>> verified
>> >>> and voted on.  As RM and reviewer, I think it is actually easier
>> >>> to roll and verify source and binaries together.
>> >>
>> >
>> > +1
>> >
>> >>
>> >> It's precisely my main point.
>> >> I won't dispute that you can prefer doing both (and nobody would
>> >> forbid
>> >> a RM to do just that) but the point is about the possibility to
>> >> release
>> >> source-only code (as the first step of a two-step procedure which 
>> I
>> >> described earlier).
>> >> [IMHO, the two-step one seems easier (both for the RM and the
>> >> reviewer),
>> >> (mileage does vary).]
>> >
>> > What is easier?
>> > It seems to me there will be at least one other step in your
>> > proposed process, i.e. a second VOTE e-mail
>>
>> Yes, that's obviously what I meant:
>> Two steps == two votes
>>
>> [But: source releases need not necessarily be accompanied with
>> "binaries", which, I imagine, could lead to official releases
>> occurring more often (due to the reduced number of checks).]
>>
>> > These will both contain most of the same information.
>>
>> No.
>> The first step is about the source, i.e. the code which humans 
>> create.
>> The second step is about the files which a build system creates.
>>
>> As I indicated previously, the first vote will be about a set of
>> reviewers being satisfied with the state of the souce code, while
>> the second vote will be about another set of reviewers being 
>> satisfied
>> with the results of the build system ("no glitch", as you described
>> in an earlier message).
>>
>> > Is the intention to announce the source release separately from 
>> the
>> > binary release?
>> > If so, there will need to be 2 announce mails, and 2 updates to 
>> the
>> > download page.
>>
>> Is there a problem with that?
>> There are actually several possible cases (depending on the will of
>> the RM):
>>   * one-step release (only source code)
>>   * two-steps (source, then binaries based on that source)
>>   * combined (as is done up to now)
>>   * binaries (based on any previously released source)
>>
>> >> In short is it forbidden (by the official/legal rules of ASF) to
>> >> proceed
>> >> as I propose?
>> >
>> > Dunno, depends on what exactly you are proposing.
>>
>> Cf. above (and previous mails).
>>
>> In practice the release could (IIUC) be like the link provided
>> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
>> repository).
>>
>>
>> >> It is impossible technically?
>> >
>> > Currently the Maven build process creates:
>> > - Maven source and binary jars
>> > - ASF source and binary bundles
>>
>> AFAIU, the JARs (source and binary) are "binaries", the binary
>> bundles are "binaries". Only the ASF source is "source".
>>
>> > It's not clear to me what exactly you propose to release in stage
>> > one,
>>
>> The ASF source (e.g. in the form of a tarball, or the appropriate
>> "git clone" command).
>>
>> > but there will need to be some changes to the process in order to
>> > release just the ASF source.
>>
>> I don't see which.
>> A "source RM" would just stop the process after resolving/postponing
>> the pending issues, and checking the various reports about the 
>> source
>> code. [Then create the tag, and request a vote.]
>>
>> A "binary RM" would take on from that point (a tagged repository),
>> i.e. create all the binaries, sign them, etc.
>>
>> > There is no point releasing the Maven source jars separately from
>> > the binary jars; they are not complete as they only contain java
>> > files for
>> > use with IDEs.
>>
>> I don't understand that.
>> In principle, a JAR with the Java sources is indeed the necessary 
>> and
>> sufficient condition for users to create the executable bytecode, 
>> with
>> whatever build system they wish.
>> But I agree that it's not useful to not release all the files needed
>> to easily run maven. [And, for convenience, a source release would 
>> be
>> accompanied with instructions on how to build a JAR of the compiled
>> classes, using maven.]
>>
>> > But in any case, AFAIK it is very tricky to release new files into
>> > an existing Maven folder, and it may cause problems for end users.
>>
>> I don't understand what you mean by "release new files into an
>> existing Maven folder"...
>>
>> Gilles
>>
>> >>
>> >>
>> >>> Phil
>> >>>
>> >>>
>> >>>> [The more so that, as you said, no fool-proof link between the
>> >>>> two can
>> >>>> be ensured: From a security POV, checking the former requires a
>> >>>> code
>> >>>> review, while using the latter requires trust in the build
>> >>>> system.]
>> >>>>
>> >>>> Thus we could release the "code", after checking and voting on
>> >>>> the concerned elements (i.e. the repository state corresponding
>> >>>> to a specific tag + the web site).
>> >>>>
>> >>>> Then we could release the "binaries", as a convenience, after
>> >>>> checking
>> >>>> and voting on the concerned elements (i.e. the files about to 
>> be
>> >>>> distributed).
>> >>>>
>> >>>> I think that it's an added flexibility that would, for example,
>> >>>> allow
>> >>>> the tagging of the repository without necessarily release
>> >>>> binaries (i.e.
>> >>>> not involving that part of the work); and to release binaries
>> >>>> (say, at
>> >>>> regular intervals) based on the latest tagged code (i.e. not
>> >>>> involving
>> >>>> the work about solving/evaluating/postponing issues).
>> >>>>
>> >>>> [I completely admit that, at first, it might look a little more
>> >>>> confusing for the plain user, but (IIUC) it would be a better
>> >>>> representation of the reality covered by stating that the ASF
>> >>>> releases source code.]
>> >>>>
>> >>>>
>> >>>> Best regards,
>> >>>> Gilles




Re: [Math] What's in a release

Posted by Bernd Eckenfels <ec...@zusammenkunft.net>.
This thread is getting deep. :)

I just wanted to comment on "releasing only source is faster because of
fewer checks". I disagree with that: most of the release delay/time is
due to preparation work. Failed (binary) checks typically have a cause
that is also present in the source (especially the POM), so a
source-only release does not really reduce the amount of rework. (At
least not in most cases, so two votes will actually make more work for
us, not less.)

Regards
Bernd



On Tue, 30 Dec 2014 02:05:29 +0100, Gilles <gi...@harfang.homelinux.org> wrote:

> On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
> > On 29 December 2014 at 10:36, Gilles <gi...@harfang.homelinux.org> 
> > wrote:
> >> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
> >>>
> >>> On 12/28/14 11:46 AM, Gilles wrote:
> >>>>
> >>>> Hi.
> >>>>
> >>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
> >>>>>
> >>>>> On 28/12/2014 00:22, sebb wrote:
> >>>>>>
> >>>>>> On 27 December 2014 at 22:19, Gilles
> >>>>>> <gi...@harfang.homelinux.org> wrote:
> >>>>>>>
> >>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 24 December 2014 at 15:11, Gilles
> >>>>>>>> <gi...@harfang.homelinux.org> wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
> >>>>>>>>>>>>>> from release
> >>>>>>>>>>>>>> candidate 3.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Tag name:
> >>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
> >>>>>>>>>>>>>> 'git tag
> >>>>>>>>>>>>>> -v')
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Tag URL:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is there a way to check that the source code referred to
> >>>>>>>>>>>>> above
> >>>>>>>>>>>>> was the one used to create the JAR of the ".class"
> >>>>>>>>>>>>> files. [Out of curiosity, not suspicion, of course...]
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
> >>>>>>>>>>>> file embedded
> >>>>>>>>>>>> in the jar. The second-to-last entry is called
> >>>>>>>>>>>> Implementation-Build.
> >>>>>>>>>>>> It
> >>>>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
> >>>>>>>>>>>> and contains
> >>>>>>>>>>>> the SHA1 identifier of the last commit used for the
> >>>>>>>>>>>> build. Here, it is
> >>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
> >>>>>>>>>>>> it really
> >>>>>>>>>>>> corresponds to the expected status of the git repository.
> >>>>>>>>>>>>
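
For illustration, the check described above could be scripted roughly
as follows. This is only a sketch, not part of the Commons Math release
tooling: the class name BuildIdCheck and the command-line arguments are
made up for the example, and it assumes the build identifier is stored
as a main attribute of the jar's manifest under the name
Implementation-Build, as stated in the quoted message.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of any Apache tooling.
public class BuildIdCheck {

    /** Returns the Implementation-Build main attribute of the jar, or null if absent. */
    static String implementationBuild(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            return manifest == null
                ? null
                : manifest.getMainAttributes().getValue("Implementation-Build");
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar, args[1]: commit identifier expected from the tag
        String found = implementationBuild(args[0]);
        System.out.println(args[1].equals(found) ? "match" : "mismatch: " + found);
    }
}
```

Such a check only compares two strings; it says nothing about whether
the ".class" files were really compiled from that commit.
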
> >>>>>>>>>>>
> >>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
> >>>>>>>>>>> the MANIFEST
> >>>>>>>>>>> file be modified to be the checksum of the repository but
> >>>>>>>>>>> with the
> >>>>>>>>>>> .class
> >>>>>>>>>>> files being substituted with those coming from another
> >>>>>>>>>>> compilation?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Modifying anything in the jar (either this entry within the
> >>>>>>>>>> manifest or
> >>>>>>>>>> any class) will modify the jar signature. So as long as
> >>>>>>>>>> people do check
> >>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
> >>>>>>>>>> build, they
> >>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
> >>>>>>>>>>
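
As a rough sketch of the digest part of that check, the following
computes the hexadecimal MD5 or SHA-1 digest of a downloaded file so
that it can be compared with the published value. The class name
DigestCheck is made up for the example. A matching digest only shows
that the file is the one the digest was computed from; verifying the
gpg signature is a separate step that is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any Apache tooling.
public class DigestCheck {

    /** Returns the lower-case hex digest of the file, e.g. for "MD5" or "SHA-1". */
    static String hexDigest(String path, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n); // feed the file to the digest chunk by chunk
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: file to check, args[1]: algorithm, args[2]: published digest
        String actual = hexDigest(args[0], args[1]);
        System.out.println(actual.equalsIgnoreCase(args[2]) ? "digest matches" : "digest differs");
    }
}
```
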
> >>>>>>>>>> This is not different from how releases are done with
> >>>>>>>>>> subversion as the
> >>>>>>>>>> source code control system, or even in C or C++ as the
> >>>>>>>>>> language. At one
> >>>>>>>>>> time, the release manager does perform a compilation and
> >>>>>>>>>> the fellow
> >>>>>>>>>> reviewers check the result. There is no foolproof process
> >>>>>>>>>> here, as
> >>>>>>>>>> always when security is involved. Even using an automated
> >>>>>>>>>> build and
> >>>>>>>>>> automatic signing on an Apache server would involve trust
> >>>>>>>>>> (i.e. one
> >>>>>>>>>> should assume that the server has not been tampered with,
> >>>>>>>>>> that the build
> >>>>>>>>>> process really does what it is expected to do, that the
> >>>>>>>>>> artifacts put to
> >>>>>>>>>> review are really the ones created by the automatic process
> >>>>>>>>>> ...).
> >>>>>>>>>>
> >>>>>>>>>> Another point is that what we officially release is the
> >>>>>>>>>> source, which
> >>>>>>>>>> can be reviewed by external users. The binary parts are
> >>>>>>>>>> merely a
> >>>>>>>>>> convenience.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> That's an interesting point to come back to since it looks
> >>>>>>>>> like the
> >>>>>>>>> most time-consuming part of a release is not related to the
> >>>>>>>>> sources!
> >>>>>>>>>
> >>>>>>>>> Isn't it conceivable that a release could just be a commit
> >>>>>>>>> identifier
> >>>>>>>>> and a checksum of the repository?
> >>>>>>>>>
> >>>>>>>>> If the binaries are just a convenience, why put so much
> >>>>>>>>> effort in it?
> >>>>>>>>> As a convenience, the artefacts could be produced after the
> >>>>>>>>> release,
> >>>>>>>>> accompanied with all the "caveat" notes which you mentioned.
> >>>>>>>>>
> >>>>>>>>> That would certainly increase the release rate.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Binary releases still need to be reviewed to ensure that the
> >>>>>>>> correct N
> >>>>>>>> & L files are present, and that the archives don't contain
> >>>>>>>> material
> >>>>>>>> with disallowed licenses.
> >>>>>>>>
> >>>>>>>> It's not unknown for automated build processes to include
> >>>>>>>> files that
> >>>>>>>> should not be present.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I fail to see the difference of principle between the
> >>>>>>> "release" context
> >>>>>>> and, say, the daily snapshot context.
> >>>>>>
> >>>>>>
> >>>>>> Snapshots are not (and should not be) promoted to the general
> >>>>>> public as
> >>>>>> releases of the ASF.
> >>>>>>
> >>>>>>> What I mean is that there seems to be a contradiction between
> >>>>>>> saying that
> >>>>>>> a "release" is only about _source_ and the obligation to check
> >>>>>>> _binaries_.
> >>>>>>
> >>>>>>
> >>>>>> There is no contradiction here.
> >>>>>> The ASF releases source; it is required in a release.
> >>>>>> Binaries are optional.
> >>>>>> That does not mean that the ASF mirror system can be used to
> >>>>>> distribute arbitrary binaries.
> >>>>>>
> >>>>>>> It can occur that disallowed material is, at some point in
> >>>>>>> time, part of
> >>>>>>> the repository and/or the snapshot binaries.
> >>>>>>> However, what is forbidden is... forbidden, at all times.
> >>>>>>
> >>>>>>
> >>>>>> As with most things, this is not a strict dichotomy.
> >>>>>>
> >>>>>>> If it is indeed a problem to distribute forbidden material,
> >>>>>>> shouldn't
> >>>>>>> this be corrected in the repository? [That's indeed what you
> >>>>>>> did with
> >>>>>>> the blocking of the release.]
> >>>>>>
> >>>>>>
> >>>>>> If the repo is discovered to contain disallowed material, it
> >>>>>> needs to
> >>>>>> be removed.
> >>>>>>
> >>>>>>> Then again, once the repository is "clean", it can be tagged
> >>>>>>> and that
> >>>>>>> tagged _source_ is the release.
> >>>>>>
> >>>>>>
> >>>>>> Not quite.
> >>>>>>
> >>>>>> A release is a source archive that is voted on and distributed
> >>>>>> via the
> >>>>>> ASF mirror system.
> >>>>>> The contents must agree with the source tag, but the source tag
> >>>>>> is not
> >>>>>> the release.
> >>>>>>
> >>>>>>> Non-compliant binaries would thus only be the result of a
> >>>>>>> "mistake"
> >>>>>>> (if the build system is flawed, it's another problem,
> >>>>>>> unrelated to
> >>>>>>> the released contents, which is _source_) to be corrected per 
> >>>>>>> se.
> >>>>>>
> >>>>>>
> >>>>>> Not so. There are other failure modes.
> >>>>>>
> >>>>>> An automated build obviously reduces the chances of mistakes,
> >>>>>> but it
> >>>>>> can still create an archive containing files that should not be
> >>>>>> there.
> >>>>>> [Or indeed, omit files that should be present.]
> >>>>>> For example, the workspace contains spurious files which are
> >>>>>> implicitly included by the assembly instructions.
> >>>>>> Or the build process creates spurious files that are
> >>>>>> incorrectly added
> >>>>>> to the archive.
> >>>>>> Or the build incorrectly includes jars that are supposed to be
> >>>>>> provided by the end user
> >>>>>> etc.
> >>>>>>
> >>>>>> I have seen all the above in RC votes.
> >>>>>> There are probably other failure modes.
> >>>>>>
> >>>>>>> My proposition is that it's an independent step: once the
> >>>>>>> build system is adjusted to the expectations, "correct"
> >>>>>>> binaries can be
> >>>>>>> generated from the same tagged release.
> >>>>>>
> >>>>>>
> >>>>>> It does not matter when the binary is built.
> >>>>>> If it is distributed by the PMC as a formal release, it must
> >>>>>> not contain any surprises, e.g. it must be licensed under the
> >>>>>> AL.
> >>>>>>
> >>>>>> It is therefore vital that the contents are as expected from
> >>>>>> the build.
> >>>>>>
> >>>>>> Note also that a formal release becomes an act of the PMC by
> >>>>>> the voting process.
> >>>>>> The ASF can then assume responsibility for any legal issues
> >>>>>> that may arise.
> >>>>>> Otherwise it is entirely the personal responsibility of the
> >>>>>> person who
> >>>>>> releases it.
> >>>>>
> >>>>>
> >>>>> I think the last two points are really important: binaries must 
> >>>>> be
> >>>>> checked and the foundation provides a legal protection for the
> >>>>> project
> >>>>> if something weird occurs.
> >>>>>
> >>>>> I also think another point is important: many if not most users 
> >>>>> do
> >>>>> really expect binaries and not source. From our internal Apache
> >>>>> point
> >>>>> of view, these are a by-product. For many others it is the
> >>>>> important
> >>>>> thing. It is mostly true in maven land as dependencies are
> >>>>> automatically retrieved in binary form, not source form. So the
> >>>>> maven
> >>>>> central repository as a distribution system is important.
> >>>>>
> >>>>> Even if for some security reason it sounds at first thought
> >>>>> logical to
> >>>>> rely on source only and compile oneself, in an industrial
> >>>>> context project teams do not have enough time to do it for all
> >>>>> their dependencies, so they use binaries provided by trusted
> >>>>> third parties. A
> >>>>> long time ago, I compiled a lot of free software tools for the
> >>>>> department I worked for at that time. I do not do this anymore, 
> >>>>> and
> >>>>> trust the binaries provided by the packaging team for a 
> >>>>> distribution
> >>>>> (typically Debian). They do rely on source and compile
> >>>>> themselves. Hey,
> >>>>> I even think Emmanuel here belongs to the Debian java team ;-) I
> >>>>> guess
> >>>>> such teams that do rely on source are rather the exception than 
> >>>>> the
> >>>>> rule. The other examples I can think of are packaging teams,
> >>>>> development teams that need bleeding edge (and will also
> >>>>> directly depend on the repository, not even the release),
> >>>>> projects that need to
> >>>>> introduce their own patches and people who have critical needs 
> >>>>> (for
> >>>>> example when safety of people is concerned or when they need
> >>>>> full control for legal or contractual reasons). Many other
> >>>>> people download
> >>>>> binaries directly and would simply not consider using a project
> >>>>> if it
> >>>>> is not readily available: they don't have time for this and
> >>>>> don't want
> >>>>> to learn how to build tens or hundreds of different projects they
> >>>>> simply
> >>>>> use.
> >>>>>
> >>>>
> >>>> I do not disagree with anything said on this thread. [In
> >>>> particular, I
> >>>> did not at all imply that any one committer could take 
> >>>> responsibility
> >>>> for releasing unchecked items.]
> >>>>
> >>>> I'm simply suggesting that what is called the release
> >>>> process/management
> >>>> could be made simpler (and _consequently_ could lead to more
> >>>> regularly
> >>>> releasing the CM code), by separating the concerns.
> >>>> The concerns are
> >>>>  1. "code" (the contents), and
> >>>>  2. "artefacts" (the result of the build system acting on the
> >>>> "code").
> >>>>
> >>>> Checking of one of these is largely independent from checking the
> >>>> other.
> >>>
> >>>
> >>> Unfortunately, not really.  One principle that we have (maybe not
> >>> crystal clear in the release doco) is that when we do distribute
> >>> binaries, they should really be "convenience binaries" which means
> >>> that everything needed to create them is in the source or its
> >>> documented dependencies.  What that means is that what we tag as 
> >>> the
> >>> source release needs to be able to generate any binaries that we
> >>> subsequently release.  The only way to really test that is to
> >>> generate the binaries and inspect them as part of verifying the 
> >>> release.
> >>
> >>
> >> Only way?  That's certainly not obvious to me: Since a tag/branch
> >> uniquely identifies a set of files, that is, the "source release 
> >> [that
> >> is] able to generate any binaries that we subsequently release",
> >> if a
> >> RM can do it at (source) release time, he (or someone else!) can
> >> do it
> >> later, too (by running the build from a clone of the repository in 
> >> its
> >> tagged state).
> >>
> >>> As others have pointed out, anything we release has to be verified
> >>> and voted on.  As RM and reviewer, I think it is actually easier
> >>> to roll and verify source and binaries together.
> >>
> >
> > +1
> >
> >>
> >> It's precisely my main point.
> >> I won't dispute that you can prefer doing both (and nobody would 
> >> forbid
> >> a RM to do just that) but the point is about the possibility to 
> >> release
> >> source-only code (as the first step of a two-step procedure which I
> >> described earlier).
> >> [IMHO, the two-step one seems easier (both for the RM and the 
> >> reviewer),
> >> (mileage does vary).]
> >
> > What is easier?
> > It seems to me there will be at least one other step in your
> > proposed process, i.e. a second VOTE e-mail
> 
> Yes, that's obviously what I meant:
> Two steps == two votes
> 
> [But: source releases need not necessarily be accompanied with
> "binaries", which, I imagine, could lead to official releases
> occurring more often (due to the reduced number of checks).]
> 
> > These will both contain most of the same information.
> 
> No.
> The first step is about the source, i.e. the code which humans create.
> The second step is about the files which a build system creates.
> 
> As I indicated previously, the first vote will be about a set of
> reviewers being satisfied with the state of the source code, while
> the second vote will be about another set of reviewers being satisfied
> with the results of the build system ("no glitch", as you described
> in an earlier message).
> 
> > Is the intention to announce the source release separately from the
> > binary release?
> > If so, there will need to be 2 announce mails, and 2 updates to the
> > download page.
> 
> Is there a problem with that?
> There are actually several possible cases (depending on the will of
> the RM):
>   * one-step release (only source code)
>   * two-steps (source, then binaries based on that source)
>   * combined (as is done up to now)
>   * binaries (based on any previously released source)
> 
> >> In short is it forbidden (by the official/legal rules of ASF) to 
> >> proceed
> >> as I propose?
> >
> > Dunno, depends on what exactly you are proposing.
> 
> Cf. above (and previous mails).
> 
> In practice the release could (IIUC) be like the link provided
> by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
> repository).
> 
> 
> >> Is it impossible technically?
> >
> > Currently the Maven build process creates:
> > - Maven source and binary jars
> > - ASF source and binary bundles
> 
> AFAIU, the JARs (source and binary) are "binaries", the binary
> bundles are "binaries". Only the ASF source is "source".
> 
> > It's not clear to me what exactly you propose to release in stage 
> > one,
> 
> The ASF source (e.g. in the form of a tarball, or the appropriate
> "git clone" command).
> 
> > but there will need to be some changes to the process in order to
> > release just the ASF source.
> 
> I don't see which.
> A "source RM" would just stop the process after resolving/postponing
> the pending issues, and checking the various reports about the source
> code. [Then create the tag, and request a vote.]
> 
> A "binary RM" would take on from that point (a tagged repository),
> i.e. create all the binaries, sign them, etc.
> 
> > There is no point releasing the Maven source jars separately from
> > the binary jars; they are not complete as they only contain java
> > files for
> > use with IDEs.
> 
> I don't understand that.
> In principle, a JAR with the Java sources is indeed the necessary and
> sufficient condition for users to create the executable bytecode, with
> whatever build system they wish.
> But I agree that it would not be useful to omit the files needed
> to easily run maven. [And, for convenience, a source release would be
> accompanied with instructions on how to build a JAR of the compiled
> classes, using maven.]
> 
> > But in any case, AFAIK it is very tricky to release new files into
> > an existing Maven folder, and it may cause problems for end users.
> 
> I don't understand what you mean by "release new files into an
> existing Maven folder"...
> 
> Gilles
> 
> >>
> >>
> >>> Phil
> >>>
> >>>
> >>>> [The more so that, as you said, no fool-proof link between the
> >>>> two can
> >>>> be ensured: From a security POV, checking the former requires a 
> >>>> code
> >>>> review, while using the latter requires trust in the build 
> >>>> system.]
> >>>>
> >>>> Thus we could release the "code", after checking and voting on
> >>>> the concerned elements (i.e. the repository state corresponding
> >>>> to a specific tag + the web site).
> >>>>
> >>>> Then we could release the "binaries", as a convenience, after
> >>>> checking
> >>>> and voting on the concerned elements (i.e. the files about to be
> >>>> distributed).
> >>>>
> >>>> I think that it's an added flexibility that would, for example, 
> >>>> allow
> >>>> the tagging of the repository without necessarily releasing
> >>>> binaries (i.e.
> >>>> not involving that part of the work); and to release binaries
> >>>> (say, at
> >>>> regular intervals) based on the latest tagged code (i.e. not
> >>>> involving
> >>>> the work about solving/evaluating/postponing issues).
> >>>>
> >>>> [I completely admit that, at first, it might look a little more
> >>>> confusing for the plain user, but (IIUC) it would be a better
> >>>> representation of the reality covered by stating that the ASF
> >>>> releases source code.]
> >>>>
> >>>>
> >>>> Best regards,
> >>>> Gilles
> 
> 


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
> On 29 December 2014 at 10:36, Gilles <gi...@harfang.homelinux.org> 
> wrote:
>> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>>>
>>> On 12/28/14 11:46 AM, Gilles wrote:
>>>>
>>>> Hi.
>>>>
>>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>>>>
>>>>> Le 28/12/2014 00:22, sebb a écrit :
>>>>>>
>>>>>> On 27 December 2014 at 22:19, Gilles
>>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>>
>>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 24 December 2014 at 15:11, Gilles
>>>>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>>>>>>>>>>>>> from release
>>>>>>>>>>>>>> candidate 3.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tag name:
>>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>>>>> 'git tag
>>>>>>>>>>>>>> -v')
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tag URL:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a way to check that the source code referred to
>>>>>>>>>>>>> above
>>>>>>>>>>>>> was the one used to create the JAR of the ".class" files.
>>>>>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>>>>>>>>>>>> file embedded
>>>>>>>>>>>> in the jar. The second-to-last entry is called
>>>>>>>>>>>> Implementation-Build.
>>>>>>>>>>>> It
>>>>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
>>>>>>>>>>>> and contains
>>>>>>>>>>>> the SHA1 identifier of the last commit used for the build.
>>>>>>>>>>>> Here, it is
>>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>>>>> it really
>>>>>>>>>>>> corresponds to the expected status of the git repository.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>>>>>>>>>> the MANIFEST
>>>>>>>>>>> file be modified to be the checksum of the repository but
>>>>>>>>>>> with the
>>>>>>>>>>> .class
>>>>>>>>>>> files being substituted with those coming from another
>>>>>>>>>>> compilation?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Modifying anything in the jar (either this entry within the
>>>>>>>>>> manifest or
>>>>>>>>>> any class) will modify the jar signature. So as long as
>>>>>>>>>> people do check
>>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>>>>>>>>> build, they
>>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>>>>
>>>>>>>>>> This is not different from how releases are done with
>>>>>>>>>> subversion as the
>>>>>>>>>> source code control system, or even in C or C++ as the
>>>>>>>>>> language. At one
>>>>>>>>>> time, the release manager does perform a compilation and the
>>>>>>>>>> fellow
>>>>>>>>>> reviewers check the result. There is no foolproof process
>>>>>>>>>> here, as
>>>>>>>>>> always when security is involved. Even using an automated
>>>>>>>>>> build and
>>>>>>>>>> automatic signing on an Apache server would involve trust
>>>>>>>>>> (i.e. one
>>>>>>>>>> should assume that the server has not been tampered with,
>>>>>>>>>> that the build
>>>>>>>>>> process really does what it is expected to do, that the
>>>>>>>>>> artifacts put to
>>>>>>>>>> review are really the ones created by the automatic process
>>>>>>>>>> ...).
>>>>>>>>>>
>>>>>>>>>> Another point is that what we officially release is the
>>>>>>>>>> source, which
>>>>>>>>>> can be reviewed by external users. The binary parts are
>>>>>>>>>> merely a
>>>>>>>>>> convenience.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That's an interesting point to come back to since it looks
>>>>>>>>> like the
>>>>>>>>> most time-consuming part of a release is not related to the
>>>>>>>>> sources!
>>>>>>>>>
>>>>>>>>> Isn't it conceivable that a release could just be a commit
>>>>>>>>> identifier
>>>>>>>>> and a checksum of the repository?
>>>>>>>>>
>>>>>>>>> If the binaries are just a convenience, why put so much
>>>>>>>>> effort in it?
>>>>>>>>> As a convenience, the artefacts could be produced after the
>>>>>>>>> release,
>>>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>>>>
>>>>>>>>> That would certainly increase the release rate.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Binary releases still need to be reviewed to ensure that the
>>>>>>>> correct N
>>>>>>>> & L files are present, and that the archives don't contain
>>>>>>>> material
>>>>>>>> with disallowed licenses.
>>>>>>>>
>>>>>>>> It's not unknown for automated build processes to include
>>>>>>>> files that
>>>>>>>> should not be present.
>>>>>>>>
>>>>>>>
>>>>>>> I fail to see the difference of principle between the "release"
>>>>>>> context
>>>>>>> and, say, the daily snapshot context.
>>>>>>
>>>>>>
>>>>>> Snapshots are not (and should not be) promoted to the general public
>>>>>> as
>>>>>> releases of the ASF.
>>>>>>
>>>>>>> What I mean is that there seems to be a contradiction between
>>>>>>> saying that
>>>>>>> a "release" is only about _source_ and the obligation to check
>>>>>>> _binaries_.
>>>>>>
>>>>>>
>>>>>> There is no contradiction here.
>>>>>> The ASF releases source; it is required in a release.
>>>>>> Binaries are optional.
>>>>>> That does not mean that the ASF mirror system can be used to
>>>>>> distribute arbitrary binaries.
>>>>>>
>>>>>>> It can occur that disallowed material is, at some point in
>>>>>>> time, part of
>>>>>>> the repository and/or the snapshot binaries.
>>>>>>> However, what is forbidden is... forbidden, at all times.
>>>>>>
>>>>>>
>>>>>> As with most things, this is not a strict dichotomy.
>>>>>>
>>>>>>> If it is indeed a problem to distribute forbidden material,
>>>>>>> shouldn't
>>>>>>> this be corrected in the repository? [That's indeed what you
>>>>>>> did with
>>>>>>> the blocking of the release.]
>>>>>>
>>>>>>
>>>>>> If the repo is discovered to contain disallowed material, it
>>>>>> needs to
>>>>>> be removed.
>>>>>>
>>>>>>> Then again, once the repository is "clean", it can be tagged
>>>>>>> and that
>>>>>>> tagged _source_ is the release.
>>>>>>
>>>>>>
>>>>>> Not quite.
>>>>>>
>>>>>> A release is a source archive that is voted on and distributed
>>>>>> via the
>>>>>> ASF mirror system.
>>>>>> The contents must agree with the source tag, but the source tag
>>>>>> is not
>>>>>> the release.
>>>>>>
>>>>>>> Non-compliant binaries would thus only be the result of a
>>>>>>> "mistake"
>>>>>>> (if the build system is flawed, it's another problem, unrelated 
>>>>>>> to
>>>>>>> the released contents, which is _source_) to be corrected per 
>>>>>>> se.
>>>>>>
>>>>>>
>>>>>> Not so. There are other failure modes.
>>>>>>
>>>>>> An automated build obviously reduces the chances of mistakes,
>>>>>> but it
>>>>>> can still create an archive containing files that should not be
>>>>>> there.
>>>>>> [Or indeed, omit files that should be present]
>>>>>> For example, the workspace contains spurious files which are
>>>>>> implicitly included by the assembly instructions.
>>>>>> Or the build process creates spurious files that are incorrectly
>>>>>> added
>>>>>> to the archive.
>>>>>> Or the build incorrectly includes jars that are supposed to be
>>>>>> provided by the end user
>>>>>> etc.
>>>>>>
>>>>>> I have seen all the above in RC votes.
>>>>>> There are probably other failure modes.
>>>>>>
>>>>>>> My proposition is that it's an independent step: once the build
>>>>>>> system is adjusted to the expectations, "correct" binaries can 
>>>>>>> be
>>>>>>> generated from the same tagged release.
>>>>>>
>>>>>>
>>>>>> It does not matter when the binary is built.
>>>>>> If it is distributed by the PMC as a formal release, it must not
>>>>>> contain any surprises, e.g. it must be licensed under the AL.
>>>>>>
>>>>>> It is therefore vital that the contents are as expected from the
>>>>>> build.
>>>>>>
>>>>>> Note also that a formal release becomes an act of the PMC by the
>>>>>> voting process.
>>>>>> The ASF can then assume responsibility for any legal issues that
>>>>>> may arise.
>>>>>> Otherwise it is entirely the personal responsibility of the
>>>>>> person who
>>>>>> releases it.
>>>>>
>>>>>
>>>>> I think the last two points are really important: binaries must 
>>>>> be
>>>>> checked and the foundation provides a legal protection for the
>>>>> project
>>>>> if something weird occurs.
>>>>>
>>>>> I also think another point is important: many if not most users 
>>>>> do
>>>>> really expect binaries and not source. From our internal Apache
>>>>> point
>>>>> of view, these are a by-product. For many others it is the
>>>>> important
>>>>> thing. It is mostly true in maven land as dependencies are
>>>>> automatically retrieved in binary form, not source form. So the
>>>>> maven
>>>>> central repository as a distribution system is important.
>>>>>
>>>>> Even if for some security reason it sounds at first thought
>>>>> logical to
>>>>> rely on source only and compile oneself, in an industrial context
>>>>> project teams do not have enough time to do it for all their
>>>>> dependencies, so they use binaries provided by trusted third
>>>>> parties. A
>>>>> long time ago, I compiled a lot of free software tools for the
>>>>> department I worked for at that time. I do not do this anymore, 
>>>>> and
>>>>> trust the binaries provided by the packaging team for a 
>>>>> distribution
>>>>> (typically Debian). They do rely on source and compile
>>>>> themselves. Hey,
>>>>> I even think Emmanuel here belongs to the Debian java team ;-) I
>>>>> guess
>>>>> such teams that do rely on source are rather the exception than 
>>>>> the
>>>>> rule. The other examples I can think of are packaging teams,
>>>>> development teams that need bleeding edge (and will also directly
>>>>> depend on the repository, not even the release), projects that
>>>>> need to
>>>>> introduce their own patches and people who have critical needs 
>>>>> (for
>>>>> example when safety of people is concerned or when they need full
>>>>> control for legal or contractual reasons). Many other people
>>>>> download
>>>>> binaries directly and would simply not consider using a project
>>>>> if it
>>>>> is not readily available: they don't have time for this and don't
>>>>> want
>>>>> to learn how to build tens or hundreds of different projects they
>>>>> simply
>>>>> use.
>>>>>
>>>>
>>>> I do not disagree with anything said on this thread. [In
>>>> particular, I
>>>> did not at all imply that any one committer could take 
>>>> responsibility
>>>> for releasing unchecked items.]
>>>>
>>>> I'm simply suggesting that what is called the release
>>>> process/management
>>>> could be made simpler (and _consequently_ could lead to more
>>>> regularly
>>>> releasing the CM code), by separating the concerns.
>>>> The concerns are
>>>>  1. "code" (the contents), and
>>>>  2. "artefacts" (the result of the build system acting on the
>>>> "code").
>>>>
>>>> Checking of one of these is largely independent from checking the
>>>> other.
>>>
>>>
>>> Unfortunately, not really.  One principle that we have (maybe not
>>> crystal clear in the release doco) is that when we do distribute
>>> binaries, they should really be "convenience binaries" which means
>>> that everything needed to create them is in the source or its
>>> documented dependencies.  What that means is that what we tag as 
>>> the
>>> source release needs to be able to generate any binaries that we
>>> subsequently release.  The only way to really test that is to
>>> generate the binaries and inspect them as part of verifying the 
>>> release.
>>
>>
>> Only way?  That's certainly not obvious to me: Since a tag/branch
>> uniquely identifies a set of files, that is, the "source release 
>> [that
>> is] able to generate any binaries that we subsequently release", if 
>> a
>> RM can do it at (source) release time, he (or someone else!) can do 
>> it
>> later, too (by running the build from a clone of the repository in 
>> its
>> tagged state).
>>
>>> As others have pointed out, anything we release has to be verified
>>> and voted on.  As RM and reviewer, I think it is actually easier to
>>> roll and verify source and binaries together.
>>
>
> +1
>
>>
>> It's precisely my main point.
>> I won't dispute that you can prefer doing both (and nobody would 
>> forbid
>> a RM to do just that) but the point is about the possibility to 
>> release
>> source-only code (as the first step of a two-step procedure which I
>> described earlier).
>> [IMHO, the two-step one seems easier (both for the RM and the 
>> reviewer),
>> (mileage does vary).]
>
> What is easier?
> It seems to me there will be at least one other step in your proposed
> process, i.e. a second VOTE e-mail

Yes, that's obviously what I meant:
Two steps == two votes

[But: source releases need not necessarily be accompanied with
"binaries", which, I imagine, could lead to official releases
occurring more often (due to the reduced number of checks).]

> These will both contain most of the same information.

No.
The first step is about the source, i.e. the code which humans create.
The second step is about the files which a build system creates.

As I indicated previously, the first vote will be about a set of
reviewers being satisfied with the state of the source code, while
the second vote will be about another set of reviewers being satisfied
with the results of the build system ("no glitch", as you described
in an earlier message).

> Is the intention to announce the source release separately from the
> binary release?
> If so, there will need to be 2 announce mails, and 2 updates to the
> download page.

Is there a problem with that?
There are actually several possible cases (depending on the will of
the RM):
  * one-step release (only source code)
  * two-steps (source, then binaries based on that source)
  * combined (as is done up to now)
  * binaries (based on any previously released source)

>> In short is it forbidden (by the official/legal rules of ASF) to 
>> proceed
>> as I propose?
>
> Dunno, depends on what exactly you are proposing.

Cf. above (and previous mails).

In practice the release could (IIUC) be like the link provided
by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
repository).


>> Is it impossible technically?
>
> Currently the Maven build process creates:
> - Maven source and binary jars
> - ASF source and binary bundles

AFAIU, the JARs (source and binary) are "binaries", the binary
bundles are "binaries". Only the ASF source is "source".

> It's not clear to me what exactly you propose to release in stage 
> one,

The ASF source (e.g. in the form of a tarball, or the appropriate
"git clone" command).

> but there will need to be some changes to the process in order to
> release just the ASF source.

I don't see which.
A "source RM" would just stop the process after resolving/postponing
the pending issues, and checking the various reports about the source
code. [Then create the tag, and request a vote.]

A "binary RM" would take on from that point (a tagged repository), i.e.
create all the binaries, sign them, etc.
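
[For illustration only: a reviewer of such binaries could automate the
manifest check which Luc described earlier in this thread. The sketch
below is not part of the CM build; the class and method names are
hypothetical, and it assumes that the Implementation-Build entry merely
contains the commit identifier somewhere in its value (hence the
substring match). As already noted in the thread, a matching entry proves
nothing by itself: it only makes sense together with a check of the
published checksums and signatures.]

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Sketch: compare a JAR's Implementation-Build manifest entry with a commit id. */
public class ManifestBuildCheck {

    /** Returns the Implementation-Build main attribute, or null if it is absent. */
    public static String implementationBuild(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Implementation-Build");
    }

    /**
     * Returns true if the entry mentions the expected commit id.
     * Assumption: the id appears as a substring of the entry's value.
     */
    public static boolean builtFrom(Manifest manifest, String expectedCommit) {
        String build = implementationBuild(manifest);
        return build != null && build.contains(expectedCommit);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the JAR under review; args[1]: commit id of the tag.
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null if the JAR has no manifest
            boolean ok = manifest != null && builtFrom(manifest, args[1]);
            System.out.println(ok ? "manifest mentions the tagged commit"
                                  : "manifest does NOT mention the tagged commit");
        }
    }
}
```

[It would be run with the path of the JAR and the commit identifier of
the tag as its two arguments.]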

> There is no point releasing the Maven source jars separately from the
> binary jars; they are not complete as they only contain java files 
> for
> use with IDEs.

I don't understand that.
In principle, a JAR with the Java sources is indeed the necessary and
sufficient condition for users to create the executable bytecode, with
whatever build system they wish.
But I agree that it would not be useful to omit the files needed
to easily run maven. [And, for convenience, a source release would be
accompanied with instructions on how to build a JAR of the compiled
classes, using maven.]

> But in any case, AFAIK it is very tricky to release new files into an
> existing Maven folder, and it may cause problems for end users.

I don't understand what you mean by "release new files into an existing
Maven folder"...

Gilles

>>
>>
>>> Phil
>>>
>>>
>>>> [The more so that, as you said, no fool-proof link between the two
>>>> can
>>>> be ensured: From a security POV, checking the former requires a 
>>>> code
>>>> review, while using the latter requires trust in the build 
>>>> system.]
>>>>
>>>> Thus we could release the "code", after checking and voting on the
>>>> concerned elements (i.e. the repository state corresponding to a
>>>> specific tag + the web site).
>>>>
>>>> Then we could release the "binaries", as a convenience, after
>>>> checking
>>>> and voting on the concerned elements (i.e. the files about to be
>>>> distributed).
>>>>
>>>> I think that it's an added flexibility that would, for example, 
>>>> allow
>>>> the tagging of the repository without necessarily releasing binaries
>>>> (i.e.
>>>> not involving that part of the work); and to release binaries
>>>> (say, at
>>>> regular intervals) based on the latest tagged code (i.e. not
>>>> involving
>>>> the work about solving/evaluating/postponing issues).
>>>>
>>>> [I completely admit that, at first, it might look a little more
>>>> confusing for the plain user, but (IIUC) it would be a better
>>>> representation of the reality covered by stating that the ASF
>>>> releases source code.]
>>>>
>>>>
>>>> Best regards,
>>>> Gilles




Re: [Math] What's in a release

Posted by sebb <se...@gmail.com>.
On 29 December 2014 at 10:36, Gilles <gi...@harfang.homelinux.org> wrote:
> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>>
>> On 12/28/14 11:46 AM, Gilles wrote:
>>>
>>> Hi.
>>>
>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>>>
>>>> Le 28/12/2014 00:22, sebb a écrit :
>>>>>
>>>>> On 27 December 2014 at 22:19, Gilles
>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>
>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 24 December 2014 at 15:11, Gilles
>>>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>>>>>>>>>>>> from release
>>>>>>>>>>>>> candidate 3.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tag name:
>>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>>>> 'git tag
>>>>>>>>>>>>> -v')
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tag URL:
>>>>>>>>>>>>>
>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a way to check that the source code referred to
>>>>>>>>>>>> above
>>>>>>>>>>>> was the one used to create the JAR of the ".class" files.
>>>>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>>>>>>>>>>> file embedded
>>>>>>>>>>> in the jar. The second-to-last entry is called
>>>>>>>>>>> Implementation-Build.
>>>>>>>>>>> It
>>>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
>>>>>>>>>>> and contains
>>>>>>>>>>> the SHA1 identifier of the last commit used for the build.
>>>>>>>>>>> Here, it is
>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>>>> it really
>>>>>>>>>>> corresponds to the expected status of the git repository.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>>>>>>>>> the MANIFEST
>>>>>>>>>> file be modified to be the checksum of the repository but
>>>>>>>>>> with the
>>>>>>>>>> .class
>>>>>>>>>> files being substituted with those coming from another
>>>>>>>>>> compilation?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Modifying anything in the jar (either this entry within the
>>>>>>>>> manifest or
>>>>>>>>> any class) will modify the jar signature. So as long as
>>>>>>>>> people do check
>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>>>>>>>> build, they
>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>>>
>>>>>>>>> This is not different from how releases are done with
>>>>>>>>> subversion as the
>>>>>>>>> source code control system, or even in C or C++ as the
>>>>>>>>> language. At one
>>>>>>>>> time, the release manager does perform a compilation and the
>>>>>>>>> fellow
>>>>>>>>> reviewers check the result. There is no foolproof process
>>>>>>>>> here, as
>>>>>>>>> always when security is involved. Even using an automated
>>>>>>>>> build and
>>>>>>>>> automatic signing on an Apache server would involve trust
>>>>>>>>> (i.e. one
>>>>>>>>> should assume that the server has not been tampered with,
>>>>>>>>> that the build
>>>>>>>>> process really does what it is expected to do, that the
>>>>>>>>> artifacts put to
>>>>>>>>> review are really the ones created by the automatic process
>>>>>>>>> ...).
>>>>>>>>>
>>>>>>>>> Another point is that what we officially release is the
>>>>>>>>> source, which
>>>>>>>>> can be reviewed by external users. The binary parts are
>>>>>>>>> merely a
>>>>>>>>> convenience.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That's an interesting point to come back to since it looks
>>>>>>>> like the
>>>>>>>> most time-consuming part of a release is not related to the
>>>>>>>> sources!
>>>>>>>>
>>>>>>>> Isn't it conceivable that a release could just be a commit
>>>>>>>> identifier
>>>>>>>> and a checksum of the repository?
>>>>>>>>
>>>>>>>> If the binaries are just a convenience, why put so much
>>>>>>>> effort in it?
>>>>>>>> As a convenience, the artefacts could be produced after the
>>>>>>>> release,
>>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>>>
>>>>>>>> That would certainly increase the release rate.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Binary releases still need to be reviewed to ensure that the
>>>>>>> correct N
>>>>>>> & L files are present, and that the archives don't contain
>>>>>>> material
>>>>>>> with disallowed licenses.
>>>>>>>
>>>>>>> It's not unknown for automated build processes to include
>>>>>>> files that
>>>>>>> should not be present.
>>>>>>>
>>>>>>
>>>>>> I fail to see the difference of principle between the "release"
>>>>>> context
>>>>>> and, say, the daily snapshot context.
>>>>>
>>>>>
>>>>> Snapshots are not (and should not be) promoted to the general public as
>>>>> releases of the ASF.
>>>>>
>>>>>> What I mean is that there seems to be a contradiction between
>>>>>> saying that
>>>>>> a "release" is only about _source_ and the obligation to check
>>>>>> _binaries_.
>>>>>
>>>>>
>>>>> There is no contradiction here.
>>>>> The ASF releases source; it is required in a release.
>>>>> Binaries are optional.
>>>>> That does not mean that the ASF mirror system can be used to
>>>>> distribute arbitrary binaries.
>>>>>
>>>>>> It can occur that disallowed material is, at some point in
>>>>>> time, part of
>>>>>> the repository and/or the snapshot binaries.
>>>>>> However, what is forbidden is... forbidden, at all times.
>>>>>
>>>>>
>>>>> As with most things, this is not a strict dichotomy.
>>>>>
>>>>>> If it is indeed a problem to distribute forbidden material,
>>>>>> shouldn't
>>>>>> this be corrected in the repository? [That's indeed what you
>>>>>> did with
>>>>>> the blocking of the release.]
>>>>>
>>>>>
>>>>> If the repo is discovered to contain disallowed material, it
>>>>> needs to
>>>>> be removed.
>>>>>
>>>>>> Then again, once the repository is "clean", it can be tagged
>>>>>> and that
>>>>>> tagged _source_ is the release.
>>>>>
>>>>>
>>>>> Not quite.
>>>>>
>>>>> A release is a source archive that is voted on and distributed
>>>>> via the
>>>>> ASF mirror system.
>>>>> The contents must agree with the source tag, but the source tag
>>>>> is not
>>>>> the release.
>>>>>
>>>>>> Non-compliant binaries would thus only be the result of a
>>>>>> "mistake"
>>>>>> (if the build system is flawed, it's another problem, unrelated to
>>>>>> the released contents, which is _source_) to be corrected per se.
>>>>>
>>>>>
>>>>> Not so. There are other failure modes.
>>>>>
>>>>> An automated build obviously reduces the chances of mistakes,
>>>>> but it
>>>>> can still create an archive containing files that should not be
>>>>> there.
>>>>> [Or indeed, omits files that should be present]
>>>>> For example, the workspace contains spurious files which are
>>>>> implicitly included by the assembly instructions.
>>>>> Or the build process creates spurious files that are incorrectly
>>>>> added
>>>>> to the archive.
>>>>> Or the build incorrectly includes jars that are supposed to be
>>>>> provided by the end user
>>>>> etc.
>>>>>
>>>>> I have seen all the above in RC votes.
>>>>> There are probably other failure modes.
>>>>>
>>>>>> My proposition is that it's an independent step: once the build
>>>>>> system is adjusted to the expectations, "correct" binaries can be
>>>>>> generated from the same tagged release.
>>>>>
>>>>>
>>>>> It does not matter when the binary is built.
>>>>> If it is distributed by the PMC as a formal release, it must not
>>>>> contain any surprises, e.g. it must be licensed under the AL.
>>>>>
>>>>> It is therefore vital that the contents are as expected from the
>>>>> build.
>>>>>
>>>>> Note also that a formal release becomes an act of the PMC by the
>>>>> voting process.
>>>>> The ASF can then assume responsibility for any legal issues that
>>>>> may arise.
>>>>> Otherwise it is entirely the personal responsibility of the
>>>>> person who
>>>>> releases it.
>>>>
>>>>
>>>> I think the last two points are really important: binaries must be
>>>> checked and the foundation provides a legal protection for the
>>>> project
>>>> if something weird occurs.
>>>>
>>>> I also think another point is important: many if not most users do
>>>> really expect binaries and not source. From our internal Apache
>>>> point
>>>> of view, these are a by-product. For many others it is the
>>>> important
>>>> thing. It is mostly true in maven land as dependencies are
>>>> automatically retrieved in binary form, not source form. So the
>>>> maven
>>>> central repository as a distribution system is important.
>>>>
>>>> Even if for some security reason it sounds at first thought
>>>> logical to
>>>> rely on source only and compile oneself, in an industrial context
>>>> project teams do not have enough time to do it for all their
>>>> dependencies, so they use binaries provided by trusted third
>>>> parties. A
>>>> long time ago, I compiled a lot of free software tools for the
>>>> department I worked for at that time. I do not do this anymore, and
>>>> trust the binaries provided by the packaging team for a distribution
>>>> (typically Debian). They do rely on source and compile
>>>> themselves. Hey,
>>>> I even think Emmanuel here belongs to the Debian java team ;-) I
>>>> guess
>>>> such teams that do rely on source are rather the exception than the
>>>> rule. The other examples I can think of are packaging teams,
>>>> development teams that need bleeding edge (and will also directly
>>>> depend on the repository, not even the release), projects that
>>>> need to
>>>> introduce their own patches and people who have critical needs (for
>>>> example when safety of people is concerned or when they need full
>>>> control for legal or contractual reasons). Many other people
>>>> download
>>>> binaries directly and would simply not consider using a project
>>>> if it
>>>> is not readily available: they don't have time for this and don't
>>>> want
>>>> to learn how to build tens or hundreds of different projects they
>>>> simply
>>>> use.
>>>>
>>>
>>> I do not disagree with anything said on this thread. [In
>>> particular, I
>>> did not at all imply that any one committer could take responsibility
>>> for releasing unchecked items.]
>>>
>>> I'm simply suggesting that what is called the release
>>> process/management
>>> could be made simpler (and _consequently_ could lead to more
>>> regularly
>>> releasing the CM code), by separating the concerns.
>>> The concerns are
>>>  1. "code" (the contents), and
>>>  2. "artefacts" (the result of the build system acting on the
>>> "code").
>>>
>>> Checking of one of these is largely independent from checking the
>>> other.
>>
>>
>> Unfortunately, not really.  One principle that we have (maybe not
>> crystal clear in the release doco) is that when we do distribute
>> binaries, they should really be "convenience binaries" which means
>> that everything needed to create them is in the source or its
>> documented dependencies.  What that means is that what we tag as the
>> source release needs to be able to generate any binaries that we
>> subsequently release.  The only way to really test that is to
>> generate the binaries and inspect them as part of verifying the release.
>
>
> Only way?  That's certainly not obvious to me: Since a tag/branch
> uniquely identifies a set of files, that is, the "source release [that
> is] able to generate any binaries that we subsequently release", if a
> RM can do it at (source) release time, he (or someone else!) can do it
> later, too (by running the build from a clone of the repository in its
> tagged state).
>
>> As others have pointed out, anything we release has to be verified
>> and voted on.  As RM and reviewer, I think it is actually easier to
>> roll and verify source and binaries together.
>

+1

>
> It's precisely my main point.
> I won't dispute that you can prefer doing both (and nobody would forbid
> a RM to do just that) but the point is about the possibility to release
> source-only code (as the first step of a two-step procedure which I
> described earlier).
> [IMHO, the two-step one seems easier (both for the RM and the reviewer),
> (mileage does vary).]

What is easier?
It seems to me there will be at least one other step in your proposed
process, i.e. a second VOTE e-mail.
These will both contain most of the same information.

Is the intention to announce the source release separately from the
binary release?
If so, there will need to be 2 announce mails, and 2 updates to the
download page.

> In short is it forbidden (by the official/legal rules of ASF) to proceed
> as I propose?

Dunno, depends on what exactly you are proposing.

> Is it technically impossible?

Currently the Maven build process creates:
- Maven source and binary jars
- ASF source and binary bundles

It's not clear to me what exactly you propose to release in stage one,
but there will need to be some changes to the process in order to
release just the ASF source.

There is no point releasing the Maven source jars separately from the
binary jars; they are not complete as they only contain Java files for
use with IDEs.
But in any case, AFAIK it is very tricky to release new files into an
existing Maven folder, and it may cause problems for end users.

>
> Gilles
>
>
>> Phil
>>
>>
>>> [The more so that, as you said, no fool-proof link between the two
>>> can
>>> be ensured: From a security POV, checking the former requires a code
>>> review, while using the latter requires trust in the build system.]
>>>
>>> Thus we could release the "code", after checking and voting on the
>>> concerned elements (i.e. the repository state corresponding to a
>>> specific tag + the web site).
>>>
>>> Then we could release the "binaries", as a convenience, after
>>> checking
>>> and voting on the concerned elements (i.e. the files about to be
>>> distributed).
>>>
>>> I think that it's an added flexibility that would, for example, allow
>>> the tagging of the repository without necessarily releasing binaries
>>> (i.e.
>>> not involving that part of the work); and to release binaries
>>> (say, at
>>> regular intervals) based on the latest tagged code (i.e. not
>>> involving
>>> the work about solving/evaluating/postponing issues).
>>>
>>> [I completely admit that, at first, it might look a little more
>>> confusing for the plain user, but (IIUC) it would be a better
>>> representation of the reality covered by stating that the ASF
>>> releases source code.]
>>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] What's in a release

Posted by Gilles <gi...@harfang.homelinux.org>.
On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
> On 12/28/14 11:46 AM, Gilles wrote:
>> Hi.
>>
>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>> Le 28/12/2014 00:22, sebb a écrit :
>>>> On 27 December 2014 at 22:19, Gilles
>>>> <gi...@harfang.homelinux.org> wrote:
>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>>
>>>>>> On 24 December 2014 at 15:11, Gilles
>>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>>
>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>>>>>>>>>>> from release
>>>>>>>>>>>> candidate 3.
>>>>>>>>>>>>
>>>>>>>>>>>> Tag name:
>>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>>> 'git tag
>>>>>>>>>>>> -v')
>>>>>>>>>>>>
>>>>>>>>>>>> Tag URL:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 
>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Is there a way to check that the source code referred to
>>>>>>>>>>> above
>>>>>>>>>>> was the one used to create the JAR of the ".class" files.
>>>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>>>>>>>>>> file embedded
>>>>>>>>>> in the jar. The second-to-last entry is called
>>>>>>>>>> Implementation-Build.
>>>>>>>>>> It
>>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
>>>>>>>>>> and contains
>>>>>>>>>> the SHA1 identifier of the last commit used for the build.
>>>>>>>>>> Here, it is
>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>>> it really
>>>>>>>>>> corresponds to the expected status of the git repository.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>>>>>>>> the MANIFEST
>>>>>>>>> file be modified to be the checksum of the repository but
>>>>>>>>> with the
>>>>>>>>> .class
>>>>>>>>> files being substituted with those coming from another
>>>>>>>>> compilation?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Modifying anything in the jar (either this entry within the
>>>>>>>> manifest or
>>>>>>>> any class) will modify the jar signature. So as long as
>>>>>>>> people do check
>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>>>>>>> build, they
>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>>
>>>>>>>> This is not different from how releases are done with
>>>>>>>> subversion as the
>>>>>>>> source code control system, or even in C or C++ as the
>>>>>>>> language. At one
>>>>>>>> time, the release manager does perform a compilation and the
>>>>>>>> fellow
>>>>>>>> reviewers check the result. There is no foolproof process
>>>>>>>> here, as
>>>>>>>> always when security is involved. Even using an automated
>>>>>>>> build and
>>>>>>>> automatic signing on an Apache server would involve trust
>>>>>>>> (i.e. one
>>>>>>>> should assume that the server has not been tampered with,
>>>>>>>> that the build
>>>>>>>> process really does what it is expected to do, that the
>>>>>>>> artifacts put to
>>>>>>>> review are really the one created by the automatic process
>>>>>>>> ...).
>>>>>>>>
>>>>>>>> Another point is that what we officially release is the
>>>>>>>> source, which
>>>>>>>> can be reviewed by external users. The binary parts are
>>>>>>>> merely a
>>>>>>>> convenience.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That's an interesting point to come back to since it looks
>>>>>>> like the
>>>>>>> most time-consuming part of a release is not related to the
>>>>>>> sources!
>>>>>>>
>>>>>>> Isn't it conceivable that a release could just be a commit
>>>>>>> identifier
>>>>>>> and a checksum of the repository?
>>>>>>>
>>>>>>> If the binaries are just a convenience, why put so much
>>>>>>> effort in it?
>>>>>>> As a convenience, the artefacts could be produced after the
>>>>>>> release,
>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>>
>>>>>>> That would certainly increase the release rate.
>>>>>>
>>>>>>
>>>>>> Binary releases still need to be reviewed to ensure that the
>>>>>> correct N
>>>>>> & L files are present, and that the archives don't contain
>>>>>> material
>>>>>> with disallowed licenses.
>>>>>>
>>>>>> It's not unknown for automated build processes to include
>>>>>> files that
>>>>>> should not be present.
>>>>>>
>>>>>
>>>>> I fail to see the difference of principle between the "release"
>>>>> context
>>>>> and, say, the daily snapshot context.
>>>>
>>>> Snapshots are not (and should not be) promoted to the general public as
>>>> releases of the ASF.
>>>>
>>>>> What I mean is that there seems to be a contradiction between
>>>>> saying that
>>>>> a "release" is only about _source_ and the obligation to check
>>>>> _binaries_.
>>>>
>>>> There is no contradiction here.
>>>> The ASF releases source, they are required in a release.
>>>> Binaries are optional.
>>>> That does not mean that the ASF mirror system can be used to
>>>> distribute arbitrary binaries.
>>>>
>>>>> It can occur that disallowed material is, at some point in
>>>>> time, part of
>>>>> the repository and/or the snapshot binaries.
>>>>> However, what is forbidden is... forbidden, at all times.
>>>>
>>>> As with most things, this is not a strict dichotomy.
>>>>
>>>>> If it is indeed a problem to distribute forbidden material,
>>>>> shouldn't
>>>>> this be corrected in the repository? [That's indeed what you
>>>>> did with
>>>>> the blocking of the release.]
>>>>
>>>> If the repo is discovered to contain disallowed material, it
>>>> needs to
>>>> be removed.
>>>>
>>>>> Then again, once the repository is "clean", it can be tagged
>>>>> and that
>>>>> tagged _source_ is the release.
>>>>
>>>> Not quite.
>>>>
>>>> A release is a source archive that is voted on and distributed
>>>> via the
>>>> ASF mirror system.
>>>> The contents must agree with the source tag, but the source tag
>>>> is not
>>>> the release.
>>>>
>>>>> Non-compliant binaries would thus only be the result of a
>>>>> "mistake"
>>>>> (if the build system is flawed, it's another problem, unrelated to
>>>>> the released contents, which is _source_) to be corrected per se.
>>>>
>>>> Not so. There are other failure modes.
>>>>
>>>> An automated build obviously reduces the chances of mistakes,
>>>> but it
>>>> can still create an archive containing files that should not be
>>>> there.
>>>> [Or indeed, omits files that should be present]
>>>> For example, the workspace contains spurious files which are
>>>> implicitly included by the assembly instructions.
>>>> Or the build process creates spurious files that are incorrectly
>>>> added
>>>> to the archive.
>>>> Or the build incorrectly includes jars that are supposed to be
>>>> provided by the end user
>>>> etc.
>>>>
>>>> I have seen all the above in RC votes.
>>>> There are probably other failure modes.
>>>>
>>>>> My proposition is that it's an independent step: once the build
>>>>> system is adjusted to the expectations, "correct" binaries can be
>>>>> generated from the same tagged release.
>>>>
>>>> It does not matter when the binary is built.
>>>> If it is distributed by the PMC as a formal release, it must not
>>>> contain any surprises, e.g. it must be licensed under the AL.
>>>>
>>>> It is therefore vital that the contents are as expected from the
>>>> build.
>>>>
>>>> Note also that a formal release becomes an act of the PMC by the
>>>> voting process.
>>>> The ASF can then assume responsibility for any legal issues that
>>>> may arise.
>>>> Otherwise it is entirely the personal responsibility of the
>>>> person who
>>>> releases it.
>>>
>>> I think the last two points are really important: binaries must be
>>> checked and the foundation provides a legal protection for the
>>> project
>>> if something weird occurs.
>>>
>>> I also think another point is important: many if not most users do
>>> really expect binaries and not source. From our internal Apache
>>> point
>>> of view, these are a by-product. For many others it is the
>>> important
>>> thing. It is mostly true in maven land as dependencies are
>>> automatically retrieved in binary form, not source form. So the
>>> maven
>>> central repository as a distribution system is important.
>>>
>>> Even if for some security reason it sounds at first thought
>>> logical to
>>> rely on source only and compile oneself, in an industrial context
>>> project teams do not have enough time to do it for all their
>>> dependencies, so they use binaries provided by trusted third
>>> parties. A
>>> long time ago, I compiled a lot of free software tools for the
>>> department I worked for at that time. I do not do this anymore, and
>>> trust the binaries provided by the packaging team for a distribution
>>> (typically Debian). They do rely on source and compile
>>> themselves. Hey,
>>> I even think Emmanuel here belongs to the Debian java team ;-) I
>>> guess
>>> such teams that do rely on source are rather the exception than the
>>> rule. The other examples I can think of are packaging teams,
>>> development teams that need bleeding edge (and will also directly
>>> depend on the repository, not even the release), projects that
>>> need to
>>> introduce their own patches and people who have critical needs (for
>>> example when safety of people is concerned or when they need full
>>> control for legal or contractual reasons). Many other people
>>> download
>>> binaries directly and would simply not consider using a project
>>> if it
>>> is not readily available: they don't have time for this and don't
>>> want
>>> to learn how to build tens or hundreds of different projects they
>>> simply
>>> use.
>>>
>>
>> I do not disagree with anything said on this thread. [In
>> particular, I
>> did not at all imply that any one committer could take responsibility
>> for releasing unchecked items.]
>>
>> I'm simply suggesting that what is called the release
>> process/management
>> could be made simpler (and _consequently_ could lead to more
>> regularly
>> releasing the CM code), by separating the concerns.
>> The concerns are
>>  1. "code" (the contents), and
>>  2. "artefacts" (the result of the build system acting on the
>> "code").
>>
>> Checking of one of these is largely independent from checking the
>> other.
>
> Unfortunately, not really.  One principle that we have (maybe not
> crystal clear in the release doco) is that when we do distribute
> binaries, they should really be "convenience binaries" which means
> that everything needed to create them is in the source or its
> documented dependencies.  What that means is that what we tag as the
> source release needs to be able to generate any binaries that we
> subsequently release.  The only way to really test that is to
> generate the binaries and inspect them as part of verifying the release.

Only way?  That's certainly not obvious to me: Since a tag/branch
uniquely identifies a set of files, that is, the "source release [that
is] able to generate any binaries that we subsequently release", if a
RM can do it at (source) release time, he (or someone else!) can do it
later, too (by running the build from a clone of the repository in its
tagged state).
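
To make such a later rebuild checkable, one could compare the commit
identifier recorded in the rebuilt jar with the tagged commit. The
following is only a minimal Java sketch, assuming the jar's manifest
carries the Implementation-Build entry mentioned earlier in the thread
and that its value contains the full SHA1. The class and method names
(ManifestBuildCheck, implementationBuild, matchesCommit) are made up for
illustration and are not part of the CM build.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper for illustration; not part of Commons Math or its build. */
final class ManifestBuildCheck {

    /** Returns the Implementation-Build main attribute of the jar, or null if absent. */
    static String implementationBuild(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue("Implementation-Build");
        }
    }

    /**
     * True when the recorded build identifier contains the expected commit id.
     * Assumption: the entry embeds the full SHA1, possibly with other text around it.
     */
    static boolean matchesCommit(String jarPath, String expectedSha1) throws IOException {
        String recorded = implementationBuild(jarPath);
        return recorded != null && recorded.contains(expectedSha1);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ManifestBuildCheck <path-to-jar> <commit-sha1>
        System.out.println(matchesCommit(args[0], args[1]) ? "match" : "MISMATCH");
    }
}
```

For RC3 one would pass the path of the rebuilt jar and
befd8ebd96b8ef5a06b59dccb22bd55064e31c34. A match only shows what the
manifest claims; it still relies on trusting the build system and on
checking the signatures.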

> As others have pointed out, anything we release has to be verified
> and voted on.  As RM and reviewer, I think it is actually easier to
> roll and verify source and binaries together.

That's precisely my main point.
I won't dispute that you can prefer doing both (and nobody would forbid
an RM from doing just that), but the point is about the possibility of
releasing source-only code (as the first step of the two-step procedure
which I described earlier).
[IMHO, the two-step one seems easier, both for the RM and for the
reviewer (mileage does vary).]

In short: is it forbidden (by the official/legal rules of the ASF) to
proceed as I propose?
Is it technically impossible?


Gilles

> Phil
>
>
>> [The more so that, as you said, no fool-proof link between the two
>> can
>> be ensured: From a security POV, checking the former requires a code
>> review, while using the latter requires trust in the build system.]
>>
>> Thus we could release the "code", after checking and voting on the
>> concerned elements (i.e. the repository state corresponding to a
>> specific tag + the web site).
>>
>> Then we could release the "binaries", as a convenience, after
>> checking
>> and voting on the concerned elements (i.e. the files about to be
>> distributed).
>>
>> I think that it's an added flexibility that would, for example, allow
>> the tagging of the repository without necessarily releasing binaries
>> (i.e.
>> not involving that part of the work); and to release binaries
>> (say, at
>> regular intervals) based on the latest tagged code (i.e. not
>> involving
>> the work about solving/evaluating/postponing issues).
>>
>> [I completely admit that, at first, it might look a little more
>> confusing for the plain user, but (IIUC) it would be a better
>> representation of the reality covered by stating that the ASF
>> releases source code.]
>>
>>
>> Best regards,
>> Gilles
>>
>>




Re: [Math] What's in a release

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/14 11:46 AM, Gilles wrote:
> Hi.
>
> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>> Le 28/12/2014 00:22, sebb a écrit :
>>> On 27 December 2014 at 22:19, Gilles
>>> <gi...@harfang.homelinux.org> wrote:
>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>
>>>>> On 24 December 2014 at 15:11, Gilles
>>>>> <gi...@harfang.homelinux.org> wrote:
>>>>>>
>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>
>>>>>>>
>>>>>>> Le 24/12/2014 15:04, Gilles a écrit :
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Le 24/12/2014 03:36, Gilles a écrit :
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>>>>>>>>>> from release
>>>>>>>>>>> candidate 3.
>>>>>>>>>>>
>>>>>>>>>>> Tag name:
>>>>>>>>>>>   MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>> 'git tag
>>>>>>>>>>> -v')
>>>>>>>>>>>
>>>>>>>>>>> Tag URL:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Is there a way to check that the source code referred to
>>>>>>>>>> above
>>>>>>>>>> was the one used to create the JAR of the ".class" files.
>>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>>>>>>>>> file embedded
>>>>>>>>> in the jar. The second-to-last entry is called
>>>>>>>>> Implementation-Build.
>>>>>>>>> It
>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
>>>>>>>>> and contains
>>>>>>>>> the SHA1 identifier of the last commit used for the build.
>>>>>>>>> Here, it is
>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>> it really
>>>>>>>>> corresponds to the expected status of the git repository.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>>>>>>> the MANIFEST
>>>>>>>> file be modified to be the checksum of the repository but
>>>>>>>> with the
>>>>>>>> .class
>>>>>>>> files being substituted with those coming from another
>>>>>>>> compilation?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Modifying anything in the jar (either this entry within the
>>>>>>> manifest or
>>>>>>> any class) will modify the jar signature. So as long as
>>>>>>> people do check
>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>>>>>> build, they
>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>
>>>>>>> This is not different from how releases are done with
>>>>>>> subversion as the
>>>>>>> source code control system, or even in C or C++ as the
>>>>>>> language. At one
>>>>>>> time, the release manager does perform a compilation and the
>>>>>>> fellow
>>>>>>> reviewers check the result. There is no foolproof process
>>>>>>> here, as
>>>>>>> always when security is involved. Even using an automated
>>>>>>> build and
>>>>>>> automatic signing on an Apache server would involve trust
>>>>>>> (i.e. one
>>>>>>> should assume that the server has not been tampered with,
>>>>>>> that the build
>>>>>>> process really does what it is expected to do, that the
>>>>>>> artifacts put to
>>>>>>> review are really the one created by the automatic process
>>>>>>> ...).
>>>>>>>
>>>>>>> Another point is that what we officially release is the
>>>>>>> source, which
>>>>>>> can be reviewed by external users. The binary parts are
>>>>>>> merely a
>>>>>>> convenience.
>>>>>>
>>>>>>
>>>>>>
>>>>>> That's an interesting point to come back to since it looks
>>>>>> like the
>>>>>> most time-consuming part of a release is not related to the
>>>>>> sources!
>>>>>>
>>>>>> Isn't it conceivable that a release could just be a commit
>>>>>> identifier
>>>>>> and a checksum of the repository?
>>>>>>
>>>>>> If the binaries are just a convenience, why put so much
>>>>>> effort in it?
>>>>>> As a convenience, the artefacts could be produced after the
>>>>>> release,
>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>
>>>>>> That would certainly increase the release rate.
>>>>>
>>>>>
>>>>> Binary releases still need to be reviewed to ensure that the
>>>>> correct N
>>>>> & L files are present, and that the archives don't contain
>>>>> material
>>>>> with disallowed licenses.
>>>>>
>>>>> It's not unknown for automated build processes to include
>>>>> files that
>>>>> should not be present.
>>>>>
>>>>
>>>> I fail to see the difference of principle between the "release"
>>>> context
>>>> and, say, the daily snapshot context.
>>>
>>> Snapshots are not (and should not be) promoted to the general public as
>>> releases of the ASF.
>>>
>>>> What I mean is that there seems to be a contradiction between
>>>> saying that
>>>> a "release" is only about _source_ and the obligation to check
>>>> _binaries_.
>>>
>>> There is no contradiction here.
>>> The ASF releases source, they are required in a release.
>>> Binaries are optional.
>>> That does not mean that the ASF mirror system can be used to
>>> distribute arbitrary binaries.
>>>
>>>> It can occur that disallowed material is, at some point in
>>>> time, part of
>>>> the repository and/or the snapshot binaries.
>>>> However, what is forbidden is... forbidden, at all times.
>>>
>>> As with most things, this is not a strict dichotomy.
>>>
>>>> If it is indeed a problem to distribute forbidden material,
>>>> shouldn't
>>>> this be corrected in the repository? [That's indeed what you
>>>> did with
>>>> the blocking of the release.]
>>>
>>> If the repo is discovered to contain disallowed material, it
>>> needs to
>>> be removed.
>>>
>>>> Then again, once the repository is "clean", it can be tagged
>>>> and that
>>>> tagged _source_ is the release.
>>>
>>> Not quite.
>>>
>>> A release is a source archive that is voted on and distributed
>>> via the
>>> ASF mirror system.
>>> The contents must agree with the source tag, but the source tag
>>> is not
>>> the release.
>>>
>>>> Non-compliant binaries would thus only be the result of a
>>>> "mistake"
>>>> (if the build system is flawed, it's another problem, unrelated to
>>>> the released contents, which is _source_) to be corrected per se.
>>>
>>> Not so. There are other failure modes.
>>>
>>> An automated build obviously reduces the chances of mistakes,
>>> but it
>>> can still create an archive containing files that should not be
>>> there.
>>> [Or indeed, omits files that should be present]
>>> For example, the workspace contains spurious files which are
>>> implicitly included by the assembly instructions.
>>> Or the build process creates spurious files that are incorrectly
>>> added
>>> to the archive.
>>> Or the build incorrectly includes jars that are supposed to be
>>> provided by the end user
>>> etc.
>>>
>>> I have seen all the above in RC votes.
>>> There are probably other failure modes.
>>>
>>>> My proposition is that it's an independent step: once the build
>>>> system is adjusted to the expectations, "correct" binaries can be
>>>> generated from the same tagged release.
>>>
>>> It does not matter when the binary is built.
>>> If it is distributed by the PMC as a formal release, it must not
>>> contain any surprises, e.g. it must be licensed under the AL.
>>>
>>> It is therefore vital that the contents are as expected from the
>>> build.
>>>
>>> Note also that a formal release becomes an act of the PMC by the
>>> voting process.
>>> The ASF can then assume responsibility for any legal issues that
>>> may arise.
>>> Otherwise it is entirely the personal responsibility of the
>>> person who
>>> releases it.
>>
>> I think the last two points are really important: binaries must be
>> checked and the foundation provides a legal protection for the
>> project
>> if something weird occurs.
>>
>> I also think another point is important: many if not most users do
>> really expect binaries and not source. From our internal Apache
>> point
>> of view, these are a by-product. For many others it is the
>> important
>> thing. It is mostly true in maven land as dependencies are
>> automatically retrieved in binary form, not source form. So the
>> maven
>> central repository as a distribution system is important.
>>
>> Even if for some security reason it sounds at first thought
>> logical to
>> rely on source only and compile oneself, in an industrial context
>> project teams do not have enough time to do it for all their
>> dependencies, so they use binaries provided by trusted third
>> parties. A
>> long time ago, I compiled a lot of free software tools for the
>> department I worked for at that time. I do not do this anymore, and
>> trust the binaries provided by the packaging team for a distribution
>> (typically Debian). They do rely on source and compile
>> themselves. Hey,
>> I even think Emmanuel here belongs to the Debian java team ;-) I
>> guess
>> such teams that do rely on source are rather the exception than the
>> rule. The other examples I can think of are packaging teams,
>> development teams that need bleeding edge (and will also directly
>> depend on the repository, not even the release), projects that
>> need to
>> introduce their own patches and people who have critical needs (for
>> example when safety of people is concerned or when they need full
>> control for legal or contractual reasons). Many other people
>> download
>> binaries directly and would simply not consider using a project
>> if it
>> is not readily available: they don't have time for this and don't
>> want
>> to learn how to build tens or hundreds of different projects they
>> simply
>> use.
>>
>
> I do not disagree with anything said on this thread. [In
> particular, I
> did not at all imply that any one committer could take responsibility
> for releasing unchecked items.]
>
> I'm simply suggesting that what is called the release
> process/management
> could be made simpler (and _consequently_ could lead to releasing
> the CM code more regularly), by separating the concerns.
> The concerns are
>  1. "code" (the contents), and
>  2. "artefacts" (the result of the build system acting on the
> "code").
>
> Checking one of these is largely independent of checking the
> other.

Unfortunately, not really.  One principle that we have (maybe not
crystal clear in the release doco) is that when we do distribute
binaries, they should really be "convenience binaries" which means
that everything needed to create them is in the source or its
documented dependencies.  What that means is that what we tag as the
source release needs to be able to generate any binaries that we
subsequently release.  The only way to really test that is to
generate the binaries and inspect them as part of verifying the release.

As others have pointed out, anything we release has to be verified
and voted on.  As RM and reviewer, I think it is actually easier to
roll and verify source and binaries together. 

Phil


> [All the more so since, as you said, no fool-proof link between the two
> can
> be ensured: From a security POV, checking the former requires a code
> review, while using the latter requires trust in the build system.]
>
> Thus we could release the "code", after checking and voting on the
> concerned elements (i.e. the repository state corresponding to a
> specific tag + the web site).
>
> Then we could release the "binaries", as a convenience, after
> checking
> and voting on the concerned elements (i.e. the files about to be
> distributed).
>
> I think that it's an added flexibility that would, for example, allow
> the tagging of the repository without necessarily releasing binaries
> (i.e.
> not involving that part of the work); and to release binaries
> (say, at
> regular intervals) based on the latest tagged code (i.e. not
> involving
> the work about solving/evaluating/postponing issues).
>
> [I completely admit that, at first, it might look a little more
> confusing for the plain user, but (IIUC) it would be a better
> representation of the reality covered by stating that the ASF
> releases source code.]
>
>
> Best regards,
> Gilles
>
>


