Posted to general@incubator.apache.org by "John D. Ament" <jo...@apache.org> on 2014/12/18 14:58:22 UTC

Votes for git repos - commit id vs tag

All,

I was looking through the incubator site and I don't see anything definite.

Whenever a podling goes for a vote and includes a git tag in its
vote message, it is typically asked to use a commit id instead.  It seems to
me this is done for the sake of reproducible builds: tags are mutable, so a
tag could be moved, and rebuilding from it could then give you a different
result.
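
As a rough sketch of the distinction (not part of any Incubator tooling), a reference given in a vote message could be checked for whether it has the shape of a full commit id rather than a tag name. The function name `looks_like_commit_id` is hypothetical, and the check assumes 40-character SHA-1 ids:

```python
import re

# Assumption: a full commit id is a 40-character hexadecimal SHA-1 string.
_FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def looks_like_commit_id(ref: str) -> bool:
    """Return True if ref has the shape of a full SHA-1 commit id.

    This only checks the shape of the string; it does not prove that the
    commit exists in any repository.
    """
    return _FULL_SHA1.fullmatch(ref.strip().lower()) is not None
```

A tag name such as `release-1.0` fails this check; the commit it currently points to can be looked up with `git rev-parse 'release-1.0^{commit}'`.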

So, is this the right understanding? Do we want to ask podlings to always
submit a git commit id?  If so, is there a place in the website we can
clarify this?

Thanks,

John

Re: Votes for git repos - commit id vs tag

Posted by Konstantin Boudnik <co...@apache.org>.
Good point!

On Thu, Dec 18, 2014 at 01:58PM, John D. Ament wrote:
> All,
> 
> I was looking through the incubator site and I don't see anything definite.
> 
> Whenever a podling goes for a vote, and they include a git tag in their
> vote message, it's typically asked to change to a commit id.  It seems to
> me this is done for the reproducible builds concept.  Tags are mutable, and
> therefore could be changed and rebuilding a tag could give you a different
> result.
> 
> So, is this the right understanding? Do we want to ask podlings to always
> submit a git commit id?  If so, is there a place in the website we can
> clarify this?
> 
> Thanks,
> 
> John

Re: Votes for git repos - commit id vs tag

Posted by Branko Čibej <br...@apache.org>.
On 20.12.2014 07:16, Niclas Hedhman wrote:
> Tags are at best a convenience, and nothing else. But so are commit ids,
> since in the long term Git may not prevail, and the commit id is in effect
> an internal artifact of Git itself, not of the concept of version control
> systems. Compare how commit numbers from Subversion are imported to Git
> repositories, or not... But tags are imported, if the trunk/tags/branches
> (ttb) structure in Subversion is used.

Any release is cut from a current canonical repository, which is always
hosted on ASF infrastructure. The point is that current releases should
be identifiable in the current repository, because anyone who votes on a
release /should/ verify that the tarball matches some state in the repo;
otherwise they don't know what they're signing, and the release isn't
repeatable; that would sort of negate the whole point of version control.

In the case of Git, the commit-id is the most stable global identifier
for a particular state of the repository. (I say "most stable" because,
in general, Git history is mutable ... sigh).

If at some future date the repository is imported in some shiny new
version control system, that new system is bound to have some kind of
global state identifier, mutable or not; and commit-ids may or may not
be accurately represented by it; but that's completely irrelevant for
current releases. It's marginally relevant for reproducing past
releases, but that can be solved by archiving the whole "old" repository.

-- Brane

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: Votes for git repos - commit id vs tag

Posted by Ryan Blue <bl...@cloudera.com>.
On 12/23/2014 01:27 AM, Bertrand Delacretaz wrote:
> On Tue, Dec 23, 2014 at 3:54 AM, Marvin Humphrey <ma...@rectangular.com> wrote:
>> ...Although many consider it best practice for release tarballs to be tied back
>> to a specific version control identifier (including me), Apache release policy
>> does not require it....
>
> As we tried to say above, the timeline gets into play here.
>
> Tying releases to version control is very useful *in the short term*
> for people to verify the release, but in the long term you cannot
> count on it. That's why our release policy centers on tarballs, the
> rest is convenience.

I understand wanting to follow general ASF policy, but I wouldn't mind a 
more strict policy for incubating projects, simply because it is much 
easier for IPMC members to verify the release when we most need those 
careful checks. The release checklist appears to allow requirements 
imposed by Incubator policy:

 > Each review item in this list is either required by Foundation-wide
 > policy and would block a release by any Apache top-level project, or is
 > required by Incubator policy.

Either way, I think we should clarify this in the documentation to state 
whether or not the Incubator requires a git id or svn equivalent, and 
where that can be linked to. If we don't require a tarball being tied 
back to git, then it doesn't matter whether a tag or id is given for 
convenience.

rb


-- 
Ryan Blue
Software Engineer
Cloudera, Inc.



RE: Votes for git repos - commit id vs tag

Posted by "Dennis E. Hamilton" <de...@acm.org>.

 -- in reply to --
From: Bertrand Delacretaz [mailto:bdelacretaz@apache.org] 
Sent: Tuesday, December 23, 2014 01:28
To: Incubator General
Subject: Re: Votes for git repos - commit id vs tag

On Tue, Dec 23, 2014 at 3:54 AM, Marvin Humphrey <ma...@rectangular.com> wrote:
> ...Although many consider it best practice for release tarballs to be tied back
> to a specific version control identifier (including me), Apache release policy
> does not require it....

As we tried to say above, the timeline gets into play here.

Tying releases to version control is very useful *in the short term*
for people to verify the release, but in the long term you cannot
count on it. That's why our release policy centers on tarballs, the
rest is convenience.

<orcmid>
   I can see that the policy is appropriate because of the archival preservation 
   of the tarball.  On the other hand, tarballs do not usually preserve history.
   Although one cannot be assured of preservation of the repository as a match, 
   knowing what the match was, and might still be, is useful historically.  
   One could recommend the convenience without it being a substitute for
   the tarball.  (Of course, with Git one can snapshot all of it [;<).
</orcmid>

-Bertrand



Re: Votes for git repos - commit id vs tag

Posted by sebb <se...@gmail.com>.
On 23 December 2014 at 09:27, Bertrand Delacretaz
<bd...@apache.org> wrote:
> On Tue, Dec 23, 2014 at 3:54 AM, Marvin Humphrey <ma...@rectangular.com> wrote:
>> ...Although many consider it best practice for release tarballs to be tied back
>> to a specific version control identifier (including me), Apache release policy
>> does not require it....
>
> As we tried to say above, the timeline gets into play here.
>
> Tying releases to version control is very useful *in the short term*
> for people to verify the release, but in the long term you cannot
> count on it. That's why our release policy centers on tarballs, the
> rest is convenience.

I'm not sure it's feasible to verify that all the source files in a
release have the correct license without reference to version control.

Also provenance of source files is extremely important.

If there is ever a question as to how a particular source file came to
be included in a release, then being able to trace the release tarball
back to the vote thread and thence to version control could be vital.

As far as consumers are concerned, it is the release tarball that is important.
But as far as the ASF is concerned, ISTM that being able to trace
provenance back to version control is at least as important.

> -Bertrand


Re: Votes for git repos - commit id vs tag

Posted by Bertrand Delacretaz <bd...@apache.org>.
On Tue, Dec 23, 2014 at 3:54 AM, Marvin Humphrey <ma...@rectangular.com> wrote:
> ...Although many consider it best practice for release tarballs to be tied back
> to a specific version control identifier (including me), Apache release policy
> does not require it....

As we tried to say above, the timeline comes into play here.

Tying releases to version control is very useful *in the short term*
for people to verify the release, but in the long term you cannot
count on it. That's why our release policy centers on tarballs, the
rest is convenience.

-Bertrand



Re: Votes for git repos - commit id vs tag

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Dec 22, 2014 at 6:14 PM, Ryan Blue <bl...@cloudera.com> wrote:

> Given that there is confusion on this, I think we should decide whether it
> is required or not and update the docs to be more clear. Does that require a
> vote?

Although many consider it best practice for release tarballs to be tied back
to a specific version control identifier (including me), Apache release policy
does not require it.

Here is Leo Simons making the case against...

  http://markmail.org/message/2ncepopzgnshtyd6

... and here is a link to the most recent discussion I can recall...

  http://markmail.org/message/huhuicrjbwy2i25x

... which resulted in the Incubator's Release Checklist...

  http://incubator.apache.org/guides/releasemanagement.html#check-list

... and a list of optional checklist items, including the concern at hand:

  http://wiki.apache.org/incubator/ReleaseChecklist

    Each expanded source archive matches the corresponding SCM tag.

      It is important that any release can be reproduced from the source at
      any time in the future.  Apache releases have long active lives and are
      permanently archived.  It may be necessary (for example, for legal
      reasons) to provide a new release that is a slight alteration of a
      previous release.  Release managers owe it to those who come afterwards
      to use build processes that are reproducible.

In my view, the Incubator should not make such practice mandatory unless there
is an ASF-wide policy change.

Marvin Humphrey



Re: Votes for git repos - commit id vs tag

Posted by Ryan Blue <bl...@cloudera.com>.
On 12/20/2014 04:07 AM, Bertrand Delacretaz wrote:
> On Sat, Dec 20, 2014 at 7:16 AM, Niclas Hedhman <ni...@hedhman.org> wrote:
>> ...Releases are the tarball(s) prepared by the release manager, not a pointer
>> into the source control system....
>
> Agreed. I also agree with Brane about the pointer into source code
> control system being useful for PMC members to check that the released
> code is what they expect, but as you say long-term it's only the
> signed release tarball that matters.
>
>> ...So, to make this clear to the community, I would discourage publishing the
>> commit ID in the vote request, and only provide the URL link to the
>> tarball(s)....
>
> The way we work in Sling is that the tarball's name points to a
> well-known svn tag URL. This matches your idea of having the commit ID
> or equivalent somewhere else, but easily accessible. I like that.
>
> OTOH I also like to include the tarball archive's digest (sha1 or
> equivalent) in the archived vote thread as that's a long term (*)
> guarantee that what you got is what was voted on.
>
> -Bertrand
>
> (*) As long as the digest algorithm is not broken, that is.

There seems to be some support for release tarballs independent of 
version control and some support for tarballs that are tied back to a 
specific version of the repository (whether SVN tag or git ID).

I think it is not just great for convenience, but necessary to link back 
to version control. That makes it easy for PMC members to verify certain 
aspects of the release that are otherwise difficult. Tasks like 
verifying source additions were correctly mirrored in NOTICE updates are 
important, and we want that to be as easy as possible. If I'm verifying 
an independent tarball, then I can't browse history as easily.

If it is best practice to link to version control, then we have to have 
a way to verify the version control link matches the release.

I think policy should be that a release tarball is based on the most 
reasonable identifier in version control. For svn, that's a tag and 
revision number. For git, that's a commit id/hash.

Projects should have repeatable processes to get the release tarball 
from the identifier that can be verified against the RM's signature. 
`git archive` works most of the time.
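
One possible way to check such a process, sketched here under assumptions: a tarball regenerated from the commit may differ byte-for-byte from the release manager's tarball (timestamps, compression settings), so this compares the SHA-256 of each regular file's contents instead of the archives as a whole. The function names `member_digests` and `same_contents` are hypothetical, and the regenerated archive is assumed to come from something like `git archive` run at the voted commit id:

```python
import hashlib
import io
import tarfile


def member_digests(tar_bytes: bytes) -> dict:
    """Map each regular file in a tar archive to the SHA-256 of its contents."""
    digests = {}
    # "r:*" lets tarfile detect gzip/bzip2/xz compression transparently.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if member.isfile():
                data = archive.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def same_contents(candidate: bytes, rebuilt: bytes) -> bool:
    """True if both archives hold the same file names with identical contents.

    File metadata (mtime, owner, mode) is deliberately ignored.
    """
    return member_digests(candidate) == member_digests(rebuilt)
```

This ignores metadata and directory entries, so it is weaker than a byte-for-byte match; it only shows that the file contents voted on are the ones in version control.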

Given that there is confusion on this, I think we should decide whether 
it is required or not and update the docs to be more clear. Does that 
require a vote?

One last point: if the requirement is for git id and verification is 
required, then this allows us to use links to preferred mirrors as well, 
which can be easier to work with. As long as both apache git and the 
mirror are given, of course.

rb


-- 
Ryan Blue
Software Engineer
Cloudera, Inc.



Re: Votes for git repos - commit id vs tag

Posted by Bertrand Delacretaz <bd...@apache.org>.
On Sat, Dec 20, 2014 at 7:16 AM, Niclas Hedhman <ni...@hedhman.org> wrote:
> ...Releases are the tarball(s) prepared by the release manager, not a pointer
> into the source control system....

Agreed. I also agree with Brane about the pointer into source code
control system being useful for PMC members to check that the released
code is what they expect, but as you say long-term it's only the
signed release tarball that matters.

> ...So, to make this clear to the community, I would discourage publishing the
> commit ID in the vote request, and only provide the URL link to the
> tarball(s)....

The way we work in Sling is that the tarball's name points to a
well-known svn tag URL. This matches your idea of having the commit ID
or equivalent somewhere else, but easily accessible. I like that.

OTOH I also like to include the tarball archive's digest (sha1 or
equivalent) in the archived vote thread as that's a long term (*)
guarantee that what you got is what was voted on.

-Bertrand

(*) As long as the digest algorithm is not broken, that is.
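
A minimal sketch of computing such a digest for pasting into a vote thread. The function name `tarball_digest` and the default of SHA-512 are assumptions for illustration; any algorithm known to `hashlib` (including `sha1`) can be passed instead:

```python
import hashlib


def tarball_digest(path: str, algorithm: str = "sha512") -> str:
    """Return the hex digest of the file at path, read in 1 MiB chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large release archives need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Anyone who later downloads the archive can recompute the digest and compare it with the value recorded in the thread.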



Re: Votes for git repos - commit id vs tag

Posted by Niclas Hedhman <ni...@hedhman.org>.
Tags are at best a convenience, and nothing else. But so are commit ids,
since in the long term Git may not prevail, and the commit id is in effect
an internal artifact of Git itself, not of the concept of version control
systems. Compare how commit numbers from Subversion are imported to Git
repositories, or not... But tags are imported, if the trunk/tags/branches
(ttb) structure in Subversion is used.

Releases are the tarball(s) prepared by the release manager, not a pointer
into the source control system. It is the tarball that is released to the
public, not the commit id, and it is the tarball that must be vetted by the
community, not their local copy of the particular commit ID.

So, to make this clear to the community, I would discourage publishing the
commit ID in the vote request, and would only provide the URL link to the
tarball(s). It would be entirely possible for the release manager (and build
system) to include the commit id in the tarball as additional
information, for instance as a footer in the README.

Cheers
Niclas

On Sat, Dec 20, 2014 at 6:15 AM, David Nalley <da...@gnsa.us> wrote:
>
> >
> > I recently found this confusing with the first parquet-format release. I
> > thought that both commit id and tag were optional, given that the actual
> > release candidate is a signed tarball (actually, the "necessary source
> code
> > to build the project" [1]).
> >
>
> Commit id is not optional. Tag is.
> The release candidate is a signed tarball, but I should be able to
> take your source tree from the commit id, and get the exact same
> tarball by following your release process. (Note that this applies
> only to source tarballs - but those are the ones that matter). If I
> can't arrive at the same exact tarball there's something amiss with
> the release or the process.
>
> > We can't necessarily recover the commit id from the tarball because the
> > parent information is lost [2], so requiring the commit id is only useful
> > for convenience and validating that a new tarball from git at the commit
> id
> > matches the vote tarball. Is this validation done? Is it a requirement?
> >
> > If it isn't a requirement for a commit to match what is being voted on,
> then
> > does it matter whether we use a tag for convenience or a commit id?
> >
> > We could also accept signed tags, though I don't know if there are issues
> > that would prevent it.
> >
> > rb
> >
> >
> > [1]: https://www.apache.org/dev/release-publishing.html#valid
> > [2]: Unless using `git archive`: http://git-scm.com/docs/git-archive
> >
> > --
> > Ryan Blue

-- 
Niclas Hedhman, Software Developer
http://www.qi4j.org - New Energy for Java

Re: Votes for git repos - commit id vs tag

Posted by David Nalley <da...@gnsa.us>.
>
> I recently found this confusing with the first parquet-format release. I
> thought that both commit id and tag were optional, given that the actual
> release candidate is a signed tarball (actually, the "necessary source code
> to build the project" [1]).
>

Commit id is not optional. Tag is.
The release candidate is a signed tarball, but I should be able to
take your source tree from the commit id, and get the exact same
tarball by following your release process. (Note that this applies
only to source tarballs - but those are the ones that matter). If I
can't arrive at the same exact tarball there's something amiss with
the release or the process.

> We can't necessarily recover the commit id from the tarball because the
> parent information is lost [2], so requiring the commit id is only useful
> for convenience and validating that a new tarball from git at the commit id
> matches the vote tarball. Is this validation done? Is it a requirement?
>
> If it isn't a requirement for a commit to match what is being voted on, then
> does it matter whether we use a tag for convenience or a commit id?
>
> We could also accept signed tags, though I don't know if there are issues
> that would prevent it.
>
> rb
>
>
> [1]: https://www.apache.org/dev/release-publishing.html#valid
> [2]: Unless using `git archive`: http://git-scm.com/docs/git-archive
>
> --
> Ryan Blue


Re: Votes for git repos - commit id vs tag

Posted by Ryan Blue <bl...@apache.org>.
On 12/18/2014 05:58 AM, John D. Ament wrote:
> All,
>
> I was looking through the incubator site and I don't see anything definite.
>
> Whenever a podling goes for a vote, and they include a git tag in their
> vote message, it's typically asked to change to a commit id.  It seems to
> me this is done for the reproducible builds concept.  Tags are mutable, and
> therefore could be changed and rebuilding a tag could give you a different
> result.
>
> So, is this the right understanding? Do we want to ask podlings to always
> submit a git commit id?  If so, is there a place in the website we can
> clarify this?
>
> Thanks,
>
> John

I recently found this confusing with the first parquet-format release. I 
thought that both commit id and tag were optional, given that the actual 
release candidate is a signed tarball (actually, the "necessary source 
code to build the project" [1]).

We can't necessarily recover the commit id from the tarball because the 
parent information is lost [2], so requiring the commit id is only 
useful for convenience and validating that a new tarball from git at the 
commit id matches the vote tarball. Is this validation done? Is it a 
requirement?

If it isn't a requirement for a commit to match what is being voted on, 
then does it matter whether we use a tag for convenience or a commit id?

We could also accept signed tags, though I don't know if there are 
issues that would prevent it.

rb


[1]: https://www.apache.org/dev/release-publishing.html#valid
[2]: Unless using `git archive`: http://git-scm.com/docs/git-archive
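
For context on [2]: when `git archive` is given a commit id or tag and writes tar format, it records the commit id in a pax global extended header, which `git get-tar-commit-id` can read back. A minimal sketch of the same lookup in Python follows. The function name `embedded_commit_id` is hypothetical, and it assumes the archive was produced by `git archive` from a commit or tag and not repackaged by another tool:

```python
import io
import tarfile
from typing import Optional


def embedded_commit_id(tar_bytes: bytes) -> Optional[str]:
    """Return the "comment" value of the pax global header, if present.

    Archives written by `git archive <commit>` carry the commit id there;
    archives built by other means usually have no such header.
    """
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        archive.getmembers()  # make sure all headers have been read
        return archive.pax_headers.get("comment")
```

A result of `None` means the tarball carries no embedded id, in which case the commit id has to come from the vote message.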

-- 
Ryan Blue



Re: Votes for git repos - commit id vs tag

Posted by jan i <ja...@apache.org>.
On Thursday, December 18, 2014, John D. Ament <jo...@apache.org> wrote:

> All,
>
> I was looking through the incubator site and I don't see anything definite.
>
> Whenever a podling goes for a vote, and they include a git tag in their
> vote message, it's typically asked to change to a commit id.  It seems to
> me this is done for the reproducible builds concept.  Tags are mutable, and
> therefore could be changed and rebuilding a tag could give you a different
> result.
>
> So, is this the right understanding? Do we want to ask podlings to always
> submit a git commit id?  If so, is there a place in the website we can
> clarify this?


+1 to using the git commit id. We have a guide for podling releases that would
be a good place for this.

rgds
jan i

>
> Thanks,
>
> John
>



RE: Votes for git repos - commit id vs tag

Posted by "Dennis E. Hamilton" <de...@acm.org>.
+1 on including commit ID (or SVN revision number) along with any tag (or SVN tag/branch) for convenience.

-----Original Message-----
From: John D. Ament [mailto:johndament@apache.org] 
Sent: Thursday, December 18, 2014 05:58
To: general@incubator.apache.org
Subject: Votes for git repos - commit id vs tag

All,

I was looking through the incubator site and I don't see anything definite.

Whenever a podling goes for a vote, and they include a git tag in their
vote message, it's typically asked to change to a commit id.  It seems to
me this is done for the reproducible builds concept.  Tags are mutable, and
therefore could be changed and rebuilding a tag could give you a different
result.

So, is this the right understanding? Do we want to ask podlings to always
submit a git commit id?  If so, is there a place in the website we can
clarify this?

Thanks,

John

