Posted to builds@apache.org by Alex Harui <ah...@adobe.com.INVALID> on 2020/02/04 02:24:14 UTC

Re: [CI] What are the troubles projects face with CI and Infra

Moving board@ to BCC.  Attempting to move discussion to builds@

I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.

A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.

A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
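
A minimal sketch of the comparison described above, assuming both the canonical repo and the "other repo" are checked out locally at the release tag. The function names and the choice of SHA-256 are illustrative assumptions, not an ASF tool or procedure; the sketch only shows that two source trees can be checked file by file.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and Git's own metadata under .git.
        if path.is_file() and ".git" not in rel.parts:
            digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(canonical, other):
    """Return (only_in_canonical, only_in_other, differing) as sorted lists."""
    a, b = tree_digests(canonical), tree_digests(other)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, differing
```

Three empty lists would mean the two source trees match; anything else is what a release manager would still need to inspect by hand. Binary artifacts could be compared the same way only if the build is reproducible.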

Thoughts?
-Alex

From: Greg Stein <gs...@gmail.com>
Reply-To: "board@apache.org" <bo...@apache.org>
Date: Monday, February 3, 2020 at 5:17 PM
To: "board@apache.org" <bo...@apache.org>
Subject: Re: [CI] What are the troubles projects face with CI and Infra

On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com> wrote:
>...
How does Google or other non-ASF open source projects manage the provenance tracking?

Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.

IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.

Cheers,
-g


Re: [CI] What are the troubles projects face with CI and Infra

Posted by Greg Stein <gs...@gmail.com>.
On Mon, Feb 3, 2020 at 9:29 PM Alex Harui <ah...@adobe.com.invalid> wrote:

> Hopefully last set of questions for now...
>
> 1) It sounds like there is a risk that as the ASF grows, GH may not be
> able to grow with us.  Did I understand that correctly?
>

GitHub will always grow faster than us. Not a worry.


> 2) If we have money to offer GH, why can't we offer money to the CI
> Vendors so we aren't really abusing their free tiers?
>

We already pay TravisCI, Inc. for a set of builders. We also have lots of
donated credits from multiple vendors, and donated build nodes. See
else-thread about "expand to consume all provided capacity".

> 3) Does GH track my activity in the ASF GH repos as part of the API usage
> for Apache?  IOW, am I adding to the ASF API count by closing an issue on
> github.com?  Or if I ran a script on my computer that closed the issue by
> using their API?
>

API usage is per-user, not about the target repo/org, so what *you* do has
no bearing upon limits for Foundation tooling. Good question.
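
For anyone who wants to see their own per-user budget, here is a small sketch that reads the rate-limit headers GitHub returns on REST API responses (`x-ratelimit-limit`, `x-ratelimit-remaining`, `x-ratelimit-reset`). The `RateLimit` class and `parse_rate_limit` helper are hypothetical names for illustration, and the headers are passed in as a plain mapping so nothing here touches the network.

```python
from dataclasses import dataclass


@dataclass
class RateLimit:
    limit: int        # calls allowed in the current window
    remaining: int    # calls left in the current window
    reset_epoch: int  # when the window resets, in UTC epoch seconds

    @property
    def used_fraction(self):
        """Fraction of the current window's calls already consumed."""
        if self.limit == 0:
            return 0.0
        return (self.limit - self.remaining) / self.limit


def parse_rate_limit(headers):
    """Build a RateLimit from response headers (header names are case-insensitive)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return RateLimit(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
    )
```

Because the counters belong to whichever account made the request, a committer's script and the Foundation's tooling would each see their own numbers.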


> I think builds.a.o is a great free service, but AIUI, the
> no-third-party-write-access rule is independent of whether CI is free or
> not.  I cannot pay money and get write-access to the ASF repos.
>

Yes and yes.

Downstream users trust Apache because of our provenance rules (per feedback
over the years). Spoiling that assurance spoils our reputation; that is
kind of at the heart of the issue for the Board to debate.

We can conceivably code our way into a proxy that creates limitations, but
$world that is using GitHub won't be using our proxy. Our builder nodes
that publish to the asf-site branch are within our control. They *do*
effectively use our established proxy/controls.

Welcome to the Infra world of CI/CD :p

Cheers,
-g

Re: [CI] What are the troubles projects face with CI and Infra

Posted by Chesnay Schepler <ch...@apache.org>.
I believe the write permission is used by CI services mostly to attach a 
GitHub Check for the build to the commit.
From what I know there's no dedicated permission for that.

The Flink project is actively pursuing having a separate repository for 
running CI, that is not owned by Apache.
The core motivation was that we were using too many ASF Travis
resources, and wanted to offload some of that to a sponsored account, 
and the (seemingly) _only_ way to do that (at least with Travis) was to 
have a separate org+repo.
Pull Requests (and in the future, branches) are mirrored by a bot into 
this repository, triggering builds,
the results of which are written as a comment into the PR / sent to the ML.
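
A rough sketch of the mirroring step described above. This is not the Flink bot: the remote names (`upstream`, `ci`) and the `ci_pr_<N>` branch naming are hypothetical. It relies only on GitHub exposing each pull request's head as `refs/pull/<N>/head`, and it builds the git commands without running them.

```python
def mirror_pr_commands(pr_number, source_remote="upstream", ci_remote="ci"):
    """Return the git commands that copy one PR head into the CI repository.

    source_remote: remote pointing at the repository that receives pull requests.
    ci_remote:     remote pointing at the separate repository where CI runs.
    """
    branch = f"ci_pr_{pr_number}"
    return [
        # Fetch the PR head into a local branch; the leading "+" allows
        # the branch to be overwritten when the PR is force-pushed.
        ["git", "fetch", source_remote,
         f"+refs/pull/{pr_number}/head:refs/heads/{branch}"],
        # Publish that branch to the CI repository, which triggers the build there.
        ["git", "push", "--force", ci_remote,
         f"refs/heads/{branch}:refs/heads/{branch}"],
    ]
```

The bot only ever writes to the CI repository; reporting back to the original PR as a comment needs no write access to the code itself.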

Should we rethink our approach?

On 04/02/2020 06:07, Kenneth Knowles wrote:
> (Top-posting a question that rewinds this thread a bit. Feel free to continue
> other discussion on the latest inline email)
>
> Why do so many tools require write access? It seems like there's at least
> *some* part of this that is a technical limitation... dare I say "error"?
>
> My years-stale understanding (from reviewable.io and codecov.io IIRC, both
> of which I would have loved to use but couldn't, and not just on ASF repos)
> was that the limitation was GitHub's ACLs being too coarse-grained. Is this
> still true? Do they know this is a big problem? Are they leaving things
> as-is deliberately or through lack of funding? OTOH my understanding of
> other tools (prow? Beam's defunct mergebot?) is that the tool itself really
> wants to manage the repo for you, queuing up merges and doing them, etc. I
> don't really know buildkite. It might be helpful to have a table on a wiki
> of where these tools fail the policies.
>
> Technical opinion: in normal git workflow as I see it, any person or *tool*
> that wishes to create a branch can do so in its own fork. Wanting to write
> to a branch in some other person's or org's fork is like wanting to write
> to their hard drive: there are reasons, but doing so has to be inextricable
> from your core functionality, or you are probably doing it wrong.
>
> Over the years, I've felt this pain of CI tools not being able to be used,
> but I have almost universally considered the *other* party to be the source
> of the pain, not ASF's very reasonable policies. Is ASF able to influence
> their roadmaps, or at least keep in touch about them? A combination of best
> practices amongst projects and tools that understand the whole point of git
> would go a long way.
>
> (I welcome opinions that I am just wrong and these CI tools are doing
> exactly the best thing they should be doing - that would be new and useful
> info for me)
>
> Kenn
>
> On Mon, Feb 3, 2020 at 7:50 PM David Nalley <da...@gnsa.us> wrote:
>
>> On Tue, Feb 4, 2020 at 4:29 AM Alex Harui <ah...@adobe.com.invalid>
>> wrote:
>>> Hopefully last set of questions for now...
>> Just wait, the rabbit hole gets deeper :)
>>
>>> 1) It sounds like there is a risk that as the ASF grows, GH may not be
>> able to grow with us.  Did I understand that correctly?
>>
>> GH CI may not be willing to continue giving us free usage. The current
>> free usage we have is limited, but they are willing to augment - to
>> what degree we aren't sure yet. We're talking with Github.
>> Github the VCS will always be free (at least for all versions of the
>> future that I can foresee short of Github being shuttered)
>>
>>
>>> 2) If we have money to offer GH, why can't we offer money to the CI
>> Vendors so we aren't really abusing their free tiers?
>>
>> We currently pay one CI vendor (Travis - the only one aside from GH
>> that doesn't need write access). We pay them 12k a year, and are
>> planning on increasing that spend in next year's budget.
>> We've discussed paying or getting cloud credits from both Azure and
>> AWS - but ran into the write access problem.
>> We're currently discussing with GH getting credits or paying them for
>> more Github Actions capacity.
>>
>>> 3) Does GH track my activity in the ASF GH repos as part of the API
>> usage for Apache?  IOW, am I adding to the ASF API count by closing an
>> issue on github.com?  Or if I ran a script on my computer that closed the
>> issue by using their API?
>>
>> No, it's tied to our user/IP address. Your actions likely won't come
>> close to our complex usage.
>>> I think builds.a.o is a great free service, but AIUI, the
>> no-third-party-write-access rule is independent of whether CI is free or
>> not.  I cannot pay money and get write-access to the ASF repos.  So I think
>> I'm trying to see if there is a solution even if it did cost money.
>> I should have been more explicit - we aren't opposed to spending money
>> on this, and do already spend some money. I'm worried that there is no
>> limit to the money that could be spent - particularly when people
>> don't have good insight into what their builds might cost the
>> Foundation. So for instance, there was a project at the ASF that
>> consumed 900 dollars/month of our 1000/month spend with Travis. They
>> didn't realize that they were consuming so much. They also didn't
>> realize that other projects were feeling the pain - they had optimized
>> their CI builds to execute really fast in Travis - essentially
>> concurrently consuming every builder. But the reality is that some
>> projects need more resources than others and allocating resources
>> appropriately becomes quite the challenge.
>>
>>> Thanks in advance,
>>> -Alex
>>>
>>> On 2/3/20, 7:03 PM, "David Nalley" <da...@gnsa.us> wrote:
>>>
>>>      On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <ah...@adobe.com.invalid>
>> wrote:
>>>      >
>>>      > Some questions inline.  Apologies in advance for not really
>> understanding this stuff.  I'm primarily a client-side developer.  My
>> projects do not have automated PR testing at this point in time.  I'm
>> mainly exploring in case we become popular enough some day to need it.
>>>      >
>>>      > My line of thinking is that MS has, at least for now, generously
>> provided free Azure VMs to ASF committers.  If N committers from a project
>> each get a VM, run CI on it, figure out some way to distribute PRs to those
>> VMs, is there a viable workflow?
>>>      >
>>>      > On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:
>>>      >
>>>      >     Hi Alex,
>>>      >
>>>      >     So this was explored. It creates some problems - first double
>> the
>>>      >     administration overhead - most of that is automated, but it
>> means that
>>>      >     our API usage doubles, and we're already hitting limits from
>> Github.
>>>      >
>>>      > Is that a max-traffic limit or a limit on traffic before we have
>> to start paying for usage?
>>>      Max number of calls - and we've tried offering up money, they don't
>>>      offer a product with more API calls. Greg has even raised this issue
>>>      all the way to the CEO of Github.
>>>
>>>      >
>>>      >     Second - at least one CI vendor thanked us for not doing that
>> exactly
>>>      >     - because the 'best' way to do it is to create an org per
>> project or
>>>      >     org per repo - and then the free tier is dedicated to that
>> org. Except
>>>      >     that's essentially abusing their free tier.
>>>      >
>>>      > Is "best" defined as lowest cost to the CI vendor or something
>> else?  What would the "second-best" scenario look like if there is one?
>>>      Best - well it's the cheapest for us, and it gives the most control
>> to
>>>      the projects. So great from that perspective, but likely a bit
>>>      unethical and abusive. It's essentially abusing all of the CI vendors'
>>>      generosity by horizontally scaling our consumption of their freebies
>>>      and using them per-repo or per project instead of per organization.
>>>
>>>
>>>      >
>>>      >     Finally - from a practical perspective, if everyone submits
>> PRs and
>>>      >     does testing against this apacheci org - that has become the
>> de facto
>>>      >     repo - it's where everyone is doing their work, and it makes
>>>      >     provenance tracking.
>>>      >
>>>      > Didn't the ASF have read-only mirrors of repos?  I think it led to
>> some confusion, but I think folks still figured out.
>>>      >
>>>
>>>      Not anymore.
>>>      We have an active-active copy of the repositories. People can
>> actively
>>>      commit against either our repos or the GH repos, and we magically
>> move
>>>      commits between the two. (There's an upcoming blog post on how all of
>>>      this magic works)
>>>
>>>      >     As an aside - the mandate for no write access is not an
>> infrastructure
>>>      >     policy, it's a legal affairs requirement - we're merely
>> implementing
>>>      >     it.
>>>      >
>>>      >     --David
>>>      >
>>>      >     On Tue, Feb 4, 2020 at 3:24 AM Alex Harui
>> <ah...@adobe.com.invalid> wrote:
>>>      >     >
>>>      >     > Moving board@ to BCC.  Attempting to move discussion to
>> builds@
>>>      >     >
>>>      >     > I’m fine with the ASF maintaining its position on stricter
>> provenance and therefore disallowing third-party write-access to repos.
>>>      >     >
>>>      >     > A suggestion was made, if I understood it correctly, to
>> create a whole other set of repos that could be written to by
>> third-parties.  Would such a thing work?  Then a committer would have to
>> manually bring commits back from that other set to the canonical repo.
>> That seems viable to me.
>>>      >     >
>>>      >     > A concern was raised that the project might cut its release
>> from the “other set”, but IMO, that would be ok if the release artifacts
>> could be verified, which should be possible by comparing the canonical repo
>> against the “other repo”, at least for the source package, and if there are
>> reproducible binaries, for the binary artifacts as well.
>>>      >     >
>>>      >     > Thoughts?
>>>      >     > -Alex
>>>      >     >
>>>      >     > From: Greg Stein <gs...@gmail.com>
>>>      >     > Reply-To: "board@apache.org" <bo...@apache.org>
>>>      >     > Date: Monday, February 3, 2020 at 5:17 PM
>>>      >     > To: "board@apache.org" <bo...@apache.org>
>>>      >     > Subject: Re: [CI] What are the troubles projects face with
>> CI and Infra
>>>      >     >
>>>      >     > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <aharui@adobe.com
>> <ma...@adobe.com>> wrote:
>>>      >     > >...
>>>      >     > How does Google or other non-ASF open source projects manage
>> the provenance tracking?
>>>      >     >
>>>      >     > Note that most F/OSS projects don't worry about provenance
>> to the level the Foundation worries. That affords them some flexibility
>> that our choices do not allow. Those projects may also choose to trust
>> tools with write access to their repositories, hoping they will not Do
>> Something Bad(tm). We have chosen to not provide that trust.
>>>      >     >
>>>      >     > IMO, I do not think the Foundation should relax its stance
>> on provenance, nor trust in third parties ... but that is one of the key
>> considerations [for the Board] at the heart of being able to leverage some
>> third party CI/CD services.
>>>      >     >
>>>      >     > Cheers,
>>>      >     > -g
>>>      >     >
>>>      >
>>>      >
>>>
>>>


Re: [CI] What are the troubles projects face with CI and Infra

Posted by Kenneth Knowles <ke...@apache.org>.
(Top-posting a question that rewinds this thread a bit. Feel free to continue
other discussion on the latest inline email)

Why do so many tools require write access? It seems like there's at least
*some* part of this that is a technical limitation... dare I say "error"?

My years-stale understanding (from reviewable.io and codecov.io IIRC, both
of which I would have loved to use but couldn't, and not just on ASF repos)
was that the limitation was GitHub's ACLs being too coarse-grained. Is this
still true? Do they know this is a big problem? Are they leaving things
as-is deliberately or through lack of funding? OTOH my understanding of
other tools (prow? Beam's defunct mergebot?) is that the tool itself really
wants to manage the repo for you, queuing up merges and doing them, etc. I
don't really know buildkite. It might be helpful to have a table on a wiki
of where these tools fail the policies.

Technical opinion: in normal git workflow as I see it, any person or *tool*
that wishes to create a branch can do so in its own fork. Wanting to write
to a branch in some other person's or org's fork is like wanting to write
to their hard drive: there are reasons, but doing so has to be inextricable
from your core functionality, or you are probably doing it wrong.

Over the years, I've felt this pain of CI tools not being able to be used,
but I have almost universally considered the *other* party to be the source
of the pain, not ASF's very reasonable policies. Is ASF able to influence
their roadmaps, or at least keep in touch about them? A combination of best
practices amongst projects and tools that understand the whole point of git
would go a long way.

(I welcome opinions that I am just wrong and these CI tools are doing
exactly the best thing they should be doing - that would be new and useful
info for me)

Kenn

On Mon, Feb 3, 2020 at 7:50 PM David Nalley <da...@gnsa.us> wrote:

> On Tue, Feb 4, 2020 at 4:29 AM Alex Harui <ah...@adobe.com.invalid>
> wrote:
> >
> > Hopefully last set of questions for now...
>
> Just wait, the rabbit hole gets deeper :)
>
> >
> > 1) It sounds like there is a risk that as the ASF grows, GH may not be
> able to grow with us.  Did I understand that correctly?
>
> GH CI may not be willing to continue giving us free usage. The current
> free usage we have is limited, but they are willing to augment - to
> what degree we aren't sure yet. We're talking with Github.
> Github the VCS will always be free (at least for all versions of the
> future that I can foresee short of Github being shuttered)
>
>
> > 2) If we have money to offer GH, why can't we offer money to the CI
> Vendors so we aren't really abusing their free tiers?
>
> We currently pay one CI vendor (Travis - the only one aside from GH
> that doesn't need write access). We pay them 12k a year, and are
> planning on increasing that spend in next year's budget.
> We've discussed paying or getting cloud credits from both Azure and
> AWS - but ran into the write access problem.
> We're currently discussing with GH getting credits or paying them for
> more Github Actions capacity.
>
> > 3) Does GH track my activity in the ASF GH repos as part of the API
> usage for Apache?  IOW, am I adding to the ASF API count by closing an
> issue on github.com?  Or if I ran a script on my computer that closed the
> issue by using their API?
>
> No, it's tied to our user/IP address. Your actions likely won't come
> close to our complex usage.
> >
> > I think builds.a.o is a great free service, but AIUI, the
> no-third-party-write-access rule is independent of whether CI is free or
> not.  I cannot pay money and get write-access to the ASF repos.  So I think
> I'm trying to see if there is a solution even if it did cost money.
> >
>
> I should have been more explicit - we aren't opposed to spending money
> on this, and do already spend some money. I'm worried that there is no
> limit to the money that could be spent - particularly when people
> don't have good insight into what their builds might cost the
> Foundation. So for instance, there was a project at the ASF that
> consumed 900 dollars/month of our 1000/month spend with Travis. They
> didn't realize that they were consuming so much. They also didn't
> realize that other projects were feeling the pain - they had optimized
> their CI builds to execute really fast in Travis - essentially
> concurrently consuming every builder. But the reality is that some
> projects need more resources than others and allocating resources
> appropriately becomes quite the challenge.
>
> > Thanks in advance,
> > -Alex
> >
> > On 2/3/20, 7:03 PM, "David Nalley" <da...@gnsa.us> wrote:
> >
> >     On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <ah...@adobe.com.invalid>
> wrote:
> >     >
> >     > Some questions inline.  Apologies in advance for not really
> understanding this stuff.  I'm primarily a client-side developer.  My
> projects do not have automated PR testing at this point in time.  I'm
> mainly exploring in case we become popular enough some day to need it.
> >     >
> >     > My line of thinking is that MS has, at least for now, generously
> provided free Azure VMs to ASF committers.  If N committers from a project
> each get a VM, run CI on it, figure out some way to distribute PRs to those
> VMs, is there a viable workflow?
> >     >
> >     > On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:
> >     >
> >     >     Hi Alex,
> >     >
> >     >     So this was explored. It creates some problems - first double
> the
> >     >     administration overhead - most of that is automated, but it
> means that
> >     >     our API usage doubles, and we're already hitting limits from
> Github.
> >     >
> >     > Is that a max-traffic limit or a limit on traffic before we have
> to start paying for usage?
> >
> >     Max number of calls - and we've tried offering up money, they don't
> >     offer a product with more API calls. Greg has even raised this issue
> >     all the way to the CEO of Github.
> >
> >     >
> >     >     Second - at least one CI vendor thanked us for not doing that
> exactly
> >     >     - because the 'best' way to do it is to create an org per
> project or
> >     >     org per repo - and then the free tier is dedicated to that
> org. Except
> >     >     that's essentially abusing their free tier.
> >     >
> >     > Is "best" defined as lowest cost to the CI vendor or something
> else?  What would the "second-best" scenario look like if there is one?
> >
> >     Best - well it's the cheapest for us, and it gives the most control
> to
> >     the projects. So great from that perspective, but likely a bit
> >     unethical and abusive. It's essentially abusing all of the CI vendors'
> >     generosity by horizontally scaling our consumption of their freebies
> >     and using them per-repo or per project instead of per organization.
> >
> >
> >     >
> >     >     Finally - from a practical perspective, if everyone submits
> PRs and
> >     >     does testing against this apacheci org - that has become the
> de facto
> >     >     repo - it's where everyone is doing their work, and it makes
> >     >     provenance tracking.
> >     >
> >     > Didn't the ASF have read-only mirrors of repos?  I think it led to
> some confusion, but I think folks still figured out.
> >     >
> >
> >     Not anymore.
> >     We have an active-active copy of the repositories. People can
> actively
> >     commit against either our repos or the GH repos, and we magically
> move
> >     commits between the two. (There's an upcoming blog post on how all of
> >     this magic works)
> >
> >     >     As an aside - the mandate for no write access is not an
> infrastructure
> >     >     policy, it's a legal affairs requirement - we're merely
> implementing
> >     >     it.
> >     >
> >     >     --David
> >     >
> >     >     On Tue, Feb 4, 2020 at 3:24 AM Alex Harui
> <ah...@adobe.com.invalid> wrote:
> >     >     >
> >     >     > Moving board@ to BCC.  Attempting to move discussion to
> builds@
> >     >     >
> >     >     > I’m fine with the ASF maintaining its position on stricter
> provenance and therefore disallowing third-party write-access to repos.
> >     >     >
> >     >     > A suggestion was made, if I understood it correctly, to
> create a whole other set of repos that could be written to by
> third-parties.  Would such a thing work?  Then a committer would have to
> manually bring commits back from that other set to the canonical repo.
> That seems viable to me.
> >     >     >
> >     >     > A concern was raised that the project might cut its release
> from the “other set”, but IMO, that would be ok if the release artifacts
> could be verified, which should be possible by comparing the canonical repo
> against the “other repo”, at least for the source package, and if there are
> reproducible binaries, for the binary artifacts as well.
> >     >     >
> >     >     > Thoughts?
> >     >     > -Alex
> >     >     >
> >     >     > From: Greg Stein <gs...@gmail.com>
> >     >     > Reply-To: "board@apache.org" <bo...@apache.org>
> >     >     > Date: Monday, February 3, 2020 at 5:17 PM
> >     >     > To: "board@apache.org" <bo...@apache.org>
> >     >     > Subject: Re: [CI] What are the troubles projects face with
> CI and Infra
> >     >     >
> >     >     > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <aharui@adobe.com
> <ma...@adobe.com>> wrote:
> >     >     > >...
> >     >     > How does Google or other non-ASF open source projects manage
> the provenance tracking?
> >     >     >
> >     >     > Note that most F/OSS projects don't worry about provenance
> to the level the Foundation worries. That affords them some flexibility
> that our choices do not allow. Those projects may also choose to trust
> tools with write access to their repositories, hoping they will not Do
> Something Bad(tm). We have chosen to not provide that trust.
> >     >     >
> >     >     > IMO, I do not think the Foundation should relax its stance
> on provenance, nor trust in third parties ... but that is one of the key
> considerations [for the Board] at the heart of being able to leverage some
> third party CI/CD services.
> >     >     >
> >     >     > Cheers,
> >     >     > -g
> >     >     >
> >     >
> >     >
> >
> >
>

Re: [CI] What are the troubles projects face with CI and Infra

Posted by David Nalley <da...@gnsa.us>.
On Tue, Feb 4, 2020 at 4:29 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>
> Hopefully last set of questions for now...

Just wait, the rabbit hole gets deeper :)

>
> 1) It sounds like there is a risk that as the ASF grows, GH may not be able to grow with us.  Did I understand that correctly?

GH CI may not be willing to continue giving us free usage. The current
free usage we have is limited, but they are willing to augment - to
what degree we aren't sure yet. We're talking with Github.
Github the VCS will always be free (at least for all versions of the
future that I can foresee short of Github being shuttered)


> 2) If we have money to offer GH, why can't we offer money to the CI Vendors so we aren't really abusing their free tiers?

We currently pay one CI vendor (Travis - the only one aside from GH
that doesn't need write access). We pay them 12k a year, and are
planning on increasing that spend in next year's budget.
We've discussed paying or getting cloud credits from both Azure and
AWS - but ran into the write access problem.
We're currently discussing with GH getting credits or paying them for
more Github Actions capacity.

> 3) Does GH track my activity in the ASF GH repos as part of the API usage for Apache?  IOW, am I adding to the ASF API count by closing an issue on github.com?  Or if I ran a script on my computer that closed the issue by using their API?

No, it's tied to our user/IP address. Your actions likely won't come
close to our complex usage.
>
> I think builds.a.o is a great free service, but AIUI, the no-third-party-write-access rule is independent of whether CI is free or not.  I cannot pay money and get write-access to the ASF repos.  So I think I'm trying to see if there is a solution even if it did cost money.
>

I should have been more explicit - we aren't opposed to spending money
on this, and do already spend some money. I'm worried that there is no
limit to the money that could be spent - particularly when people
don't have good insight into what their builds might cost the
Foundation. So for instance, there was a project at the ASF that
consumed 900 dollars/month of our 1000/month spend with Travis. They
didn't realize that they were consuming so much. They also didn't
realize that other projects were feeling the pain - they had optimized
their CI builds to execute really fast in Travis - essentially
concurrently consuming every builder. But the reality is that some
projects need more resources than others and allocating resources
appropriately becomes quite the challenge.
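
To put numbers on the example above: 12k a year is 1000 a month, so 900 a month is 90% of the budget. The sketch below computes each project's share and flags the heavy consumers. The project names and the 50% threshold are made up for illustration; only the 900 and 1000 figures come from this thread.

```python
def spend_shares(monthly_spend_by_project, monthly_budget):
    """Return each project's fraction of the monthly budget."""
    return {
        name: spend / monthly_budget
        for name, spend in monthly_spend_by_project.items()
    }


def over_threshold(shares, threshold):
    """Return the names of projects whose share exceeds the threshold, sorted."""
    return sorted(name for name, share in shares.items() if share > threshold)


# Figures from the thread: 12k per year, one project consuming 900 per month.
monthly_budget = 12000 / 12
# "project-a" and "everyone-else" are hypothetical labels.
spend = {"project-a": 900, "everyone-else": 100}
shares = spend_shares(spend, monthly_budget)
heavy = over_threshold(shares, 0.5)
```

A report like this would at least give projects the insight into their own consumption that they lacked here, even though it does not settle how capacity should be divided.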

> Thanks in advance,
> -Alex
>
> On 2/3/20, 7:03 PM, "David Nalley" <da...@gnsa.us> wrote:
>
>     On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>     >
>     > Some questions inline.  Apologies in advance for not really understanding this stuff.  I'm primarily a client-side developer.  My projects do not have automated PR testing at this point in time.  I'm mainly exploring in case we become popular enough some day to need it.
>     >
>     > My line of thinking is that MS has, at least for now, generously provided free Azure VMs to ASF committers.  If N committers from a project each get a VM, run CI on it, figure out some way to distribute PRs to those VMs, is there a viable workflow?
>     >
>     > On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:
>     >
>     >     Hi Alex,
>     >
>     >     So this was explored. It creates some problems - first double the
>     >     administration overhead - most of that is automated, but it means that
>     >     our API usage doubles, and we're already hitting limits from Github.
>     >
>     > Is that a max-traffic limit or a limit on traffic before we have to start paying for usage?
>
>     Max number of calls - and we've tried offering up money, they don't
>     offer a product with more API calls. Greg has even raised this issue
>     all the way to the CEO of Github.
>
>     >
>     >     Second - at least one CI vendor thanked us for not doing that exactly
>     >     - because the 'best' way to do it is to create an org per project or
>     >     org per repo - and then the free tier is dedicated to that org. Except
>     >     that's essentially abusing their free tier.
>     >
>     > Is "best" defined as lowest cost to the CI vendor or something else?  What would the "second-best" scenario look like if there is one?
>
>     Best - well it's the cheapest for us, and it gives the most control to
>     the projects. So great from that perspective, but likely a bit
>     unethical and abusive. It's essentially abusing all of the CI vendors'
>     generosity by horizontally scaling our consumption of their freebies
>     and using them per-repo or per project instead of per organization.
>
>
>     >
>     >     Finally - from a practical perspective, if everyone submits PRs and
>     >     does testing against this apacheci org - that has become the de facto
>     >     repo - it's where everyone is doing their work, and it makes
>     >     provenance tracking difficult.
>     >
>     > Didn't the ASF have read-only mirrors of repos?  I think it led to some confusion, but I think folks still figured it out.
>     >
>
>     Not anymore.
>     We have an active-active copy of the repositories. People can actively
>     commit against either our repos or the GH repos, and we magically move
>     commits between the two. (There's an upcoming blog post on how all of
>     this magic works)
>
>     >     As an aside - the mandate for no write access is not an infrastructure
>     >     policy, it's a legal affairs requirement - we're merely implementing
>     >     it.
>     >
>     >     --David
>     >
>     >     On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>     >     >
>     >     > Moving board@ to BCC.  Attempting to move discussion to builds@
>     >     >
>     >     > I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
>     >     >
>     >     > A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
>     >     >
>     >     > A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
>     >     >
>     >     > Thoughts?
>     >     > -Alex
>     >     >
>     >     > From: Greg Stein <gs...@gmail.com>
>     >     > Reply-To: "board@apache.org" <bo...@apache.org>
>     >     > Date: Monday, February 3, 2020 at 5:17 PM
>     >     > To: "board@apache.org" <bo...@apache.org>
>     >     > Subject: Re: [CI] What are the troubles projects face with CI and Infra
>     >     >
>     >     > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
>     >     > >...
>     >     > How does Google or other non-ASF open source projects manage the provenance tracking?
>     >     >
>     >     > Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
>     >     >
>     >     > IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
>     >     >
>     >     > Cheers,
>     >     > -g
>     >     >
>     >
>     >
>
>

Re: [CI] What are the troubles projects face with CI and Infra

Posted by Alex Harui <ah...@adobe.com.INVALID>.
Hopefully last set of questions for now...

1) It sounds like there is a risk that as the ASF grows, GH may not be able to grow with us.  Did I understand that correctly?
2) If we have money to offer GH, why can't we offer money to the CI Vendors so we aren't really abusing their free tiers?
3) Does GH track my activity in the ASF GH repos as part of the API usage for Apache?  IOW, am I adding to the ASF API count by closing an issue on github.com?  What about if I ran a script on my computer that closed the issue using their API?

I think builds.a.o is a great free service, but AIUI, the no-third-party-write-access rule is independent of whether CI is free or not.  I cannot pay money and get write-access to the ASF repos.  So I'm trying to see if there is a solution even if it did cost money.

Thanks in advance,
-Alex

On 2/3/20, 7:03 PM, "David Nalley" <da...@gnsa.us> wrote:

    On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <ah...@adobe.com.invalid> wrote:
    >
    > Some questions inline.  Apologies in advance for not really understanding this stuff.  I'm primarily a client-side developer.  My projects do not have automated PR testing at this point in time.  I'm mainly exploring in case we become popular enough some day to need it.
    >
    > My line of thinking is that MS has, at least for now, generously provided free Azure VMs to ASF committers.  If N committers from a project each get a VM, run CI on it, figure out some way to distribute PRs to those VMs, is there a viable workflow?
    >
    > On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:
    >
    >     Hi Alex,
    >
    >     So this was explored. It creates some problems - first double the
    >     administration overhead - most of that is automated, but it means that
    >     our API usage doubles, and we're already hitting limits from Github.
    >
    > Is that a max-traffic limit or a limit on traffic before we have to start paying for usage?
    
    Max number of calls - and we've tried offering up money, they don't
    offer a product with more API calls. Greg has even raised this issue
    all the way to the CEO of Github.
    
    >
    >     Second - at least one CI vendor thanked us for not doing that exactly
    >     - because the 'best' way to do it is to create an org per project or
    >     org per repo - and then the free tier is dedicated to that org. Except
    >     that's essentially abusing their free tier.
    >
    > Is "best" defined as lowest cost to the CI vendor or something else?  What would the "second-best" scenario look like if there is one?
    
    Best - well it's the cheapest for us, and it gives the most control to
    the projects. So great from that perspective, but likely a bit
    unethical and abusive. It's essentially abusing all of the CI vendors
    generosity by horizontally scaling our consumption of their freebies
    and using them per-repo or per project instead of per organization.
    
    
    >
    >     Finally - from a practical perspective, if everyone submits PRs and
    >     does testing against this apacheci org - that has become the de facto
    >     repo - it's where everyone is doing their work, and it makes
    >     provenance tracking harder.
    >
    > Didn't the ASF have read-only mirrors of repos?  I think it led to some confusion, but I think folks still figured out.
    >
    
    Not anymore.
    We have an active-active copy of the repositories. People can actively
    commit against either our repos or the GH repos, and we magically move
    commits between the two. (There's an upcoming blog post on how all of
    this magic works)
    
    >     As an aside - the mandate for no write access is not an infrastructure
    >     policy, it's a legal affairs requirement - we're merely implementing
    >     it.
    >
    >     --David
    >
    >     On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
    >     >
    >     > Moving board@ to BCC.  Attempting to move discussion to builds@
    >     >
    >     > I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
    >     >
    >     > A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
    >     >
    >     > A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
    >     >
    >     > Thoughts?
    >     > -Alex
    >     >
    >     > From: Greg Stein <gs...@gmail.com>
    >     > Reply-To: "board@apache.org" <bo...@apache.org>
    >     > Date: Monday, February 3, 2020 at 5:17 PM
    >     > To: "board@apache.org" <bo...@apache.org>
    >     > Subject: Re: [CI] What are the troubles projects face with CI and Infra
    >     >
    >     > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
    >     > >...
    >     > How does Google or other non-ASF open source projects manage the provenance tracking?
    >     >
    >     > Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
    >     >
    >     > IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
    >     >
    >     > Cheers,
    >     > -g
    >     >
    >
    >
    


Re: [CI] What are the troubles projects face with CI and Infra

Posted by David Nalley <da...@gnsa.us>.
On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>
> Some questions inline.  Apologies in advance for not really understanding this stuff.  I'm primarily a client-side developer.  My projects do not have automated PR testing at this point in time.  I'm mainly exploring in case we become popular enough some day to need it.
>
> My line of thinking is that MS has, at least for now, generously provided free Azure VMs to ASF committers.  If N committers from a project each get a VM, run CI on it, figure out some way to distribute PRs to those VMs, is there a viable workflow?
>
> On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:
>
>     Hi Alex,
>
>     So this was explored. It creates some problems - first double the
>     administration overhead - most of that is automated, but it means that
>     our API usage doubles, and we're already hitting limits from Github.
>
> Is that a max-traffic limit or a limit on traffic before we have to start paying for usage?

Max number of calls - and we've tried offering up money, but they don't
offer a product with more API calls. Greg has even raised this issue
all the way to the CEO of GitHub.

>
>     Second - at least one CI vendor thanked us for not doing that exactly
>     - because the 'best' way to do it is to create an org per project or
>     org per repo - and then the free tier is dedicated to that org. Except
>     that's essentially abusing their free tier.
>
> Is "best" defined as lowest cost to the CI vendor or something else?  What would the "second-best" scenario look like if there is one?

Best - well, it's the cheapest for us, and it gives the most control to
the projects. So great from that perspective, but likely a bit
unethical and abusive. It's essentially abusing all of the CI vendors'
generosity by horizontally scaling our consumption of their freebies
and using them per-repo or per-project instead of per-organization.


>
>     Finally - from a practical perspective, if everyone submits PRs and
>     does testing against this apacheci org - that has become the de facto
>     repo - it's where everyone is doing their work, and it makes
>     provenance tracking harder.
>
> Didn't the ASF have read-only mirrors of repos?  I think it led to some confusion, but I think folks still figured out.
>

Not anymore.
We have an active-active copy of the repositories. People can actively
commit against either our repos or the GH repos, and we magically move
commits between the two. (There's an upcoming blog post on how all of
this magic works)

>     As an aside - the mandate for no write access is not an infrastructure
>     policy, it's a legal affairs requirement - we're merely implementing
>     it.
>
>     --David
>
>     On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>     >
>     > Moving board@ to BCC.  Attempting to move discussion to builds@
>     >
>     > I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
>     >
>     > A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
>     >
>     > A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
>     >
>     > Thoughts?
>     > -Alex
>     >
>     > From: Greg Stein <gs...@gmail.com>
>     > Reply-To: "board@apache.org" <bo...@apache.org>
>     > Date: Monday, February 3, 2020 at 5:17 PM
>     > To: "board@apache.org" <bo...@apache.org>
>     > Subject: Re: [CI] What are the troubles projects face with CI and Infra
>     >
>     > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
>     > >...
>     > How does Google or other non-ASF open source projects manage the provenance tracking?
>     >
>     > Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
>     >
>     > IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
>     >
>     > Cheers,
>     > -g
>     >
>
>

Re: [CI] What are the troubles projects face with CI and Infra

Posted by Alex Harui <ah...@adobe.com.INVALID>.
Some questions inline.  Apologies in advance for not really understanding this stuff.  I'm primarily a client-side developer.  My projects do not have automated PR testing at this point in time.  I'm mainly exploring in case we become popular enough some day to need it.

My line of thinking is that MS has, at least for now, generously provided free Azure VMs to ASF committers.  If N committers from a project each get a VM, run CI on it, and figure out some way to distribute PRs to those VMs, is there a viable workflow?
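
One naive way to picture the "distribute PRs to those VMs" step, as a sketch only: assign each PR to a VM by its number. The VM names and the function below are hypothetical; nothing here reflects an existing ASF or Azure service.

```python
from typing import List


def assign_vm(pr_number: int, vms: List[str]) -> str:
    """Pick a VM for a PR deterministically, so re-runs of a PR land on the same VM."""
    if not vms:
        raise ValueError("no committer VMs available")
    return vms[pr_number % len(vms)]


# Hypothetical VM names, one per committer.
vms = ["vm-committer-a", "vm-committer-b", "vm-committer-c"]
for pr in (101, 102, 103):
    print(pr, "->", assign_vm(pr, vms))
```

This only covers the assignment step; it says nothing about how a VM would report results back without write access to the repo.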

On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:

    Hi Alex,
    
    So this was explored. It creates some problems - first double the
    administration overhead - most of that is automated, but it means that
    our API usage doubles, and we're already hitting limits from Github.

Is that a max-traffic limit or a limit on traffic before we have to start paying for usage?
    
    Second - at least one CI vendor thanked us for not doing that exactly
    - because the 'best' way to do it is to create an org per project or
    org per repo - and then the free tier is dedicated to that org. Except
    that's essentially abusing their free tier.

Is "best" defined as lowest cost to the CI vendor or something else?  What would the "second-best" scenario look like if there is one?
    
    Finally - from a practical perspective, if everyone submits PRs and
    does testing against this apacheci org - that has become the de facto
    repo - it's where everyone is doing their work, and it makes
    provenance tracking harder.
    
Didn't the ASF have read-only mirrors of repos?  I think it led to some confusion, but I think folks still figured it out.

    As an aside - the mandate for no write access is not an infrastructure
    policy, it's a legal affairs requirement - we're merely implementing
    it.
    
    --David
    
    On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
    >
    > Moving board@ to BCC.  Attempting to move discussion to builds@
    >
    > I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
    >
    > A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
    >
    > A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
    >
    > Thoughts?
    > -Alex
    >
    > From: Greg Stein <gs...@gmail.com>
    > Reply-To: "board@apache.org" <bo...@apache.org>
    > Date: Monday, February 3, 2020 at 5:17 PM
    > To: "board@apache.org" <bo...@apache.org>
    > Subject: Re: [CI] What are the troubles projects face with CI and Infra
    >
    > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
    > >...
    > How does Google or other non-ASF open source projects manage the provenance tracking?
    >
    > Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
    >
    > IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
    >
    > Cheers,
    > -g
    >
    


Re: [CI] What are the troubles projects face with CI and Infra

Posted by David Nalley <da...@gnsa.us>.
So from a technical perspective - we cannot limit access to specific
branches. There isn't much granularity in the ACLs for GitHub -
essentially we have to give away write access to the repo.

Our site building tools, which we've written, only write to a specific
branch - but that's a tool that we control, as opposed to the CI
tools.
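
A minimal sketch of that kind of self-imposed restriction, assuming a hypothetical publish branch name and hypothetical function names (this is not the actual ASF tooling). The check lives in the tool rather than in the repo's ACLs, which is why it only helps for tools we control:

```python
from typing import List

# Assumption: the one branch this tool is meant to write to.
ALLOWED_REF = "refs/heads/asf-site"


def guarded_push_command(remote: str, local_ref: str, target_ref: str) -> List[str]:
    """Build a git push command, refusing any target other than the allowed branch."""
    if target_ref != ALLOWED_REF:
        raise PermissionError(f"refusing to push to {target_ref}")
    return ["git", "push", remote, f"{local_ref}:{target_ref}"]


print(guarded_push_command("origin", "HEAD", "refs/heads/asf-site"))
```

A third-party CI tool holding the same credentials would not be bound by this check, since nothing on the server side enforces it.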

But again - the prohibition on write access to repos is not a
technical or policy position of Infrastructure; it's one that's set by
VP, Legal.

--David

On Tue, Feb 4, 2020 at 3:51 AM Dave Fisher <wa...@apache.org> wrote:
>
> Hi David,
>
> Does the idea of having a branch that does the CI, like asf-site, help out in this situation?
>
> If these workflows write into a branch that is always copied to and never is merged back then we would be good. It seems like we can track all “3rd party” commits in the gitbox and have a chance to see about the source of changes and flag anything questionable.
>
> Regards,
> Dave
>
> > On Feb 3, 2020, at 6:37 PM, David Nalley <da...@gnsa.us> wrote:
> >
> > Hi Alex,
> >
> > So this was explored. It creates some problems - first double the
> > administration overhead - most of that is automated, but it means that
> > our API usage doubles, and we're already hitting limits from Github.
> >
> > Second - at least one CI vendor thanked us for not doing that exactly
> > - because the 'best' way to do it is to create an org per project or
> > org per repo - and then the free tier is dedicated to that org. Except
> > that's essentially abusing their free tier.
> >
> > Finally - from a practical perspective, if everyone submits PRs and
> > does testing against this apacheci org - that has become the de facto
> > repo - it's where everyone is doing their work, and it makes
> > provenance tracking harder.
> >
> > As an aside - the mandate for no write access is not an infrastructure
> > policy, it's a legal affairs requirement - we're merely implementing
> > it.
> >
> > --David
> >
> > On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
> >>
> >> Moving board@ to BCC.  Attempting to move discussion to builds@
> >>
> >> I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
> >>
> >> A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
> >>
> >> A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
> >>
> >> Thoughts?
> >> -Alex
> >>
> >> From: Greg Stein <gs...@gmail.com>
> >> Reply-To: "board@apache.org" <bo...@apache.org>
> >> Date: Monday, February 3, 2020 at 5:17 PM
> >> To: "board@apache.org" <bo...@apache.org>
> >> Subject: Re: [CI] What are the troubles projects face with CI and Infra
> >>
> >> On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
> >>> ...
> >> How does Google or other non-ASF open source projects manage the provenance tracking?
> >>
> >> Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
> >>
> >> IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
> >>
> >> Cheers,
> >> -g
> >>
>

Re: [CI] What are the troubles projects face with CI and Infra

Posted by Dave Fisher <wa...@apache.org>.
Hi David,

Does the idea of having a branch that does the CI, like asf-site, help out in this situation?

If these workflows write into a branch that is only ever copied to and never merged back, then we would be good. It seems like we could track all “3rd party” commits in gitbox, and have a chance to check the source of changes and flag anything questionable.
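
As a rough sketch of the "flag anything questionable" step: given the commits seen on such a branch, list the ones whose author is not on a known list. The commit records, addresses, and function name are all made up for illustration; how gitbox would actually expose this data is an open question.

```python
from typing import Dict, List, Set


def flag_questionable(commits: List[Dict[str, str]], known_authors: Set[str]) -> List[Dict[str, str]]:
    """Return the commits whose author is not a known committer or approved tool."""
    return [c for c in commits if c["author"] not in known_authors]


# Hypothetical commits observed on the CI-only branch.
commits = [
    {"sha": "a1", "author": "ci-bot@example.invalid"},
    {"sha": "b2", "author": "unknown@example.invalid"},
]
known = {"ci-bot@example.invalid"}

flagged = flag_questionable(commits, known)
print([c["sha"] for c in flagged])
```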

Regards,
Dave

> On Feb 3, 2020, at 6:37 PM, David Nalley <da...@gnsa.us> wrote:
> 
> Hi Alex,
> 
> So this was explored. It creates some problems - first double the
> administration overhead - most of that is automated, but it means that
> our API usage doubles, and we're already hitting limits from Github.
> 
> Second - at least one CI vendor thanked us for not doing that exactly
> - because the 'best' way to do it is to create an org per project or
> org per repo - and then the free tier is dedicated to that org. Except
> that's essentially abusing their free tier.
> 
> Finally - from a practical perspective, if everyone submits PRs and
> does testing against this apacheci org - that has become the de facto
> repo - it's where everyone is doing their work, and it makes
> provenance tracking harder.
> 
> As an aside - the mandate for no write access is not an infrastructure
> policy, it's a legal affairs requirement - we're merely implementing
> it.
> 
> --David
> 
> On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>> 
>> Moving board@ to BCC.  Attempting to move discussion to builds@
>> 
>> I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
>> 
>> A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
>> 
>> A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
>> 
>> Thoughts?
>> -Alex
>> 
>> From: Greg Stein <gs...@gmail.com>
>> Reply-To: "board@apache.org" <bo...@apache.org>
>> Date: Monday, February 3, 2020 at 5:17 PM
>> To: "board@apache.org" <bo...@apache.org>
>> Subject: Re: [CI] What are the troubles projects face with CI and Infra
>> 
>> On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
>>> ...
>> How does Google or other non-ASF open source projects manage the provenance tracking?
>> 
>> Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
>> 
>> IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
>> 
>> Cheers,
>> -g
>> 


Re: [CI] What are the troubles projects face with CI and Infra

Posted by David Nalley <da...@gnsa.us>.
Hi Alex,

So this was explored. It creates some problems - first, it doubles the
administration overhead - most of that is automated, but it means that
our API usage doubles, and we're already hitting limits from GitHub.

Second - at least one CI vendor thanked us for not doing exactly that
- because the 'best' way to do it is to create an org per project or
org per repo - and then the free tier is dedicated to that org. Except
that's essentially abusing their free tier.

Finally - from a practical perspective, if everyone submits PRs and
does testing against this apacheci org - that has become the de facto
repo - it's where everyone is doing their work, and it makes
provenance tracking harder.

As an aside - the mandate for no write access is not an infrastructure
policy, it's a legal affairs requirement - we're merely implementing
it.

--David

On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <ah...@adobe.com.invalid> wrote:
>
> Moving board@ to BCC.  Attempting to move discussion to builds@
>
> I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
>
> A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties.  Would such a thing work?  Then a committer would have to manually bring commits back from that other set to the canonical repo.  That seems viable to me.
>
> A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
>
> Thoughts?
> -Alex
>
> From: Greg Stein <gs...@gmail.com>
> Reply-To: "board@apache.org" <bo...@apache.org>
> Date: Monday, February 3, 2020 at 5:17 PM
> To: "board@apache.org" <bo...@apache.org>
> Subject: Re: [CI] What are the troubles projects face with CI and Infra
>
> On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <ah...@adobe.com>> wrote:
> >...
> How does Google or other non-ASF open source projects manage the provenance tracking?
>
> Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
>
> IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
>
> Cheers,
> -g
>