Posted to dev@airflow.apache.org by Ash Berlin-Taylor <as...@apache.org> on 2022/02/18 15:06:59 UTC

Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Hi all,

I'd like to propose we start allowing more users to use the self-hosted 
runners -- they are much, much quicker at running test workflows. And by 
making promising people's tests run quicker, hopefully we can encourage 
them to make more PRs and continue on the path towards becoming a 
committer.

Currently only committers' and PMC members' test builds are run using the 
self-hosted runners; everyone else has to use the GitHub public 
runners. The "stuck in queue" issue doesn't plague us much anymore (I 
think?), but the main issue is still that the GitHub runners only have 
8GB of memory vs the 64GB of the self-hosted ones (half of which is used 
as RAM FS) and as a result they are much, much slower.

So I propose that we "allow users we trust" to run on the self-hosted 
runners. This is purposefully a lighter-weight process than adding 
those users to the Triage group (for which we need a "vote"/mailing 
list thread and then have to ask the ASF Infra team to make the change) 
and is essentially a way to make the contributing process nicer for those 
who have shown interest and promise.

I am thinking that this would often be used for "this person is making 
a number of good quality PRs, and is on the road to being a committer".

In terms of project process, all I'm envisaging is that this requires a 
PR to add someone's GitHub username to 
<https://github.com/apache/airflow/blob/366c66b8f6eddc0d22028ef494c62bb757bd8b8b/.github/workflows/ci.yml#L80-L123> 
and then the "normal" review process to get the change merged.

By adding a user to that list the committer/PMC member is saying "I am 
sponsoring this user and trust them to not be malicious".
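
As a rough illustration of the mechanism described above (not the actual
contents of ci.yml), the selection could be sketched as follows. The runner
labels, the example usernames and the select_runner helper are all
hypothetical and exist only for this sketch.

```python
# Illustrative sketch only: the real selection lives in the workflow file
# linked above. All names and labels here are assumptions.
PUBLIC_RUNNER = "ubuntu-20.04"      # assumed label for GitHub public runners
SELF_HOSTED_RUNNER = "self-hosted"  # assumed label for the project's runners

# Hypothetical allow-list; in the proposal this is a list of GitHub
# usernames that is changed only through a normally reviewed PR.
TRUSTED_USERS = frozenset({"example-committer", "example-sponsored-user"})


def select_runner(actor: str, trusted_users: frozenset = TRUSTED_USERS) -> str:
    """Return the runner label for a build triggered by `actor`."""
    # GitHub usernames are case-insensitive, so compare in lower case.
    normalized = {name.lower() for name in trusted_users}
    if actor.lower() in normalized:
        return SELF_HOSTED_RUNNER
    return PUBLIC_RUNNER
```

With this shape, sponsoring someone is a one-line change to the list, and
everyone not on it keeps using the public runners.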

There will be a bit more work to finish this off, namely we'll need to 
get <https://github.com/apache/airflow-ci-infra/pull/20> finished and 
working.

We should probably be aware that if we do this it will likely be 
"people we (committers) work with" in the first instance. Are we okay 
with that, even if they haven't yet contributed (much/at all) to 
Airflow?

Are there any other criteria that people think we should apply before 
adding users to this list?

Thoughts?


Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Howard Yoo <ho...@gmail.com>.
+1

> On Feb 18, 2022, at 9:18 AM, Kaxil Naik <ka...@gmail.com> wrote:
> 
> 
> +1

Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Kaxil Naik <ka...@gmail.com>.
+1


Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Kamil Breguła <dz...@gmail.com>.
+1

On Fri, Feb 25, 2022, 17:00 Josh Fell <jo...@astronomer.io.invalid>
wrote:

> +1

Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Josh Fell <jo...@astronomer.io.INVALID>.
+1

On Tue, Feb 22, 2022 at 11:52 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Pushing code - true. But opening a PR requires either a UI access or
> generated token (which has expiry and requires MFA). So you should not be
> able to open multiple PRs with stolen SSH key. Which limits potential
> damage you can do.
>
> With UI/token access you could actually make all our machines busy mining
> Bitcoin in no time.  With SSH key you would be limited to how many opened
> PRs the user already has.
>
> But regardless,  actually i think promoting MFA this way is just worthy
> anyway even if it is not strictly necessary - as pure security education,
> so +1 on that one Elad.
>
>
> J

Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Jarek Potiuk <ja...@potiuk.com>.
Pushing code - true. But opening a PR requires either UI access or a
generated token (which has an expiry and requires MFA). So you should not be
able to open multiple PRs with a stolen SSH key, which limits the potential
damage you can do.

With UI/token access you could actually make all our machines busy mining
Bitcoin in no time. With an SSH key you would be limited to however many open
PRs the user already has.

But regardless, I actually think promoting MFA this way is worthwhile
anyway, even if it is not strictly necessary - as pure security education,
so +1 on that one, Elad.


J

On Tue, 22 Feb 2022 at 16:27, Ash Berlin-Taylor <as...@apache.org>
wrote:

> MFA doesn't do much to protect against malicious code pushes via stolen
> SSH keys so I don't think this is a necessary pre-condition, but we could
> do it anyway (or ensure MFA is on by inviting them to another GH org)
>
> -a

Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Ash Berlin-Taylor <as...@apache.org>.
MFA doesn't do much to protect against malicious code pushes via stolen 
SSH keys so I don't think this is a necessary pre-condition, but we 
could do it anyway (or ensure MFA is on by inviting them to another GH 
org)

-a

On 22 February 2022 08:53:00 GMT, Elad Kalif <el...@apache.org> wrote:
> +1
> 
> I would like to suggest we make a requirement that the trusted user 
> has 2F authentication 
> <https://docs.github.com/en/authentication/securing-your-account-with-two-factor-authentication-2fa/configuring-two-factor-authentication> 
> set to reduce the risk of hacking the account.
> I think by joining the triage group ASF verifies it so there are some 
> benefits to it. If we decide against the triage group then I guess 
> the sponsor should verify it(?)




Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Elad Kalif <el...@apache.org>.
+1

I would like to suggest we make it a requirement that the trusted user has
two-factor authentication
<https://docs.github.com/en/authentication/securing-your-account-with-two-factor-authentication-2fa/configuring-two-factor-authentication>
enabled, to reduce the risk of the account being hacked.
I think ASF verifies this when someone joins the triage group, so there are
some benefits to that route. If we decide against the triage group then I
guess the sponsor should verify it(?)

On Tue, Feb 22, 2022 at 12:11 AM Vikram Koka <vi...@astronomer.io.invalid>
wrote:

> +1
>
> I really like the road to committership perspective on this as well!
> I also like the suggestion of a periodic (ideally automated) clean up of
> this list for inactive users and committers as well.
>
>
>
> On Fri, Feb 18, 2022 at 7:54 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> +1. That will be a huge one and actually adds an interesting "path to
>> committership" intermediate step. Being a committer (for code) is
>> mostly about trusting the person to approve others' code.
>> This will be "trust that they can run their code on our
>> infrastructure".
>>
>> The nice thing is that it is all trackable as well - which is reason
>> enough for no one to do anything bad. And those permissions cannot
>> really do much more than affect the results of other CI
>> builds and potentially abuse the power of our self-hosted machines
>> (which is not worth abusing for just this one single repo - it would
>> only make sense if you found a way of abusing it en masse).
>>
>> And if we trust people also because they are in a work relationship -
>> this is quite a good "trust" reason.
>>
>> One thing that I would add - it could be great to add a periodic
>> (automated?) cleanup of the list for inactive users (maybe even
>> inactive committers in this case)? This will - in the long run - make
>> the list more "accurate" and easier to maintain. Also, there is the
>> (unlikely) event that a GitHub user changes their username; there is
>> currently no good protection against someone "taking over" the old
>> GitHub name, so keeping the list to only "recently active" users might
>> help with that a lot.
>>
>>
>> J.
>>
>> On Fri, Feb 18, 2022 at 4:07 PM Ash Berlin-Taylor <as...@apache.org> wrote:
>> >
>> > Hi all,
>> >
>> > I'd like to propose we start allowing more users to use the self-hosted
>> runners -- they are much much quicker to run test workflows. And by making
>> promising people's tests run quicker hopefully we can encourage them to
>> make more PRs and continue on the path towards becoming a committer.
>> >
>> > Currently only committers and PMC members' test builds are run using the
>> self-hosted runners, everyone else has to use the GitHub public runners.
>> The "stuck in queue" issue doesn't plague us much anymore (I think?), but
>> the main issue is still that the GitHub runners only have 8GB vs the 64GB
>> of the self-hosted (half of which is used as RAM FS) and as a result they
>> are much, much slower.
>> >
>> > So I propose that we "allow users we trust" to run on the self-hosted
>> runners. This is purposefully a lighter weight process than adding those
>> users to the Triage group (which we need to have a "vote"/mailing list for
>> and then ask ASF Infra team to make the changes to) and is essentially a
>> way to make the contributing process nicer for those that have shown
>> interest and promise.
>> >
>> > I am thinking that this would often be used for "this person is making
>> a number of good quality PRs, and is on the road to being a committer".
>> >
>> > In terms of project process, all I'm envisaging is that this requires a
>> PR to add someone's GitHub username to
>> https://github.com/apache/airflow/blob/366c66b8f6eddc0d22028ef494c62bb757bd8b8b/.github/workflows/ci.yml#L80-L123
>> and then the "normal" review process to get the change merged.
>> >
>> > By adding a user to that list the committer/PMC member is saying "I am
>> sponsoring this user and trust them to not be malicious".
>> >
>> > There will be a bit more work to finish this off, namely we'll need to
>> get https://github.com/apache/airflow-ci-infra/pull/20 finished and
>> working.
>> >
>> > We should probably be aware that if we do this it will likely be
>> "people we (committers) work with" in the first instance. Are we okay with
>> that, even if they haven't yet contributed (much/at all) to Airflow?
>> >
>> > Are there any other criteria that people think we should apply before
>> adding users to this list?
>> >
>> > Thoughts?
>>
>

Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Vikram Koka <vi...@astronomer.io.INVALID>.
+1

I really like the road to committership perspective on this as well!
I also like the suggestion of a periodic (ideally automated) clean up of
this list for inactive users and committers as well.



On Fri, Feb 18, 2022 at 7:54 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> +1. That will be a huge one and actually adds an interesting "path to
> committership" intermediate step. Being a committer (for code) is
> mostly about trust that the person can approve others' code.
> This will be "trust that they can run their code on our
> infrastructure".
>
> The nice thing about it is that it is all trackable as well - which is
> reason enough for no-one to do anything bad. And those permissions cannot
> really do much more than affect the results of other CI
> builds and potentially allow abusing the power of our self-hosted machines
> (which is not worth abusing for just this one single repo - it would
> only make sense if you found a way of abusing it at scale).
>
> And if we also trust people because they are in a work relationship -
> this is quite a good "trust" reason.
>
> One thing that I would add - it would be great to add a periodic
> (automated?) cleanup of the list for inactive users (maybe even
> inactive committers in this case). This will - in the long run - make
> the list more "accurate" and easier to maintain. Also, there is the
> (unlikely) event of a GitHub user changing their username: there is
> currently no good protection against someone else "taking over" the
> old GitHub name, so keeping the list to only "recently active" users
> might help with that a lot.
>
>
> J.
>
> On Fri, Feb 18, 2022 at 4:07 PM Ash Berlin-Taylor <as...@apache.org> wrote:
> >
> > Hi all,
> >
> > I'd like to propose we start allowing more users to use the self-hosted
> runners -- they are much much quicker to run test workflows. And by making
> promising people's tests run quicker hopefully we can encourage them to
> make more PRs and continue on the path towards becoming a committer.
> >
> > Currently only committers and PMC members' test builds are run using the
> self-hosted runners, everyone else has to use the GitHub public runners.
> The "stuck in queue" issue doesn't plague us much anymore (I think?), but
> the main issue is still that the GitHub runners only have 8GB vs the 64GB
> of the self-hosted (half of which is used as RAM FS) and as a result they
> are much, much slower.
> >
> > So I propose that we "allow users we trust" to run on the self-hosted
> runners. This is purposefully a lighter weight process than adding those
> users to the Triage group (which we need to have a "vote"/mailing list for
> and then ask ASF Infra team to make the changes to) and is essentially a
> way to make the contributing process nicer for those that have shown
> interest and promise.
> >
> > I am thinking that this would often be used for "this person is making a
> number of good quality PRs, and is on the road to being a committer".
> >
> > In terms of project process, all I'm envisaging is that this requires a
> PR to add someone's GitHub username to
> https://github.com/apache/airflow/blob/366c66b8f6eddc0d22028ef494c62bb757bd8b8b/.github/workflows/ci.yml#L80-L123
> and then the "normal" review process to get the change merged.
> >
> > By adding a user to that list the committer/PMC member is saying "I am
> sponsoring this user and trust them to not be malicious".
> >
> > There will be a bit more work to finish this off, namely we'll need to
> get https://github.com/apache/airflow-ci-infra/pull/20 finished and
> working.
> >
> > We should probably be aware that if we do this it will likely be "people
> we (committers) work with" in the first instance. Are we okay with that,
> even if they haven't yet contributed (much/at all) to Airflow?
> >
> > Are there any other criteria that people think we should apply before
> adding users to this list?
> >
> > Thoughts?
>

Re: Improving contributor experience for "trusted" users -- faster CI by using self-hosted runners

Posted by Jarek Potiuk <ja...@potiuk.com>.
+1. That will be a huge one and actually adds an interesting "path to
committership" intermediate step. Being a committer (for code) is
mostly about trust that the person can approve others' code.
This will be "trust that they can run their code on our
infrastructure".

The nice thing about it is that it is all trackable as well - which is
reason enough for no-one to do anything bad. And those permissions cannot
really do much more than affect the results of other CI
builds and potentially allow abusing the power of our self-hosted machines
(which is not worth abusing for just this one single repo - it would
only make sense if you found a way of abusing it at scale).

And if we also trust people because they are in a work relationship -
this is quite a good "trust" reason.

One thing that I would add - it would be great to add a periodic
(automated?) cleanup of the list for inactive users (maybe even
inactive committers in this case). This will - in the long run - make
the list more "accurate" and easier to maintain. Also, there is the
(unlikely) event of a GitHub user changing their username: there is
currently no good protection against someone else "taking over" the
old GitHub name, so keeping the list to only "recently active" users
might help with that a lot.
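
To make the periodic cleanup idea a bit more concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration only:
the prune_inactive helper, the 180-day threshold and the example usernames
are hypothetical, and where the "last active" timestamps come from is left
open.

```python
from datetime import datetime, timedelta, timezone


def prune_inactive(last_active, now, max_idle=timedelta(days=180)):
    """Split usernames into (keep, drop) by how recently each was active.

    last_active maps a GitHub username to the time of their last activity.
    The 180-day default is an arbitrary, hypothetical threshold.
    """
    keep, drop = [], []
    for user, seen in sorted(last_active.items()):
        if now - seen <= max_idle:
            keep.append(user)
        else:
            drop.append(user)
    return keep, drop


# Hypothetical example data, not real contributors.
now = datetime(2022, 2, 18, tzinfo=timezone.utc)
last_active = {
    "alice-example": datetime(2022, 1, 30, tzinfo=timezone.utc),
    "bob-example": datetime(2021, 3, 1, tzinfo=timezone.utc),
}

keep, drop = prune_inactive(last_active, now)
print("keep:", keep)
print("drop:", drop)
```

A scheduled job could run something like this and open a PR that removes
the "drop" names, so that removals go through the same review as additions.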


J.

On Fri, Feb 18, 2022 at 4:07 PM Ash Berlin-Taylor <as...@apache.org> wrote:
>
> Hi all,
>
> I'd like to propose we start allowing more users to use the self-hosted runners -- they are much much quicker to run test workflows. And by making promising people's tests run quicker hopefully we can encourage them to make more PRs and continue on the path towards becoming a committer.
>
> Currently only committers and PMC members' test builds are run using the self-hosted runners, everyone else has to use the GitHub public runners. The "stuck in queue" issue doesn't plague us much anymore (I think?), but the main issue is still that the GitHub runners only have 8GB vs the 64GB of the self-hosted (half of which is used as RAM FS) and as a result they are much, much slower.
>
> So I propose that we "allow users we trust" to run on the self-hosted runners. This is purposefully a lighter weight process than adding those users to the Triage group (which we need to have a "vote"/mailing list for and then ask ASF Infra team to make the changes to) and is essentially a way to make the contributing process nicer for those that have shown interest and promise.
>
> I am thinking that this would often be used for "this person is making a number of good quality PRs, and is on the road to being a committer".
>
> In terms of project process, all I'm envisaging is that this requires a PR to add someone's GitHub username to https://github.com/apache/airflow/blob/366c66b8f6eddc0d22028ef494c62bb757bd8b8b/.github/workflows/ci.yml#L80-L123 and then the "normal" review process to get the change merged.
>
> By adding a user to that list the committer/PMC member is saying "I am sponsoring this user and trust them to not be malicious".
>
> There will be a bit more work to finish this off, namely we'll need to get https://github.com/apache/airflow-ci-infra/pull/20 finished and working.
>
> We should probably be aware that if we do this it will likely be "people we (committers) work with" in the first instance. Are we okay with that, even if they haven't yet contributed (much/at all) to Airflow?
>
> Are there any other criteria that people think we should apply before adding users to this list?
>
> Thoughts?