Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2021/04/07 05:23:59 UTC

Increase the number of parallel jobs in GitHub Actions at ASF organization level

Hi all,

I am an Apache Spark PMC, and would like to know the future plan about
GitHub Actions in ASF.
Please also see the INFRA ticket I filed:
https://issues.apache.org/jira/browse/INFRA-21646.

I am aware of the limited GitHub Actions resources that are shared
across all projects in ASF,
and many projects suffer from it. This issue significantly slows down the
development cycle of
 other projects, at least Apache Spark.

How do we plan to increase the resources in GitHub Actions, and what are
the blockers? I would appreciate any input and thoughts on this.

Thank you so much.

CC'ing Spark @dev <de...@spark.apache.org> for more visibility. Please take
it out if considered inappropriate.

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
> So it all has to start with 'per-project' resource limitation and
> self-budgeting. It would be GREAT if infra could provide self-hosted GitHub
> Runners SERVICE per project, where project could donate credits or money
> for their own account, then the projects would have incentive to optimize
> their own usage. I imagine this would be the best thing since the sliced
> bread that INFRA could provide to all the projects.

Thanks Jarek. I think this sounds reasonable and realistic to me. +1



On Wed, Apr 7, 2021 at 10:30 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> Thanks Martin for your feedback.
>
> > What was your reason to migrate from Apache Jenkins to Github Actions ?
>
> I am sure there were more reasons for migrating from AMPLab Jenkins
> <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far as
> I can remember:
> - To reduce the maintenance cost of machines
> - The Jenkins machines became unstable and slow causing CI jobs to fail or
> be very flaky.
> - Difficulty to manage the installed libraries.
> - Intermittent unknown issues in the machines
>
> Yes, one option might be to migrate again to another CI service.
> However, other projects will very likely suffer from the
> same problem. In addition, migrating a large project is not
> easy work.
>
> I would like to know the feasibility of having more resources in GitHub
> Actions, or, for example, having sub-groups where
> each group shares the resources - currently one GitHub organisation shares
> all resources across the projects.
>
>
> On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
>
>>
>>
>> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> Hi Greg,
>>>
>>> I raised this thread to figure out a way that we can work together to
>>> resolve this issue, gather feedback, and understand how other projects
>>> work around it.
>>> Several projects I observed, as far as I can tell, have made enough
>>> effort to save resources in GitHub Actions but still suffer from the
>>> lack of resources.
>>>
>>
>> And it will get even worse because:
>> 1) more and more Apache projects migrate from TravisCI to Github Actions
>> (GA)
>> 2) new projects join ASF and many of them already use GA
>>
>>
>> What was your reason to migrate from Apache Jenkins to Github Actions ?
>> If you want dedicated resources then you will need to manage the CI
>> yourself.
>> You could use Apache Jenkins/Buildbot with dedicated agents for your
>> project.
>> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
>> Concourse CI, ...
>>
>> Yet another option is to move to CircleCI or Cirrus. They are similar to
>> TravisCI / GA and less crowded (for now).
>>
>> Martin
>>
>>> I appreciate the resources provided to us but that does not resolve the
>>> issue of the development being slowed down.
>>>
>>>
>>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>>>
>>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>>> >
>>> >> Hi all,
>>> >>
>>> >> I am an Apache Spark PMC,
>>> >
>>> >
>>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please stop
>>> > with that terminology. The Foundation has about 200 PMCs, and you are a
>>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC is a
>>> > construct of the Foundation.
>>> >
>>> > >...
>>> >
>>> >> I am aware of the limited GitHub Actions resources that are shared
>>> >> across all projects in ASF,
>>> >> and many projects suffer from it. This issue significantly slows down the
>>> >> development cycle of
>>> >>  other projects, at least Apache Spark.
>>> >>
>>> >
>>> > And the Foundation gets those build minutes for GitHub Actions provided to
>>> > us from GitHub and Microsoft, and we are thankful that they provide them to
>>> > the Foundation. Maybe it isn't all the build minutes that every group
>>> > wants, but that is what we have. So it is incumbent upon all of us to
>>> > figure out how to build more, with fewer minutes.
>>> >
>>> > Say "thank you" to GitHub, please.
>>> >
>>> > Regards,
>>> > -g
>>> >
>>> >
>>>
>>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by shane knapp ☠ <sk...@berkeley.edu>.
On Wed, Apr 7, 2021 at 6:30 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Thanks Martin for your feedback.
>
> > What was your reason to migrate from Apache Jenkins to Github Actions ?
>
> I am sure there were more reasons for migrating from Amplap Jenkins
> <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far as
> I can remember:
> - To reduce the maintenance cost of machines
> - The Jenkins machines became unstable and slow causing CI jobs to fail or
> be very flaky.
> - Difficulty to manage the installed libraries.
> - Intermittent unknown issues in the machines
>
also:

- uc berkeley has been hosting the build system for spark for ~10 years
"free of charge"
- funding for the build system is going away (amplab funded first, riselab
second)
- i have been managing the build system solo for 7 years and my job is much
different now...
- since there are no funds coming from research labs, i am unable to staff
the build system past 2021 (tbh, even this year is a stretch)
- the hardware is far past EOL and literally falling over
- jenkins is, and always will be a PITA to run

shane
-- 
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
On Wed, Apr 7, 2021 at 18:45, ocket 8888 <oc...@gmail.com> wrote:

> If your project can afford it, you can add self-hosted GHA runners:
>
> https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners
> The issue with that being that the machine running your actions will
> necessarily have write access to the repository through the API, so you
> can't just use a server donated by a company . I'm not sure if there's a
> way to limit its access based on what your actions actually need, you'll
> need to consult the documentation on that topic.
>
> What might be neat is to see some of the infra resources currently
> allocated to Jenkins - if not needed to keep up with load, which is
> something I have no idea about - be repurposed as trusted self-hosted GHA
> runners.
>

Just to repeat that again: it is the official recommendation from GitHub
NOT TO USE self-hosted runners for public repos. We had to fork the runner and
modify it to use it in Airflow. We even had a discussion with a GitHub
Developer Advocate, organized by Gavin, and they said 'we can't expect any
security improvement here in a foreseeable time'. But maybe infra can use
our fork of the runner and host it :).

The problem with that is that if this is 'free', the projects will have no
incentive to optimize their builds - they will quickly use whatever is
available. I have said this many times and I will say it again: building wider
free highways causes more traffic and does not stop traffic jams. The jams ease
a little initially but quickly come back to what they were before.

So it all has to start with 'per-project' resource limitation and
self-budgeting. It would be GREAT if infra could provide a self-hosted GitHub
Runners SERVICE per project, where a project could donate credits or money
for its own account; then the projects would have an incentive to optimize
their own usage. I imagine this would be the best thing since sliced
bread that INFRA could provide to all the projects.
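
To make the per-project budgeting idea above concrete, here is a minimal
sketch of what such accounting might look like. Everything in it is an
assumption made for illustration: the class names, the use of runner minutes
as the unit, and the numbers are hypothetical and do not describe any service
that INFRA or GitHub actually provides.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectBudget:
    """Hypothetical per-project allowance of runner minutes."""
    name: str
    minutes_left: float


@dataclass
class RunnerPool:
    """Hypothetical pool that charges each job to its own project."""
    budgets: dict = field(default_factory=dict)

    def donate(self, project: str, minutes: float) -> None:
        # A project tops up only its own account.
        budget = self.budgets.setdefault(project, ProjectBudget(project, 0.0))
        budget.minutes_left += minutes

    def try_run(self, project: str, estimated_minutes: float) -> bool:
        # A job is admitted only if its own project can pay for it,
        # so one project's heavy usage cannot starve the others.
        budget = self.budgets.get(project)
        if budget is None or budget.minutes_left < estimated_minutes:
            return False
        budget.minutes_left -= estimated_minutes
        return True


pool = RunnerPool()
pool.donate("spark", 100)
pool.donate("airflow", 30)
print(pool.try_run("spark", 80))    # True: within spark's budget
print(pool.try_run("spark", 80))    # False: spark's own budget is spent
print(pool.try_run("airflow", 30))  # True: unaffected by spark's usage
```

The point of the sketch is only that the limit is enforced per project, so
the incentive to optimize sits with the project that spends the minutes.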

Here is more info about the Airflow version of the runner and why it is REALLY
dangerous to use unmodified self-hosted runners. For the past few days hackers
HAVE BEEN MINING CRYPTOCURRENCY using pull requests to public repos. I warned
against this scenario for months and now IT IS HAPPENING.

2) We've forked ([~ash]) the GitHub runner to add a security layer that only
allows our maintainers to run PR builds on it: https://github.com/ashb/runner
This is important because we already know that hackers use GitHub Actions
to mine cryptocurrencies:
https://www.bleepingcomputer.com/news/security/github-actions-being-actively-abused-to-mine-cryptocurrency-on-github-servers/
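
The kind of check such a security layer performs can be sketched roughly as
follows. This is not the actual logic of the forked runner; the function
names, the allow-list and the returned labels are hypothetical and only
illustrate the idea of letting known maintainers, and nobody else, reach the
project's own machines.

```python
from typing import Iterable


def is_trusted_author(pr_author: str, committers: Iterable[str]) -> bool:
    """Return True only if the pull request author is on the allow-list."""
    # GitHub logins are case-insensitive, so compare in lower case.
    allowed = {login.lower() for login in committers}
    return pr_author.lower() in allowed


def choose_runner(pr_author: str, committers: Iterable[str]) -> str:
    # Jobs from unknown authors never reach the project's own machines;
    # in this sketch they fall back to the shared GitHub-hosted pool.
    if is_trusted_author(pr_author, committers):
        return "self-hosted"
    return "github-hosted"


committers = ["alice", "Bob"]  # hypothetical logins
print(choose_runner("bob", committers))      # self-hosted
print(choose_runner("mallory", committers))  # github-hosted
```

A real deployment also has to keep the allow-list in sync with the actual
committer list, which is exactly the kind of maintenance cost to expect.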


> On Wed, Apr 7, 2021 at 7:31 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> > Thanks Martin for your feedback.
> >
> > > What was your reason to migrate from Apache Jenkins to Github Actions ?
> >
> > I am sure there were more reasons for migrating from Amplap Jenkins
> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far
> as
> > I
> > can remember:
> > - To reduce the maintenance cost of machines
> > - The Jenkins machines became unstable and slow causing CI jobs to fail
> or
> > be very flaky.
> > - Difficulty to manage the installed libraries.
> > - Intermittent unknown issues in the machines
> >
> > Yes, one option might be to consider other options to migrate again.
> > However, other projects will very likely suffer the
> > same problem. In addition, the migration in a large project is not an
> > easy work to do
> >
> > I would like to know the feasibility of having more resources in GitHub
> > Actions, or, for example, having sub-groups where
> > each group shares the resources - currently one GitHub organisation
> shares
> > all resources across the projects.
> >
> >
> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
> >
> > >
> > >
> > > On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> > >
> > >> Hi Greg,
> > >>
> > >> I raised this thread to figure out a way that we can work together to
> > >> resolve this issue, gather feedback, and to understand how other
> > projects
> > >> work around.
> > >> Several projects I observed, as far as I can tell, have made enough
> > >> efforts
> > >> to save the resources in GitHub Actions but still suffer from the lack
> > of
> > >> resources.
> > >>
> > >
> > > And it will get even worse because:
> > > 1) more and more Apache projects migrate from TravisCI to Github
> Actions
> > > (GA)
> > > 2) new projects join ASF and many of them already use GA
> > >
> > >
> > > What was your reason to migrate from Apache Jenkins to Github Actions ?
> > > If you want dedicated resources then you will need to manage the CI
> > > yourself.
> > > You could use Apache Jenkins/Buildbot with dedicated agents for your
> > > project.
> > > Or you could set up your own CI infrastructure with Jenkins, DroneIO,
> > > ConcourceCI, ...
> > >
> > > Yet another option is to move to CircleCI or Cirrus. They are similar
> to
> > > TravisCI / GA and less crowded (for now).
> > >
> > > Martin
> > >
> > > I appreciate the resources provided to us but that does not resolve the
> > >> issue of the development being slowed down.
> > >>
> > >>
> > >> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
> > >>
> > >> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
> > >> wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> I am an Apache Spark PMC,
> > >> >
> > >> >
> > >> > You are a member of the Apache Spark PMC. You are *not* a PMC.
> Please
> > >> stop
> > >> > with that terminology. The Foundation has about 200 PMCs, and you
> are
> > a
> > >> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
> > is
> > >> a
> > >> > construct of the Foundation.
> > >> >
> > >> > >...
> > >> >
> > >> >> I am aware of the limited GitHub Actions resources that are shared
> > >> >> across all projects in ASF,
> > >> >> and many projects suffer from it. This issue significantly slows
> down
> > >> the
> > >> >> development cycle of
> > >> >>  other projects, at least Apache Spark.
> > >> >>
> > >> >
> > >> > And the Foundation gets those build minutes for GitHub Actions
> > provided
> > >> to
> > >> > us from GitHub and Microsoft, and we are thankful that they provide
> > >> them to
> > >> > the Foundation. Maybe it isn't all the build minutes that every
> group
> > >> > wants, but that is what we have. So it is incumbent upon all of us
> to
> > >> > figure out how to build more, with fewer minutes.
> > >> >
> > >> > Say "thank you" to GitHub, please.
> > >> >
> > >> > Regards,
> > >> > -g
> > >> >
> > >> >
> > >>
> > >
> >
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by ocket 8888 <oc...@gmail.com>.
If your project can afford it, you can add self-hosted GHA runners:
https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners
The issue with that is that the machine running your actions will
necessarily have write access to the repository through the API, so you
can't just use a server donated by a company. I'm not sure if there's a
way to limit its access based on what your actions actually need; you'll
need to consult the documentation on that topic.

What might be neat is to see some of the infra resources currently
allocated to Jenkins - if not needed to keep up with load, which is
something I have no idea about - be repurposed as trusted self-hosted GHA
runners.

On Wed, Apr 7, 2021 at 7:31 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Thanks Martin for your feedback.
>
> > What was your reason to migrate from Apache Jenkins to Github Actions ?
>
> I am sure there were more reasons for migrating from Amplap Jenkins
> <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far as
> I
> can remember:
> - To reduce the maintenance cost of machines
> - The Jenkins machines became unstable and slow causing CI jobs to fail or
> be very flaky.
> - Difficulty to manage the installed libraries.
> - Intermittent unknown issues in the machines
>
> Yes, one option might be to consider other options to migrate again.
> However, other projects will very likely suffer the
> same problem. In addition, the migration in a large project is not an
> easy work to do
>
> I would like to know the feasibility of having more resources in GitHub
> Actions, or, for example, having sub-groups where
> each group shares the resources - currently one GitHub organisation shares
> all resources across the projects.
>
>
> On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
>
> >
> >
> > On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com> wrote:
> >
> >> Hi Greg,
> >>
> >> I raised this thread to figure out a way that we can work together to
> >> resolve this issue, gather feedback, and to understand how other
> projects
> >> work around.
> >> Several projects I observed, as far as I can tell, have made enough
> >> efforts
> >> to save the resources in GitHub Actions but still suffer from the lack
> of
> >> resources.
> >>
> >
> > And it will get even worse because:
> > 1) more and more Apache projects migrate from TravisCI to Github Actions
> > (GA)
> > 2) new projects join ASF and many of them already use GA
> >
> >
> > What was your reason to migrate from Apache Jenkins to Github Actions ?
> > If you want dedicated resources then you will need to manage the CI
> > yourself.
> > You could use Apache Jenkins/Buildbot with dedicated agents for your
> > project.
> > Or you could set up your own CI infrastructure with Jenkins, DroneIO,
> > ConcourceCI, ...
> >
> > Yet another option is to move to CircleCI or Cirrus. They are similar to
> > TravisCI / GA and less crowded (for now).
> >
> > Martin
> >
> > I appreciate the resources provided to us but that does not resolve the
> >> issue of the development being slowed down.
> >>
> >>
> >> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
> >>
> >> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
> >> wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> I am an Apache Spark PMC,
> >> >
> >> >
> >> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
> >> stop
> >> > with that terminology. The Foundation has about 200 PMCs, and you are
> a
> >> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
> is
> >> a
> >> > construct of the Foundation.
> >> >
> >> > >...
> >> >
> >> >> I am aware of the limited GitHub Actions resources that are shared
> >> >> across all projects in ASF,
> >> >> and many projects suffer from it. This issue significantly slows down
> >> the
> >> >> development cycle of
> >> >>  other projects, at least Apache Spark.
> >> >>
> >> >
> >> > And the Foundation gets those build minutes for GitHub Actions
> provided
> >> to
> >> > us from GitHub and Microsoft, and we are thankful that they provide
> >> them to
> >> > the Foundation. Maybe it isn't all the build minutes that every group
> >> > wants, but that is what we have. So it is incumbent upon all of us to
> >> > figure out how to build more, with fewer minutes.
> >> >
> >> > Say "thank you" to GitHub, please.
> >> >
> >> > Regards,
> >> > -g
> >> >
> >> >
> >>
> >
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thank you Gavin.

On Fri, 16 Apr 2021, 23:00 Gavin McDonald, <gm...@apache.org> wrote:

> Hi Everyone,
>
> As usual, some great discussion is going on.
> I thought I'd let this thread continue for a while before
> joining in, but rest assured each and every email on this
> list gets read.
>
> I emailed Github a few days ago asking a bunch of questions;
> and hope to get another builds@ meeting with Github as guests pretty soon.
>
> Will let you know how it goes
>
>
>
> On Fri, Apr 16, 2021 at 3:46 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> > Yes, per project budget will be great.
> >
> > Beside cancellation workaround, Apache Spark is trying another workaround
> > for the time being: distributing workflow runs
> > to forked repositories to leverage contributor's GitHub Actions resources
> > instead of consuming all ASF organisation resources.
> >
> > Please see also how I implemented it if anyone is interested in this - the
> > PR description should be self-explanatory:
> > - https://github.com/apache/spark/pull/32092
> > - https://github.com/apache/spark/pull/32193
> >
> > Note that this is still a workaround and it disables GitHub Actions work
> > out of the box.
> >
> >
> > On Fri, 16 Apr 2021, 22:26 Matt Sicker, <bo...@gmail.com> wrote:
> >
> > > I thought one of the key points raised by Jarek before was that even with
> > > infinite compute resources, unoptimized actions will fill up all the
> > > compute (like how widening highways induces higher traffic demand). The
> > > per-PMC compute budgets sounds like a great way to help “shift left” on the
> > > problem to avoid demanding Infra fix hundreds of scripts they know nothing
> > > about. The other optimizations Jarek has demonstrated here like ensuring
> > > old commits get cancelled in favor of new commits, or more sophisticated
> > > things like only running tests that are relevant to the commit, can all go
> > > a long way toward optimizing build compute.
> > >
> > > Now if you’re attempting to run entire integration and end to end test
> > > suites on every commit, I don’t think the earth has enough compute power
> > > for that quite yet!
> > >
> > > On Fri, Apr 16, 2021 at 07:44 Hyukjin Kwon <gu...@gmail.com> wrote:
> > >
> > > > I thought Jarek was pretty clear on that. I meant this:
> > > >
> > > > > So it all has to start with 'per-project' resource limitation and
> > > > > self-budgeting. It would be GREAT if infra could provide self-hosted
> > > > > GitHub Runners SERVICE per project, where project could donate credits
> > > > > or money for their own account, then the projects would have incentive
> > > > > to optimize their own usage. I imagine this would be the best thing
> > > > > since the sliced bread that INFRA could provide to all the projects.
> > > >
> > > > That is, maintaining and providing self-hosted runners for GitHub Actions,
> > > > with resources managed at the project level so that each project can donate
> > > > credits.
> > > >
> > > > In addition, Jarek mentioned that Airflow already has a working version -
> > > > is that correct, Jarek?
> > > >
> > > > If the infra team takes and improves it for other ASF projects, that would
> > > > permanently resolve this issue.
> > > >
> > > > This suggestion looks reasonable and realistic to me.
> > > >
> > > > What do you think about this?
> > > >
> > > >
> > > > On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org> wrote:
> > > >
> > > > > Hi Hyukjin,
> > > > >
> > > > > On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Is this the right place to expect feedback from the infra team or related
> > > > > > people?
> > > > > > It would be great to hear what the infra team thinks about Jarek's
> > > > > > suggestion.
> > > > > >
> > > > >
> > > > > What suggestion exactly do you mean?
> > > > > I've just re-read Jarek's email and I see 3 tasks for the GitHub Actions
> > > > > team, but nothing specific for the Apache Infra team.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 13, 2021 at 11:15 AM Hyukjin Kwon <gu...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >>
> > > > > >> Could we have any update and feedback from the INFRA team about Jarek's
> > > > > >> suggestion please?
> > > > > >>
> > > > > >>> On Fri, Apr 9, 2021 at 7:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > > >>
> > > > > >>>
> > > > > >>>> That's a good idea. We do need to thank Github to give free resources to
> > > > > >>>> ASF projects, but it's better if we can make it a business: we allow
> > > > > >>>> individual projects to sign deals with Github to get dedicated
> > > > > >>>> resources.
> > > > > >>>> It's a bit wasteful to ask every project to set up its own dev ops,
> > > > > >>>> using Github Action is more convenient. Maybe we should raise it to
> > > > > >>>> Github?
> > > > > >>>>
> > > > > >>>
> > > > > >>> I do not think you can get per-project resources in GH - the most you
> > > > > >>> can do are self-hosted runners for your project.
> > > > > >>>
> > > > > >>> (BTW I am not from the INFRA team - just a humble "CI person" of Apache
> > > > > >>> Airflow but very much vested into Github Actions)
> > > > > >>> maybe the infra team can chime in here. We did raise it to GitHub, we
> > > > > >>> even had meeting with them
> > > > > >>> organized by Gavin and several topics were raised that could be
> > > > > >>> eventually addressed by Github:
> > > > > >>>
> > > > > >>> - observability (they could not give us per-project usage dashboard - we
> > > > > >>> built our own imperfect (with API limitations) one by Tobiasz from Airflow)
> > > > > >>> - security (limiting access to only project committers) - this we
> > > > > >>> handled by the Ash's fork of Runner (but it's also imperfect - even today I
> > > > > >>> had to fix a problem where we had list of committers desynchronised between
> > > > > >>> our infra/CI.yml)
> > > > > >>> - manageability (assigning resources per-project) - this works by having
> > > > > >>> self-hosted runners assigned per project (we needed infra JIRA ticket and
> > > > > >>> generation of a bunch of tokens for our runners and our own AWS account
> > > > > >>> with auto-scaling).
> > > > > >>>
> > > > > >>> It would be indeed great if it could be available from GitHub, but so
> > > > > >>> far we do not have any of those.
> > > > > >>> J.
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <
> > gurwls223@gmail.com>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>> > Thanks Martin for your feedback.
> > > > > >>>> >
> > > > > >>>> > > What was your reason to migrate from Apache Jenkins to
> > Github
> > > > > >>>> Actions ?
> > > > > >>>> >
> > > > > >>>> > I am sure there were more reasons for migrating from Amplap
> > > > Jenkins
> > > > > >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions
> > but
> > > > as
> > > > > >>>> far as
> > > > > >>>> > I can remember:
> > > > > >>>> > - To reduce the maintenance cost of machines
> > > > > >>>> > - The Jenkins machines became unstable and slow causing CI
> > jobs
> > > to
> > > > > >>>> fail or
> > > > > >>>> > be very flaky.
> > > > > >>>> > - Difficulty to manage the installed libraries.
> > > > > >>>> > - Intermittent unknown issues in the machines
> > > > > >>>> >
> > > > > >>>> > Yes, one option might be to consider other options to
> migrate
> > > > again.
> > > > > >>>> > However, other projects will very likely suffer the
> > > > > >>>> > same problem. In addition, the migration in a large project
> is
> > > not
> > > > > an
> > > > > >>>> > easy work to do
> > > > > >>>> >
> > > > > >>>> > I would like to know the feasibility of having more
> resources
> > in
> > > > > >>>> GitHub
> > > > > >>>> > Actions, or, for example, having sub-groups where
> > > > > >>>> > each group shares the resources - currently one GitHub
> > > > organisation
> > > > > >>>> shares
> > > > > >>>> > all resources across the projects.
> > > > > >>>> >
> > > > > >>>> >
> > > > > >>>> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mgrigorov@apache.org> wrote:
> > > > > >>>> >
> > > > > >>>> >>
> > > > > >>>> >>
> > > > > >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <
> > > gurwls223@gmail.com
> > > > >
> > > > > >>>> wrote:
> > > > > >>>> >>
> > > > > >>>> >>> Hi Greg,
> > > > > >>>> >>>
> > > > > >>>> >>> I raised this thread to figure out a way that we can work
> > > > together
> > > > > >>>> to
> > > > > >>>> >>> resolve this issue, gather feedback, and to understand how
> > > other
> > > > > >>>> projects
> > > > > >>>> >>> work around.
> > > > > >>>> >>> Several projects I observed, as far as I can tell, have
> made
> > > > > enough
> > > > > >>>> >>> efforts
> > > > > >>>> >>> to save the resources in GitHub Actions but still suffer
> > from
> > > > the
> > > > > >>>> lack of
> > > > > >>>> >>> resources.
> > > > > >>>> >>>
> > > > > >>>> >>
> > > > > >>>> >> And it will get even worse because:
> > > > > >>>> >> 1) more and more Apache projects migrate from TravisCI to
> > > Github
> > > > > >>>> Actions
> > > > > >>>> >> (GA)
> > > > > >>>> >> 2) new projects join ASF and many of them already use GA
> > > > > >>>> >>
> > > > > >>>> >>
> > > > > >>>> >> What was your reason to migrate from Apache Jenkins to
> Github
> > > > > >>>> Actions ?
> > > > > >>>> >> If you want dedicated resources then you will need to
> manage
> > > the
> > > > CI
> > > > > >>>> >> yourself.
> > > > > >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents
> > for
> > > > > your
> > > > > >>>> >> project.
> > > > > >>>> >> Or you could set up your own CI infrastructure with
> Jenkins,
> > > > > DroneIO,
> > > > > >>>> >> ConcourceCI, ...
> > > > > >>>> >>
> > > > > >>>> >> Yet another option is to move to CircleCI or Cirrus. They
> are
> > > > > >>>> similar to
> > > > > >>>> >> TravisCI / GA and less crowded (for now).
> > > > > >>>> >>
> > > > > >>>> >> Martin
> > > > > >>>> >>
> > > > > >>>> >> I appreciate the resources provided to us but that does not
> > > > resolve
> > > > > >>>> the
> > > > > >>>> >>> issue of the development being slowed down.
> > > > > >>>> >>>
> > > > > >>>> >>>
> > > > > >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
> > > > > >>>> >>>
> > > > > >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <
> > > > > gurwls223@gmail.com
> > > > > >>>> >
> > > > > >>>> >>> wrote:
> > > > > >>>> >>> >
> > > > > >>>> >>> >> Hi all,
> > > > > >>>> >>> >>
> > > > > >>>> >>> >> I am an Apache Spark PMC,
> > > > > >>>> >>> >
> > > > > >>>> >>> >
> > > > > >>>> >>> > You are a member of the Apache Spark PMC. You are *not*
> a
> > > PMC.
> > > > > >>>> Please
> > > > > >>>> >>> stop
> > > > > >>>> >>> > with that terminology. The Foundation has about 200
> PMCs,
> > > and
> > > > > you
> > > > > >>>> are a
> > > > > >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a
> > > > person. A
> > > > > >>>> PMC
> > > > > >>>> >>> is a
> > > > > >>>> >>> > construct of the Foundation.
> > > > > >>>> >>> >
> > > > > >>>> >>> > >...
> > > > > >>>> >>> >
> > > > > >>>> >>> >> I am aware of the limited GitHub Actions resources that
> > are
> > > > > >>>> shared
> > > > > >>>> >>> >> across all projects in ASF,
> > > > > >>>> >>> >> and many projects suffer from it. This issue
> > significantly
> > > > > slows
> > > > > >>>> down
> > > > > >>>> >>> the
> > > > > >>>> >>> >> development cycle of
> > > > > >>>> >>> >>  other projects, at least Apache Spark.
> > > > > >>>> >>> >>
> > > > > >>>> >>> >
> > > > > >>>> >>> > And the Foundation gets those build minutes for GitHub
> > > Actions
> > > > > >>>> >>> provided to
> > > > > >>>> >>> > us from GitHub and Microsoft, and we are thankful that
> > they
> > > > > >>>> provide
> > > > > >>>> >>> them to
> > > > > >>>> >>> > the Foundation. Maybe it isn't all the build minutes
> that
> > > > every
> > > > > >>>> group
> > > > > >>>> >>> > wants, but that is what we have. So it is incumbent upon
> > all
> > > > of
> > > > > >>>> us to
> > > > > >>>> >>> > figure out how to build more, with fewer minutes.
> > > > > >>>> >>> >
> > > > > >>>> >>> > Say "thank you" to GitHub, please.
> > > > > >>>> >>> >
> > > > > >>>> >>> > Regards,
> > > > > >>>> >>> > -g
> > > > > >>>> >>> >
> > > > > >>>> >>> >
> > > > > >>>> >>>
> > > > > >>>> >>
> > > > > >>>>
> > > > > >>>
> > > > > >>>
> > > > > >>> --
> > > > > >>> +48 660 796 129
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
>
>
> --
>
> *Gavin McDonald*
> Systems Administrator
> ASF Infrastructure Team
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Gavin McDonald <gm...@apache.org>.
Hi Everyone,

As usual, some great discussion is going on.
I thought I'd let this thread continue for a while before
joining in, but rest assured each and every email on this
list gets read.

I emailed GitHub a few days ago asking a bunch of questions,
and hope to get another builds@ meeting with GitHub as guests pretty soon.

Will let you know how it goes.





-- 

*Gavin McDonald*
Systems Administrator
ASF Infrastructure Team

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Yes, a per-project budget would be great.

Besides the cancellation workaround, Apache Spark is trying another workaround
for the time being: distributing workflow runs
to forked repositories to leverage contributors' GitHub Actions resources
instead of consuming the shared ASF organisation resources.

Please also see how I implemented it if anyone is interested in this - the
PR descriptions should be self-explanatory:
- https://github.com/apache/spark/pull/32092
- https://github.com/apache/spark/pull/32193

Note that this is still a workaround, and with it GitHub Actions no longer
works out of the box.


On Fri, 16 Apr 2021, 22:26 Matt Sicker, <bo...@gmail.com> wrote:

> I thought one of the key points raised by Jarek before was that even with
> infinite compute resources, unoptimized actions will fill up all the
> compute (like how widening highways induces higher traffic demand). The
> per-PMC compute budgets sound like a great way to help “shift left” on the
> problem, to avoid demanding that Infra fix hundreds of scripts they know
> nothing about. The other optimizations Jarek has demonstrated here, like
> ensuring old commits get cancelled in favor of new commits, or more
> sophisticated things like only running tests that are relevant to the
> commit, can all go a long way toward optimizing build compute.
>
> Now if you’re attempting to run entire integration and end-to-end test
> suites on every commit, I don’t think the earth has enough compute power
> for that quite yet!
>
> On Fri, Apr 16, 2021 at 07:44 Hyukjin Kwon <gu...@gmail.com> wrote:
>
> > I thought Jarek was pretty clear on that. I meant this:
> >
> > > So it all has to start with 'per-project' resource limitation and
> > > self-budgeting. It would be GREAT if infra could provide self-hosted
> > > GitHub Runners SERVICE per project, where project could donate credits
> > > or money for their own account, then the projects would have incentive
> > > to optimize their own usage. I imagine this would be the best thing
> > > since the sliced bread that INFRA could provide to all the projects.
> >
> > That is, maintaining and providing self-hosted runners in GitHub Actions,
> > where the resources are managed at the project level and each project can
> > donate credits.
> >
> > In addition, Jarek mentioned that Airflow already has a working version.
> > Is that correct, Jarek?
> >
> > If the infra team takes it and improves it for other ASF projects, that
> > would permanently resolve this issue.
> >
> > This suggestion looks reasonable and realistic to me.
> >
> > What do you think about this?
> >
> >
> > On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org> wrote:
> >
> > > Hi Hyukjin,
> > >
> > > On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Is this the right place to expect feedback from the infra team or
> > > > related people?
> > > > It would be great to hear what the infra team thinks about Jarek's
> > > > suggestion.
> > > >
> > >
> > > Which suggestion exactly do you mean?
> > > I've just re-read Jarek's email and I see 3 tasks for the GitHub Actions
> > > team, but nothing specific for the Apache Infra team.
> > >
> > >
> > > >
> > > >
> > > > On Tue, Apr 13, 2021 at 11:15 AM Hyukjin Kwon <gu...@gmail.com> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> Could we have any update and feedback from the INFRA team about
> > > >> Jarek's suggestion please?
> > > >>
> > > >> On Fri, Apr 9, 2021 at 7:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >>
> > > >>>
> > > >>>> That's a good idea. We do need to thank GitHub for giving free
> > > >>>> resources to ASF projects, but it's better if we can make it a
> > > >>>> business: we allow individual projects to sign deals with GitHub
> > > >>>> to get dedicated resources.
> > > >>>> It's a bit wasteful to ask every project to set up its own dev ops;
> > > >>>> using GitHub Actions is more convenient. Maybe we should raise it
> > > >>>> to GitHub?
> > > >>>>
> > > >>>
> > > >>> I do not think you can get per-project resources in GH - the most you
> > > >>> can do is self-hosted runners for your project.
> > > >>>
> > > >>> (BTW I am not from the INFRA team - just a humble "CI person" of
> > > >>> Apache Airflow, but very much invested in GitHub Actions.)
> > > >>> Maybe the infra team can chime in here. We did raise it to GitHub; we
> > > >>> even had a meeting with them, organized by Gavin, and several topics
> > > >>> were raised that could eventually be addressed by GitHub:
> > > >>>
> > > >>> - observability (they could not give us a per-project usage
> > > >>> dashboard - Tobiasz from Airflow built our own, imperfect one, with
> > > >>> API limitations)
> > > >>> - security (limiting access to only project committers) - this we
> > > >>> handled with Ash's fork of the Runner (but it's also imperfect - even
> > > >>> today I had to fix a problem where we had the list of committers
> > > >>> desynchronised between our infra/CI.yml)
> > > >>> - manageability (assigning resources per-project) - this works by
> > > >>> having self-hosted runners assigned per project (we needed an infra
> > > >>> JIRA ticket, generation of a bunch of tokens for our runners, and our
> > > >>> own AWS account with auto-scaling).
> > > >>>
> > > >>> It would indeed be great if it could be available from GitHub, but so
> > > >>> far we do not have any of those.
> > > >>>
> > > >>> J.
> > > >>>
> > > >>>
> > > >>>
> > > >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>> > Thanks Martin for your feedback.
> > > >>>> >
> > > >>>> > > What was your reason to migrate from Apache Jenkins to GitHub
> > > >>>> > > Actions?
> > > >>>> >
> > > >>>> > I am sure there were more reasons for migrating from AMPLab Jenkins
> > > >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions, but as
> > > >>>> > far as I can remember:
> > > >>>> > - To reduce the maintenance cost of machines
> > > >>>> > - The Jenkins machines became unstable and slow, causing CI jobs to
> > > >>>> > fail or be very flaky.
> > > >>>> > - Difficulty managing the installed libraries.
> > > >>>> > - Intermittent unknown issues in the machines
> > > >>>> >
> > > >>>> > Yes, one option might be to consider migrating again.
> > > >>>> > However, other projects will very likely suffer the
> > > >>>> > same problem. In addition, migration in a large project is not
> > > >>>> > easy work.
> > > >>>> >
> > > >>>> > I would like to know the feasibility of having more resources in
> > > >>>> > GitHub Actions, or, for example, having sub-groups where
> > > >>>> > each group shares the resources - currently one GitHub organisation
> > > >>>> > shares all resources across the projects.
> > > >>>> >
> > > >>>> >
> > > >>>> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mgrigorov@apache.org>
> > > >>>> > wrote:
> > > >>>> >
> > > >>>> >>
> > > >>>> >>
> > > >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gurwls223@gmail.com>
> > > >>>> >> wrote:
> > > >>>> >>
> > > >>>> >>> Hi Greg,
> > > >>>> >>>
> > > >>>> >>> I raised this thread to figure out a way that we can work
> > > >>>> >>> together to resolve this issue, gather feedback, and to
> > > >>>> >>> understand how other projects work around it.
> > > >>>> >>> Several projects I observed, as far as I can tell, have made
> > > >>>> >>> enough efforts to save the resources in GitHub Actions but
> > > >>>> >>> still suffer from the lack of resources.
> > > >>>> >>>
> > > >>>> >>
> > > >>>> >> And it will get even worse because:
> > > >>>> >> 1) more and more Apache projects migrate from TravisCI to GitHub
> > > >>>> >> Actions (GA)
> > > >>>> >> 2) new projects join ASF and many of them already use GA
> > > >>>> >>
> > > >>>> >>
> > > >>>> >> What was your reason to migrate from Apache Jenkins to GitHub
> > > >>>> >> Actions?
> > > >>>> >> If you want dedicated resources then you will need to manage the
> > > >>>> >> CI yourself.
> > > >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for
> > > >>>> >> your project.
> > > >>>> >> Or you could set up your own CI infrastructure with Jenkins,
> > > >>>> >> DroneIO, Concourse CI, ...
> > > >>>> >>
> > > >>>> >> Yet another option is to move to CircleCI or Cirrus. They are
> > > >>>> >> similar to TravisCI / GA and less crowded (for now).
> > > >>>> >>
> > > >>>> >> Martin
> > > >>>> >>
> > > >>>> >>> I appreciate the resources provided to us but that does not
> > > >>>> >>> resolve the issue of the development being slowed down.
> > > >>>> >>>
> > > >>>> >>>
> > > >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
> > > >>>> >>>
> > > >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon
> > > >>>> >>> > <gurwls223@gmail.com> wrote:
> > > >>>> >>> >
> > > >>>> >>> >> Hi all,
> > > >>>> >>> >>
> > > >>>> >>> >> I am an Apache Spark PMC,
> > > >>>> >>> >
> > > >>>> >>> >
> > > >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
> > > >>>> >>> > Please stop with that terminology. The Foundation has about
> > > >>>> >>> > 200 PMCs, and you are a member of one of them. You are NOT a
> > > >>>> >>> > "PMC" .. you're a person. A PMC is a construct of the
> > > >>>> >>> > Foundation.
> > > >>>> >>> >
> > > >>>> >>> > >...
> > > >>>> >>> >
> > > >>>> >>> >> I am aware of the limited GitHub Actions resources that are
> > > >>>> >>> >> shared across all projects in ASF,
> > > >>>> >>> >> and many projects suffer from it. This issue significantly
> > > >>>> >>> >> slows down the development cycle of
> > > >>>> >>> >>  other projects, at least Apache Spark.
> > > >>>> >>> >>
> > > >>>> >>> >
> > > >>>> >>> > And the Foundation gets those build minutes for GitHub Actions
> > > >>>> >>> > provided to us from GitHub and Microsoft, and we are thankful
> > > >>>> >>> > that they provide them to the Foundation. Maybe it isn't all
> > > >>>> >>> > the build minutes that every group wants, but that is what we
> > > >>>> >>> > have. So it is incumbent upon all of us to figure out how to
> > > >>>> >>> > build more, with fewer minutes.
> > > >>>> >>> >
> > > >>>> >>> > Say "thank you" to GitHub, please.
> > > >>>> >>> >
> > > >>>> >>> > Regards,
> > > >>>> >>> > -g
> > > >>>> >>> >
> > > >>>> >>> >
> > > >>>> >>>
> > > >>>> >>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> +48 660 796 129
> > > >>>
> > > >>
> > >
> >
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
I really love what Hyukjin has done. I did not have the capacity to
participate in this actively, but this is exactly the way to go, I think
(with a caveat). Following the motorway metaphor - everyone (every
contributor/committer, not every project) has their own lane and they do
not interfere with each other.

One observation we had from implementing self-hosted runners in Airflow is
that the faster the builds were, the more they were used as well. They just
became too easy to use. Also we've hit a different problem: we started to
pay for every build, and cost became proportional to the number of
committers/PRs. So while we could optimize the builds (and we have a big
incentive to do so), we've hit a floor: we cannot go lower than x
USD/build. And the more we grow as a project, the bigger the cost will
be. With the proposal from Hyukjin/Apache Spark, the cost is distributed,
and the only common part to pay for is the "merge builds" (but those can run
on free infrastructure, as they can usually wait).

However, as of now, this is a big hack: it is rather complex to
implement and understand, and it has some "brittle" parts - for example, the
workflow must not be disabled by the contributor.

But I believe that, long term, it could likely be implemented by GitHub, and
it would solve all the problems of ASF.

Gavin,

Maybe we should raise this with the GitHub team, and maybe that is something
they could indeed think about implementing? I think it is a great
OSS-friendly feature they could implement.

What it could look like from the GitHub side: the workflow could have a
"run-in-fork" flag or similar. Setting this flag would cause any PR coming
from a public fork to run in that fork's space (the source repo) rather than
in the target repo.
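
To make the idea a bit more concrete, here is a minimal Python sketch of the
decision such a flag would imply. None of this exists in GitHub Actions
today: `run_in_fork`, `PullRequest` and `runner_account` are made-up names
used purely for illustration, and the repository names are examples.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    source_repo: str             # e.g. "contributor/spark"
    target_repo: str             # e.g. "apache/spark"
    source_is_public_fork: bool


def runner_account(pr: PullRequest, run_in_fork: bool) -> str:
    """Return the repository whose Actions quota would pay for the run.

    `run_in_fork` stands for the hypothetical workflow-level flag.
    """
    if run_in_fork and pr.source_is_public_fork:
        # The build runs in the contributor's own "lane".
        return pr.source_repo
    # Everything else (for example merge builds) stays in the target repo.
    return pr.target_repo


pr = PullRequest("contributor/spark", "apache/spark", source_is_public_fork=True)
print(runner_account(pr, run_in_fork=True))   # contributor/spark
print(runner_account(pr, run_in_fork=False))  # apache/spark
```

The point of the sketch is only that the choice would be made once, by the
platform, instead of being emulated inside each project's workflow files.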

Hyukjin had to implement a number of workarounds to make it work:

a) specific if clauses in the workflow
b) specifying branches to run in the fork
c) finding the PR for each build and labeling it appropriately
d) adding status check manually in the PR
e) scheduled scanning of PRs and updating status checks for those

This all could be implemented in a much more elegant way by GitHub in the
"underlying GA fabric" - then none of the workarounds above would be
needed. They are mostly needed because of the permission model implemented
in GA.

J.




-- 
+48 660 796 129

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks all.

Just to add a note,

>  * Create a wiki page collecting all the practices to reduce the hours
> (using the pr cancel workflow discussed earlier + timeouts + ...?)

We should probably also mention that Apache Spark managed to distribute the
workflow runs to forked repositories in pull requests, see the PRs:
- https://github.com/apache/spark/pull/32092
- https://github.com/apache/spark/pull/32193
and umbrella JIRA: https://issues.apache.org/jira/browse/SPARK-35119

This is still a workaround but it managed to reduce the overhead
significantly by leveraging the resources from forked repositories.


On Mon, Apr 19, 2021 at 12:41 AM Antoine Pitrou <an...@python.org> wrote:

>
> Hi Marton,
>
> Thanks a lot for the information you have collected and presented.  This
> is very insightful!
>
> On 18/04/2021 at 11:06, Elek, Marton wrote:
> >
> > There are signs of misconfiguration of some jobs. For example, in some
> > projects I found many failed jobs with >15 hours of execution even though
> > the slowest successful (!) execution took only a few hours. It clearly
> > shows that a job-level timeout is not yet configured.
>
> Ok, I'm curious: according to the GHA docs, the default job
> timeout is 6 hours (360 minutes):
>
> https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idtimeout-minutes
>
> In Arrow, we didn't change this setting... how come your stats show
> jobs taking up to 24 hours?
>
> Apparently, what's named "jobhours" in your statistics is actually the
> runtime for an entire workflow (the sum of all job runtimes for that
> workflow).  That's at least what I conclude if I look at this workflow,
> which your table lists as the longest Arrow "job" with 24 hours of
> runtime: https://github.com/apache/arrow/actions/runs/699123317
> None of the jobs in that workflow took more than 6 hours, but together
> they indeed add up to around 24 hours... (because 4 jobs timed out at 6 hours)
>
> > Also the 46 or 36 hours of max job execution time sounds very
> > unrealistic (it's a job, not the full workflow).
>
> Well, according to the above it's the full workflow.  It's still
> unexpected as far as Arrow is concerned, though, and we should implement
> per-job timeouts reflecting our expectations.
>
> > My suggestion:
> >
> >    * Publish GitHub Actions usage in a central place which is clearly
> > visible for all Apache projects (I would be happy to volunteer here)
> >
> >    * Identify an official suggestion of fair usage (monthly hours) per
> > project (easiest way: available hours / projects using GitHub Actions)
> >
> >    * Create a wiki page collecting all the practices to reduce the hours
> > (using the PR cancel workflow discussed earlier + timeouts + ...?)
> >
> >    * After every month, send a very polite reminder to the projects that
> > overuse GitHub Actions (using dev lists), including detailed statistics
> > and the wiki link to help them improve/reduce their usage.
>
> As a member of the Arrow PMC, I say +1 to all of this.
>
> Best regards
>
> Antoine.
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
>
> Is it possible to share the raw data in some form? If you can publish
> data in any form (csv? sqlite?) I can generate static html files with
> python notebooks which can be shared with everybody...
>
>
I am adding Tobiasz who can share it :).


> (BTW, how do you get the data? Do you poll somehow the actual runs or
> collect data from workflow runs / jobs api (this is what I do)?)
>
>
The data is fetched by...... a scheduled Github Action :).

https://github.com/TobKed/fetch-apache-ga-stats

J.

-- 
+48 660 796 129

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by "Elek, Marton" <el...@apache.org>.

On 4/19/21 1:07 PM, Jarek Potiuk wrote:
> Also some comments for the stats. This is good stuff Marton.
> 
> 
>> Apparently, what's named "jobhours" in your statistics is actually the
>> runtime for an entire workflow (the sum of all job runtimes for that
>> workflow).  That's at least what I conclude if I look at this workflow,
>> which your table lists as the longest Arrow "job" with 24 hours of
>> runtime: https://github.com/apache/arrow/actions/runs/699123317
>> None of the jobs in that workflow took more than 6 hours, but cumulated
>> they indeed end up around 24 hours... (because 4 jobs timed out at 6 hours)
>>

That's correct. This is the sum of the time between started_at and 
completed_at for each job in a workflow run. (Using the jobs API like this: 
https://api.github.com/repos/elek/flekszible/actions/runs/621666286/jobs).
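To make the "jobhours" definition concrete, here is a minimal sketch of the per-run calculation described above. It assumes the response of the jobs endpoint has already been fetched and decoded into a dict with a "jobs" list, and that timestamps use the "YYYY-MM-DDTHH:MM:SSZ" form; the helper names (parse_ts, run_job_hours) and the sample data are made up for illustration and are not taken from the actual statistics scripts.

```python
from datetime import datetime


def parse_ts(value):
    # Assumed timestamp format, e.g. "2021-03-01T10:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def run_job_hours(jobs_payload):
    """Sum (completed_at - started_at) over all jobs of one workflow run."""
    total_seconds = 0.0
    for job in jobs_payload.get("jobs", []):
        started = job.get("started_at")
        completed = job.get("completed_at")
        if not started or not completed:
            # Jobs that never started or have not finished are skipped.
            continue
        total_seconds += (parse_ts(completed) - parse_ts(started)).total_seconds()
    return total_seconds / 3600.0


# Hypothetical payload: one 6-hour job, one 1.5-hour job, one unfinished job.
sample = {
    "jobs": [
        {"name": "build", "started_at": "2021-03-01T10:00:00Z",
         "completed_at": "2021-03-01T16:00:00Z"},
        {"name": "test", "started_at": "2021-03-01T10:00:00Z",
         "completed_at": "2021-03-01T11:30:00Z"},
        {"name": "pending", "started_at": "2021-03-01T10:00:00Z",
         "completed_at": None},
    ]
}

print(run_job_hours(sample))  # 7.5
```

Because the result is a sum over jobs, a run with four jobs that each hit the 6-hour timeout reports about 24 hours even though no single job ran longer than 6.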

> It does look like you have workflows rather than jobs - we had very similar
> problems
> when we (Tobiasz - one of the Airflow contributors) tried to get the stats.
> The REST API limitations are super-painful, there is no way to dig down
> to the job level (there is no GraphQL version to do it efficiently
> unfortunately).
> We found that rather than looking at jobhours, it's much better to look at
> "in-progress"
> and "queued" workflow from each project. It gives a much better
> overview of what's going on.

100% agree. And I think the "jobhours" include the queue time as well 
(and rerun overwrites all the data).

(BTW, I agree with all the other points, too ;-) )

 > we regularly run and store in Google Bigquery and simple DataStudio report
 > showing it (unfortunately we cannot share it with everyone as it will
 > incur some costs if it is publicly used).

Is it possible to share the raw data in some form? If you can publish 
data in any form (csv? sqlite?) I can generate static html files with 
python notebooks which can be shared with everybody...

(BTW, how do you get the data? Do you poll somehow the actual runs or 
collect data from workflow runs / jobs api (this is what I do)?)

Marton

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
Also some comments for the stats. This is good stuff Marton.


> Apparently, what's named "jobhours" in your statistics is actually the
> runtime for an entire workflow (the sum of all job runtimes for that
> workflow).  That's at least what I conclude if I look at this workflow,
> which your table lists as the longest Arrow "job" with 24 hours of
> runtime: https://github.com/apache/arrow/actions/runs/699123317
> None of the jobs in that workflow took more than 6 hours, but cumulated
> they indeed end up around 24 hours... (because 4 jobs timed out at 6 hours)
>

It does look like you have workflows rather than jobs - we had very similar
problems when we (Tobiasz - one of the Airflow contributors) tried to get the
stats. The REST API limitations are super-painful, there is no way to dig down
to the job level (there is no GraphQL version to do it efficiently,
unfortunately). We found that rather than looking at jobhours, it's much better
to look at "in-progress" and "queued" workflows for each project. It gives a
much better overview of what's going on.
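As a rough illustration of the "queued" / "in-progress" view, the sketch below counts workflow runs by status per project. It assumes the runs have already been fetched and that each run is a dict with a "status" field holding values such as "queued", "in_progress" or "completed"; the function name backlog_by_project and the sample snapshot are hypothetical.

```python
from collections import Counter


def backlog_by_project(runs_by_project):
    """Return, per project, how many runs are queued and how many are in progress."""
    summary = {}
    for project, runs in runs_by_project.items():
        counts = Counter(run.get("status") for run in runs)
        summary[project] = {
            "queued": counts["queued"],
            "in_progress": counts["in_progress"],
        }
    return summary


# Hypothetical snapshot taken at one point in time.
snapshot = {
    "project-a": [{"status": "queued"}, {"status": "queued"}, {"status": "in_progress"}],
    "project-b": [{"status": "completed"}, {"status": "in_progress"}],
}

for project, counts in backlog_by_project(snapshot).items():
    print(project, counts)
```

Taking such a snapshot on a schedule shows when the shared queue backs up, which total build hours alone do not reveal.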

Together with Gavin and the infra team we passed a request to GitHub to
maybe get some extracts of the stats, but until we have them we rely on
"poor-man's" extracts that we regularly run and store in Google BigQuery, with
a simple DataStudio report showing them (unfortunately we cannot share it with
everyone, as it would incur some costs if it were publicly used). We do try to
keep screenshots updated in this doc, where I keep the status of the current
GA integration with ASF infra:

https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status

Here are some of the latest screenshots:

April stats: https://ibb.co/mCL6kZh
March and April stats: https://ibb.co/r2zjNsV

Those two above will show you the variability.

Just a summary for those who do not like to look at graphs: it seems that
pulsar went down quite a bit in March/April, while Arrow became the one that
uses the most jobs, with Spark in second place (but with the changes from
Hyukjin it will go down soon, I believe). In the meantime apisix-dashboard
seems to be on the rise and pulsar is coming back.

Here you can see the peaks in a number of workflows:

https://ibb.co/QCJdLGD

But this one is the most important:  the number of ASF projects using
GA since November: https://ibb.co/RpFyQQy

The last one is the most interesting because, as I see it, none of the
proposals below will work - they might help temporarily if some projects
optimize their usage, but there will be new ones coming. It seems that since
November we have been continuously fighting for jobs at peak times, with
various projects getting fed up with it and finding workarounds or moving
elsewhere. And it will continue.


> >    * Publish Github action usage in a central place which is clearly
> > visible for all Apache projects (I would be happy to volunteer here)
>

Oh yeah. If only we could get good stats, that would be great, but with
the current API limitations that seems very difficult. If you could do that
it would be great - however, to be precise, we need peak-hour stats and
peak-hour limits.


> >    * Identify official suggestion of fair-usage (monthly hours) per
> > project (easiest way: available hours / projects using github actions)
>

The problem is that with the fixed number of jobs we have, more projects
coming AND the fact that our problems are at the peaks, this stat a) is wrong
(the build hours do not matter too much - the peak hours do) and b) will
continue to trend downwards as more projects come. And it's the peak hours we
need to limit, not the overall hours.

And the problem is that peak-hour usage is outside the control of the projects
themselves. Those peak hours mostly come from contributors opening new PRs,
and there is not much each project can do to reduce those. It's not only about
best practices, cancelling etc.: the main factor is how many PRs are raised
within a time window. And there isn't much we can do other than give everyone
their own lane (and I really mean every contributor - this is what Hyukjin
did). No matter how hard the projects try, this can't really be controlled
otherwise.


> >
> >    * Create a wiki page collecting all the practices to reduce the hours
> > (using the pr cancel workflow discussed earlier + timeouts + ...?)


It's there:
https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status
Good start. We can continue improving it.


> >
> > * After every month send a very polite reminder to the projects who
> > overuses github actions (using dev lists) including detailed statistics
> > and the wiki link to help them to improve/reduce the usage.


Having good stats is a good starting point for that. But there is only so
much we can do, and with the current growth of usage this is mostly about
deferring the inevitable by a couple of weeks/months, even if everyone
implements all optimisations.

I think distribution of "build-hours" per-committer is really the only
sustainable long-term way.

-- 
+48 660 796 129

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Antoine Pitrou <an...@python.org>.
Hi Marton,

Thanks a lot for the information you have collected and presented.  This 
is very insightful!

On 18/04/2021 at 11:06, Elek, Marton wrote:
> 
> There are signs of misconfiguration of some jobs. For example in some
> projects I found many failure jobs with >15 hours executions even if the
> slowest successful (!) execution took only a few hours. It clearly shows
> that job level timeout is not yet configured.

Ok, I'm curious: according to the GHA docs, the default job
timeout is 6 hours (360 minutes):
https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idtimeout-minutes

In Arrow, we didn't change this setting... how come your stats show
jobs taking up to 24 hours?

Apparently, what's named "jobhours" in your statistics is actually the 
runtime for an entire workflow (the sum of all job runtimes for that 
workflow).  That's at least what I conclude if I look at this workflow, 
which your table lists as the longest Arrow "job" with 24 hours of 
runtime: https://github.com/apache/arrow/actions/runs/699123317
None of the jobs in that workflow took more than 6 hours, but cumulated 
they indeed end up around 24 hours... (because 4 jobs timed out at 6 hours)

> Also the 46 or 36 hours of max job execution time sounds very
> un-realistic (it's a job, not the full workflow).

Well, according to the above it's the full workflow.  It's still 
unexpected as far as Arrow is concerned, though, and we should implement 
per-job timeouts reflecting our expectations.
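One way a project could audit this is sketched below: list the jobs in a workflow that do not set `timeout-minutes` and therefore fall back to the 360-minute default mentioned above. The sketch assumes the workflow file has already been parsed into a plain dict (the YAML parsing step is not shown); the function name jobs_without_timeout and the sample workflow are hypothetical.

```python
DEFAULT_TIMEOUT_MINUTES = 360  # default job timeout per the GHA docs cited above


def jobs_without_timeout(workflow):
    """Return the names of jobs that do not declare an explicit timeout-minutes."""
    jobs = workflow.get("jobs", {})
    return sorted(name for name, job in jobs.items() if "timeout-minutes" not in job)


# Hypothetical parsed workflow: only "lint" sets an explicit timeout.
workflow = {
    "name": "CI",
    "jobs": {
        "lint": {"runs-on": "ubuntu-latest", "timeout-minutes": 15},
        "unit-tests": {"runs-on": "ubuntu-latest"},
        "integration": {"runs-on": "ubuntu-latest"},
    },
}

for name in jobs_without_timeout(workflow):
    print(f"{name}: no timeout-minutes, falls back to {DEFAULT_TIMEOUT_MINUTES} minutes")
```

A job that normally finishes in under an hour but hangs would then be stopped at its own limit instead of holding a runner for the full 6 hours.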

> My suggestion:
> 
>    * Publish Github action usage in a central place which is clearly
> visible for all Apache projects (I would be happy to volunteer here)
> 
>    * Identify official suggestion of fair-usage (monthly hours) per
> project (easiest way: available hours / projects using github actions)
> 
>    * Create a wiki page collecting all the practices to reduce the hours
> (using the pr cancel workflow discussed earlier + timeouts + ...?)
> 
> * After every month send a very polite reminder to the projects who
> overuses github actions (using dev lists) including detailed statistics
> and the wiki link to help them to improve/reduce the usage.

As a member of the Arrow PMC, I say +1 to all of this.

Best regards

Antoine.

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Brennan Ashton <ba...@brennanashton.com>.
Marton,
This is super helpful, and thanks for linking the data.

On Sun, Apr 18, 2021, 2:06 AM Elek, Marton <el...@apache.org> wrote:

>
> There are 13 project which uses Github Actions above this possible limit:
>
>
> project build hours     average hours per job   max hours per job
> nuttx   14138.818889    3.441777        17.845833
> pulsar  10785.601944    0.478743        2.011667
> airflow 8305.211111     1.247216        20.768056
> skywalking      6852.736667     0.959095        7.520278
> arrow   6290.633889     0.503573        24.359444
> ozone   5484.444722     4.440846        17.473333
> camel   4241.184722     0.754525        18.681389
> iotdb   4007.576667     1.224061        36.676944
> shardingsphere  2858.329444     0.503937        21.633056
> beam    2782.366111     0.688364        46.451667
> nifi    2380.907500     4.641145        11.313056
> apisix  2342.009722     0.209183        24.139167
> dubbo   1815.491389     1.537249        11.023889
>


The Apache NuttX project has implemented some aggressive cancellation of
workflows in the last few weeks, so I would expect our numbers to go way
down. We are also looking at a larger overhaul of our build system, but
that is a further-out effort.

The run times still seem a little high compared to what I usually see, so I
will look at what is causing that. It's possible something is hanging
occasionally. This is where that repo is super nice!

We have some bigger plans in the works including adding physical hardware
racks to our testing (this is an embedded RTOS project) which will require
us to run our own runners. As part of that we have also been considering
adding our own larger runners to the mix that we would pay for, but how is
still a very open question with the security issues that have been
mentioned before.


Thanks,
Brennan

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by "Elek, Marton" <el...@apache.org>.

Hi,


Based on my understanding we have the standard GitHub Enterprise limit,
which is 180 parallel jobs at any given time [1].

Running 180 parallel jobs full-time would give us 180 * 24 * 30 = 129600
build hours in a 30-day month (180 * 24 * 31 = 133920 in a 31-day month).

I did some quick research and found that in 2021-02 Apache used about 90k
hours and in 2021-03 about 103k hours.

These were used by 84 Apache projects in 2021-02, which increased to 91 in
2021-03.

The average number of parallel jobs running at a given time is around
120-130, which is under the limit. The problem is that we have spikes. The
highest one I found was 200 jobs scheduled (meaning 20 were queued until some
running jobs finished).
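For anyone who wants to reproduce the spike numbers, a minimal sketch of how peak demand can be derived from job intervals is shown below. It is a generic sweep over start/end events, not the code used for these statistics; the function names (peak_concurrency, queued_at_peak) and the sample intervals are invented for illustration, and the limit of 180 is the figure quoted above.

```python
def peak_concurrency(intervals):
    """Return the maximum number of intervals that overlap at any moment.

    Each interval is a (start, end) pair of comparable values, e.g. epoch seconds.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a job begins (or is requested)
        events.append((end, -1))    # a job ends
    # At equal timestamps, process ends (-1) before starts (+1).
    events.sort(key=lambda event: (event[0], event[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def queued_at_peak(peak_demand, limit=180):
    """Jobs that have to wait when demand exceeds the parallel-job limit."""
    return max(0, peak_demand - limit)


# Hypothetical intervals in minutes: at most three jobs overlap (t = 12..14).
jobs = [(0, 10), (5, 15), (10, 20), (12, 14)]
print(peak_concurrency(jobs))  # 3
print(queued_at_peak(200))     # 20
```

The same sweep run over requested-at and completed-at times of all jobs across the organization gives the demand curve, and everything above the limit is queue.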






The problem is that not all the projects use the available capacity at 
the same level:

During 31 days in March and 91 Apache projects a fair usage would be 
~1470 build-hours per month.
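The arithmetic behind these figures can be checked with a few lines. This is only a sketch using the numbers quoted in this mail (180 parallel jobs, 91 projects); the function names are made up for illustration.

```python
PARALLEL_JOBS = 180  # parallel-job limit quoted above


def monthly_capacity_hours(days, parallel_jobs=PARALLEL_JOBS):
    """Build hours available if every parallel slot ran around the clock."""
    return parallel_jobs * 24 * days


def fair_share_hours(days, projects, parallel_jobs=PARALLEL_JOBS):
    """Equal split of the monthly capacity across all projects."""
    return monthly_capacity_hours(days, parallel_jobs) / projects


print(monthly_capacity_hours(30))             # 129600
print(monthly_capacity_hours(31))             # 133920
print(round(fair_share_hours(31, 91), 1))     # 1471.6
```

Note that this is an upper bound: it assumes all 180 slots are busy 24 hours a day, which is not how PR-driven load behaves.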

There are 13 projects which use GitHub Actions above this possible limit:


project         build hours   average hours per job  max hours per job
nuttx           14138.818889  3.441777               17.845833
pulsar          10785.601944  0.478743               2.011667
airflow         8305.211111   1.247216               20.768056
skywalking      6852.736667   0.959095               7.520278
arrow           6290.633889   0.503573               24.359444
ozone           5484.444722   4.440846               17.473333
camel           4241.184722   0.754525               18.681389
iotdb           4007.576667   1.224061               36.676944
shardingsphere  2858.329444   0.503937               21.633056
beam            2782.366111   0.688364               46.451667
nifi            2380.907500   4.641145               11.313056
apisix          2342.009722   0.209183               24.139167
dubbo           1815.491389   1.537249               11.023889


These 13 projects use 94% of all the build time.

(Note: there are very painful limitations in the GitHub API: I couldn't 
identify external runners, and some data can be missing in the case of re-runs.)





There are signs of misconfiguration in some jobs. For example, in some 
projects I found many failed jobs with >15 hours of execution even though the 
slowest successful (!) execution took only a few hours. It clearly shows 
that a job-level timeout is not yet configured.

Also, the 46 or 36 hours of max job execution time sounds very 
unrealistic (it's a job, not the full workflow).



My suggestion:

  * Publish GitHub Actions usage in a central place which is clearly 
visible to all Apache projects (I would be happy to volunteer here)

  * Identify an official suggestion of fair usage (monthly hours) per 
project (easiest way: available hours / projects using GitHub Actions)

  * Create a wiki page collecting all the practices to reduce the hours 
(using the PR cancel workflow discussed earlier + timeouts + ...?)

  * After every month, send a very polite reminder to the projects that 
overuse GitHub Actions (using dev lists), including detailed statistics 
and the wiki link to help them improve/reduce their usage.



What do you think about these ideas?

Thanks,
Marton

ps: my data is here: https://github.com/elek/asf-github-actions-stat
There could be bugs, feel free to ping me if you see any problems.


[1] 
https://docs.github.com/en/actions/reference/usage-limits-billing-and-administration

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Matt Sicker <bo...@gmail.com>.
I thought one of the key points raised by Jarek before was that even with
infinite compute resources, unoptimized actions will fill up all the
compute (like how widening highways induces higher traffic demand). The
per-PMC compute budgets sound like a great way to help “shift left” on the
problem to avoid demanding Infra fix hundreds of scripts they know nothing
about. The other optimizations Jarek has demonstrated here like ensuring
old commits get cancelled in favor of new commits, or more sophisticated
things like only running tests that are relevant to the commit, can all go
a long way toward optimizing build compute.

Now if you’re attempting to run entire integration and end to end test
suites on every commit, I don’t think the earth has enough compute power
for that quite yet!

On Fri, Apr 16, 2021 at 07:44 Hyukjin Kwon <gu...@gmail.com> wrote:

> I thought Jarek was pretty clear on that. I meant this:
>
> > So it all has to start with 'per-project' resource limitation and self-
> > budgeting. It would be GREAT if infra.could provide self-hosted GitHub
> > Runners SERVICE per project, where project could donate credits or money
> > for their own account, then the projects would have incentive to optimize
> > their own usage. I imagine this would be the best thing since the sliced
> > bread that INFRA could provide to all the projects.
>
> Maintaining and providing a self-hosted runners in GitHub Actions where the
> resources are managed in project level where each project can donate
> credits.
>
> In addition, Jarek mentioned that Airflow already has a working version -
> is it correct Jarek?
>
> If the infra team takes and improves it for other ASF projects, that would
> permanently resolve this issue.
>
> This suggestion looks reasonable and realistic to me.
>
> How do you think about this?
>
>
> On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org> wrote:
>
> > Hi Hyukjin,
> >
> > On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> >
> > > Hi all,
> > >
> > > Is here the right place to expect feedback from the infra team or
> related
> > > people?
> > > It would be great to hear what the infra team thinks about Jarek's
> > > suggestion.
> > >
> >
> > What suggestion exactly do you mean ?
> > I've just re-read Jarek's email and I see 3 tasks for Github Actions
> team,
> > but nothing specific for Apache Infra team.
> >
> >
> > >
> > >
> > > On Tue, Apr 13, 2021 at 11:15 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >> Could we have any update and feedback from the INFRA team about
> Jarek's
> > >> suggestion please?
> > >>
> > >> On Fri, Apr 9, 2021 at 7:06 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
> > >>
> > >>>
> > >>>> That's a good idea. We do need to thank Github to give free
> resources
> > to
> > >>>> ASF projects, but it's better if we can make it a business: we allow
> > >>>> individual projects to sign deals with Github to get dedicated
> > >>>> resources.
> > >>>> It's a bit wasteful to ask every project to set up its own dev ops,
> > >>>> using Github Action is more convenient. Maybe we should raise it to
> > >>>> Github?
> > >>>>
> > >>>
> > >>> I do not think you can get per-project resources in GH - the most you
> > >>> can do are self-hosted runners for your project.
> > >>>
> > >>> (BTW I am not from the INFRA team - just a humble "CI person" of
> Apache
> > >>> Airflow but very much vested into Github Actions)
> > >>> maybe the infra team can chime in here. We did raise it to GitHub, we
> > >>> even had meeting with them
> > >>> organized by Gavin and several topics were raised that could be
> > >>> eventually addressed by Github:
> > >>>
> > >>> - observability (they could not give us per-project usage dashboard -
> > we
> > >>> built our own imperfect (with API limitations) one by Tobiasz from
> > Airllow
> > >>> - security (limiting access to only project committers) - this we
> > >>> handled by the Ash's fork of Runner (but it's also imperfect - even
> > today I
> > >>> had to fix a problem where we had list of committers desynchronised
> > between
> > >>> our infra/CI.yml)
> > >>> - manageability (assigning resources per-project) - this works by
> > having
> > >>> self-hosted runners assigned per project (we needed infra JIRA ticket
> > and
> > >>> generation of a bunch of tokens for our runners and our own AWS
> account
> > >>> with auto-scaling).
> > >>>
> > >>> It would be indeed great if it could be available from GitHub, but so
> > >>> far we do not have any of those.
> > >>>
> > >>> J.
> > >>>
> > >>>
> > >>>
> > >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>> > Thanks Martin for your feedback.
> > >>>> >
> > >>>> > > What was your reason to migrate from Apache Jenkins to Github
> > >>>> Actions ?
> > >>>> >
> > >>>> > I am sure there were more reasons for migrating from Amplap
> Jenkins
> > >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but
> as
> > >>>> far as
> > >>>> > I can remember:
> > >>>> > - To reduce the maintenance cost of machines
> > >>>> > - The Jenkins machines became unstable and slow causing CI jobs to
> > >>>> fail or
> > >>>> > be very flaky.
> > >>>> > - Difficulty to manage the installed libraries.
> > >>>> > - Intermittent unknown issues in the machines
> > >>>> >
> > >>>> > Yes, one option might be to consider other options to migrate
> again.
> > >>>> > However, other projects will very likely suffer the
> > >>>> > same problem. In addition, the migration in a large project is not
> > an
> > >>>> > easy work to do
> > >>>> >
> > >>>> > I would like to know the feasibility of having more resources in
> > >>>> GitHub
> > >>>> > Actions, or, for example, having sub-groups where
> > >>>> > each group shares the resources - currently one GitHub
> organisation
> > >>>> shares
> > >>>> > all resources across the projects.
> > >>>> >
> > >>>> >
> > >>>> > On Wed, Apr 7, 2021 at 10:04 PM, Martin Grigorov <mgrigorov@apache.org> wrote:
> > >>>> >
> > >>>> >>
> > >>>> >>
> > >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gurwls223@gmail.com
> >
> > >>>> wrote:
> > >>>> >>
> > >>>> >>> Hi Greg,
> > >>>> >>>
> > >>>> >>> I raised this thread to figure out a way that we can work
> together
> > >>>> to
> > >>>> >>> resolve this issue, gather feedback, and to understand how other
> > >>>> projects
> > >>>> >>> work around.
> > >>>> >>> Several projects I observed, as far as I can tell, have made
> > enough
> > >>>> >>> efforts
> > >>>> >>> to save the resources in GitHub Actions but still suffer from
> the
> > >>>> lack of
> > >>>> >>> resources.
> > >>>> >>>
> > >>>> >>
> > >>>> >> And it will get even worse because:
> > >>>> >> 1) more and more Apache projects migrate from TravisCI to Github
> > >>>> Actions
> > >>>> >> (GA)
> > >>>> >> 2) new projects join ASF and many of them already use GA
> > >>>> >>
> > >>>> >>
> > >>>> >> What was your reason to migrate from Apache Jenkins to Github
> > >>>> Actions ?
> > >>>> >> If you want dedicated resources then you will need to manage the
> CI
> > >>>> >> yourself.
> > >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for
> > your
> > >>>> >> project.
> > >>>> >> Or you could set up your own CI infrastructure with Jenkins,
> > DroneIO,
> > >>>> >> ConcourceCI, ...
> > >>>> >>
> > >>>> >> Yet another option is to move to CircleCI or Cirrus. They are
> > >>>> similar to
> > >>>> >> TravisCI / GA and less crowded (for now).
> > >>>> >>
> > >>>> >> Martin
> > >>>> >>
> > >>>> >> I appreciate the resources provided to us but that does not
> resolve
> > >>>> the
> > >>>> >>> issue of the development being slowed down.
> > >>>> >>>
> > >>>> >>>
> > >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM, Greg Stein <gs...@gmail.com> wrote:
> > >>>> >>>
> > >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <
> > gurwls223@gmail.com
> > >>>> >
> > >>>> >>> wrote:
> > >>>> >>> >
> > >>>> >>> >> Hi all,
> > >>>> >>> >>
> > >>>> >>> >> I am an Apache Spark PMC,
> > >>>> >>> >
> > >>>> >>> >
> > >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
> > >>>> Please
> > >>>> >>> stop
> > >>>> >>> > with that terminology. The Foundation has about 200 PMCs, and
> > you
> > >>>> are a
> > >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a
> person. A
> > >>>> PMC
> > >>>> >>> is a
> > >>>> >>> > construct of the Foundation.
> > >>>> >>> >
> > >>>> >>> > >...
> > >>>> >>> >
> > >>>> >>> >> I am aware of the limited GitHub Actions resources that are
> > >>>> shared
> > >>>> >>> >> across all projects in ASF,
> > >>>> >>> >> and many projects suffer from it. This issue significantly
> > slows
> > >>>> down
> > >>>> >>> the
> > >>>> >>> >> development cycle of
> > >>>> >>> >>  other projects, at least Apache Spark.
> > >>>> >>> >>
> > >>>> >>> >
> > >>>> >>> > And the Foundation gets those build minutes for GitHub Actions
> > >>>> >>> provided to
> > >>>> >>> > us from GitHub and Microsoft, and we are thankful that they
> > >>>> provide
> > >>>> >>> them to
> > >>>> >>> > the Foundation. Maybe it isn't all the build minutes that
> every
> > >>>> group
> > >>>> >>> > wants, but that is what we have. So it is incumbent upon all
> of
> > >>>> us to
> > >>>> >>> > figure out how to build more, with fewer minutes.
> > >>>> >>> >
> > >>>> >>> > Say "thank you" to GitHub, please.
> > >>>> >>> >
> > >>>> >>> > Regards,
> > >>>> >>> > -g
> > >>>> >>> >
> > >>>> >>> >
> > >>>> >>>
> > >>>> >>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> +48 660 796 129
> > >>>
> > >>
> >
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Martin Grigorov <mg...@apache.org>.
On Fri, Apr 16, 2021 at 4:07 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> Just for a note, I already attempted to interact with GitHub team about
> resources problems a long ago with tagging infra team, and the infra team
> informed me that all ASF issues have to be discussed with them and
> addressed by them if I am remembering correctly.
>
> It would be great to understand what has changed if I am supposed to
> independently discuss with the GitHub team to address this now.
>

I guess it depends on what you want to achieve.
If you want the same GitHub Actions, just with more resources for Apache
projects, then you should talk to Apache Infra first.
But if you find an issue/improvement in GitHub Actions then you can create
an issue in their projects, and once it is accepted and implemented the
new functionality will be available to any project (not just Apache ones)
that uses GitHub Actions. For example, Apache Airflow team members have
opened Pull Requests with improvements related to self-hosted runners and
security, but until they are merged the only way to use the improvements is
by using their forks (as they do in the linked workflow in my previous email).


>
> On Fri, 16 Apr 2021, 22:00 Martin Grigorov, <mg...@apache.org> wrote:
>
>>
>>
>> On Fri, Apr 16, 2021 at 3:44 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> I thought Jarek was pretty clear on that. I meant this:
>>>
>>> > So it all has to start with 'per-project' resource limitation and self-
>>> > budgeting. It would be GREAT if infra.could provide self-hosted GitHub
>>> > Runners SERVICE per project, where project could donate credits or
>>> money
>>> > for their own account, then the projects would have incentive to
>>> optimize
>>> > their own usage. I imagine this would be the best thing since the
>>> sliced
>>> > bread that INFRA could provide to all the projects.
>>>
>>> Maintaining and providing a self-hosted runners in GitHub Actions where
>>> the resources are managed in project level where each project can donate
>>> credits.
>>>
>>> In addition, Jarek mentioned that Airflow already has a working version
>>> - is it correct Jarek?
>>>
>>> If the infra team takes and improves it for other ASF projects, that
>>> would permanently resolve this issue.
>>>
>>> This suggestion looks reasonable and realistic to me.
>>>
>>> How do you think about this?
>>>
>>
>> I'll let Infra team respond for themselves but to me all these
>> improvements should be done by Github Actions team, not by each and every
>> project out there.
>> But if your project wants to use Apache Airflow's modifications then you
>> can do it - just follow what they did at
>> https://github.com/apache/airflow/blob/master/.github/workflows/ci.yml
>>
>>
>>>
>>>
>>> On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org>
>>> wrote:
>>>
>>>> Hi Hyukjin,
>>>>
>>>> On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi all,
>>>> >
>>>> > Is here the right place to expect feedback from the infra team or
>>>> related
>>>> > people?
>>>> > It would be great to hear what the infra team thinks about Jarek's
>>>> > suggestion.
>>>> >
>>>>
>>>> What suggestion exactly do you mean ?
>>>> I've just re-read Jarek's email and I see 3 tasks for Github Actions
>>>> team,
>>>> but nothing specific for Apache Infra team.
>>>>
>>>>
>>>> >
>>>> >
>>>> > On Tue, Apr 13, 2021 at 11:15 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>> >
>>>> >> Hi all,
>>>> >>
>>>> >> Could we have any update and feedback from the INFRA team about
>>>> Jarek's
>>>> >> suggestion please?
>>>> >>
>>>> >> On Fri, Apr 9, 2021 at 7:06 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>> >>
>>>> >>>
>>>> >>>> That's a good idea. We do need to thank Github to give free
>>>> resources to
>>>> >>>> ASF projects, but it's better if we can make it a business: we
>>>> allow
>>>> >>>> individual projects to sign deals with Github to get dedicated
>>>> >>>> resources.
>>>> >>>> It's a bit wasteful to ask every project to set up its own dev ops,
>>>> >>>> using Github Action is more convenient. Maybe we should raise it to
>>>> >>>> Github?
>>>> >>>>
>>>> >>>
>>>> >>> I do not think you can get per-project resources in GH - the most
>>>> you
>>>> >>> can do are self-hosted runners for your project.
>>>> >>>
>>>> >>> (BTW I am not from the INFRA team - just a humble "CI person" of
>>>> Apache
>>>> >>> Airflow but very much vested into Github Actions)
>>>> >>> maybe the infra team can chime in here. We did raise it to GitHub,
>>>> we
>>>> >>> even had meeting with them
>>>> >>> organized by Gavin and several topics were raised that could be
>>>> >>> eventually addressed by Github:
>>>> >>>
>>>> >>> - observability (they could not give us per-project usage dashboard
>>>> - we
>>>> >>> built our own imperfect (with API limitations) one by Tobiasz from
>>>> Airllow
>>>> >>> - security (limiting access to only project committers) - this we
>>>> >>> handled by the Ash's fork of Runner (but it's also imperfect - even
>>>> today I
>>>> >>> had to fix a problem where we had list of committers desynchronised
>>>> between
>>>> >>> our infra/CI.yml)
>>>> >>> - manageability (assigning resources per-project) - this works by
>>>> having
>>>> >>> self-hosted runners assigned per project (we needed infra JIRA
>>>> ticket and
>>>> >>> generation of a bunch of tokens for our runners and our own AWS
>>>> account
>>>> >>> with auto-scaling).
>>>> >>>
>>>> >>> It would be indeed great if it could be available from GitHub, but
>>>> so
>>>> >>> far we do not have any of those.
>>>> >>>
>>>> >>> J.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
>>>> >>>> wrote:
>>>> >>>>
>>>> >>>> > Thanks Martin for your feedback.
>>>> >>>> >
>>>> >>>> > > What was your reason to migrate from Apache Jenkins to Github
>>>> >>>> Actions ?
>>>> >>>> >
>>>> >>>> > I am sure there were more reasons for migrating from Amplap
>>>> Jenkins
>>>> >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but
>>>> as
>>>> >>>> far as
>>>> >>>> > I can remember:
>>>> >>>> > - To reduce the maintenance cost of machines
>>>> >>>> > - The Jenkins machines became unstable and slow causing CI jobs
>>>> to
>>>> >>>> fail or
>>>> >>>> > be very flaky.
>>>> >>>> > - Difficulty to manage the installed libraries.
>>>> >>>> > - Intermittent unknown issues in the machines
>>>> >>>> >
>>>> >>>> > Yes, one option might be to consider other options to migrate
>>>> again.
>>>> >>>> > However, other projects will very likely suffer the
>>>> >>>> > same problem. In addition, the migration in a large project is
>>>> not an
>>>> >>>> > easy work to do
>>>> >>>> >
>>>> >>>> > I would like to know the feasibility of having more resources in
>>>> >>>> GitHub
>>>> >>>> > Actions, or, for example, having sub-groups where
>>>> >>>> > each group shares the resources - currently one GitHub
>>>> organisation
>>>> >>>> shares
>>>> >>>> > all resources across the projects.
>>>> >>>> >
>>>> >>>> >
>>>> >>>> > On Wed, Apr 7, 2021 at 10:04 PM, Martin Grigorov <mgrigorov@apache.org>
>>>> >>>> > wrote:
>>>> >>>> >
>>>> >>>> >>
>>>> >>>> >>
>>>> >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <
>>>> gurwls223@gmail.com>
>>>> >>>> wrote:
>>>> >>>> >>
>>>> >>>> >>> Hi Greg,
>>>> >>>> >>>
>>>> >>>> >>> I raised this thread to figure out a way that we can work
>>>> together
>>>> >>>> to
>>>> >>>> >>> resolve this issue, gather feedback, and to understand how
>>>> other
>>>> >>>> projects
>>>> >>>> >>> work around.
>>>> >>>> >>> Several projects I observed, as far as I can tell, have made
>>>> enough
>>>> >>>> >>> efforts
>>>> >>>> >>> to save the resources in GitHub Actions but still suffer from
>>>> the
>>>> >>>> lack of
>>>> >>>> >>> resources.
>>>> >>>> >>>
>>>> >>>> >>
>>>> >>>> >> And it will get even worse because:
>>>> >>>> >> 1) more and more Apache projects migrate from TravisCI to Github
>>>> >>>> Actions
>>>> >>>> >> (GA)
>>>> >>>> >> 2) new projects join ASF and many of them already use GA
>>>> >>>> >>
>>>> >>>> >>
>>>> >>>> >> What was your reason to migrate from Apache Jenkins to Github
>>>> >>>> Actions ?
>>>> >>>> >> If you want dedicated resources then you will need to manage
>>>> the CI
>>>> >>>> >> yourself.
>>>> >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for
>>>> your
>>>> >>>> >> project.
>>>> >>>> >> Or you could set up your own CI infrastructure with Jenkins,
>>>> DroneIO,
>>>> >>>> >> Concourse CI, ...
>>>> >>>> >>
>>>> >>>> >> Yet another option is to move to CircleCI or Cirrus. They are
>>>> >>>> similar to
>>>> >>>> >> TravisCI / GA and less crowded (for now).
>>>> >>>> >>
>>>> >>>> >> Martin
>>>> >>>> >>
>>>> >>>> >> I appreciate the resources provided to us but that does not
>>>> resolve
>>>> >>>> the
>>>> >>>> >>> issue of the development being slowed down.
>>>> >>>> >>>
>>>> >>>> >>>
>>>> >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM, Greg Stein <gs...@gmail.com> wrote:
>>>> >>>> >>>
>>>> >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <
>>>> gurwls223@gmail.com
>>>> >>>> >
>>>> >>>> >>> wrote:
>>>> >>>> >>> >
>>>> >>>> >>> >> Hi all,
>>>> >>>> >>> >>
>>>> >>>> >>> >> I am an Apache Spark PMC,
>>>> >>>> >>> >
>>>> >>>> >>> >
>>>> >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a
>>>> PMC.
>>>> >>>> Please
>>>> >>>> >>> stop
>>>> >>>> >>> > with that terminology. The Foundation has about 200 PMCs,
>>>> and you
>>>> >>>> are a
>>>> >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a
>>>> person. A
>>>> >>>> PMC
>>>> >>>> >>> is a
>>>> >>>> >>> > construct of the Foundation.
>>>> >>>> >>> >
>>>> >>>> >>> > >...
>>>> >>>> >>> >
>>>> >>>> >>> >> I am aware of the limited GitHub Actions resources that are
>>>> >>>> shared
>>>> >>>> >>> >> across all projects in ASF,
>>>> >>>> >>> >> and many projects suffer from it. This issue significantly
>>>> slows
>>>> >>>> down
>>>> >>>> >>> the
>>>> >>>> >>> >> development cycle of
>>>> >>>> >>> >>  other projects, at least Apache Spark.
>>>> >>>> >>> >>
>>>> >>>> >>> >
>>>> >>>> >>> > And the Foundation gets those build minutes for GitHub
>>>> Actions
>>>> >>>> >>> provided to
>>>> >>>> >>> > us from GitHub and Microsoft, and we are thankful that they
>>>> >>>> provide
>>>> >>>> >>> them to
>>>> >>>> >>> > the Foundation. Maybe it isn't all the build minutes that
>>>> every
>>>> >>>> group
>>>> >>>> >>> > wants, but that is what we have. So it is incumbent upon all
>>>> of
>>>> >>>> us to
>>>> >>>> >>> > figure out how to build more, with fewer minutes.
>>>> >>>> >>> >
>>>> >>>> >>> > Say "thank you" to GitHub, please.
>>>> >>>> >>> >
>>>> >>>> >>> > Regards,
>>>> >>>> >>> > -g
>>>> >>>> >>> >
>>>> >>>> >>> >
>>>> >>>> >>>
>>>> >>>> >>
>>>> >>>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> +48 660 796 129
>>>> >>>
>>>> >>
>>>>
>>>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Just a note: I already tried to raise the resource problem with the GitHub
team a long time ago, tagging the infra team, and the infra team informed me
that all ASF issues have to be discussed with them and addressed by them, if
I remember correctly.

It would be great to understand what has changed, if I am now supposed to
discuss this with the GitHub team independently.

On Fri, 16 Apr 2021, 22:00 Martin Grigorov, <mg...@apache.org> wrote:

>
>
> On Fri, Apr 16, 2021 at 3:44 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> I thought Jarek was pretty clear on that. I meant this:
>>
>> > So it all has to start with 'per-project' resource limitation and
>> > self-budgeting. It would be GREAT if infra could provide self-hosted
>> > GitHub Runners SERVICE per project, where a project could donate credits
>> > or money for its own account; then the projects would have an incentive
>> > to optimize their own usage. I imagine this would be the best thing
>> > since sliced bread that INFRA could provide to all the projects.
>>
>> That is, maintaining and providing self-hosted runners for GitHub Actions,
>> where the resources are managed at the project level and each project can
>> donate credits.
>>
>> In addition, Jarek mentioned that Airflow already has a working version -
>> is that correct, Jarek?
>>
>> If the infra team takes it and improves it for other ASF projects, that
>> would permanently resolve this issue.
>>
>> This suggestion looks reasonable and realistic to me.
>>
>> What do you think about this?
>>
>
> I'll let the Infra team respond for themselves, but to me all these
> improvements should be done by the GitHub Actions team, not by each and
> every project out there.
> But if your project wants to use Apache Airflow's modifications, then you
> can do it - just follow what they did at
> https://github.com/apache/airflow/blob/master/.github/workflows/ci.yml
>
>
>>
>>
>> On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org> wrote:
>>
>>> Hi Hyukjin,
>>>
>>> On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>> > Hi all,
>>> >
>>> > Is this the right place to expect feedback from the infra team or
>>> > related people?
>>> > It would be great to hear what the infra team thinks about Jarek's
>>> > suggestion.
>>> >
>>>
>>> Which suggestion exactly do you mean?
>>> I've just re-read Jarek's email and I see 3 tasks for the GitHub Actions
>>> team, but nothing specific for the Apache Infra team.
>>>
>>>
>>> >
>>> >
>>> > On Tue, Apr 13, 2021 at 11:15 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
>>> >
>>> >> Hi all,
>>> >>
>>> >> Could we have any update and feedback from the INFRA team about
>>> Jarek's
>>> >> suggestion please?
>>> >>
>>> >> On Fri, Apr 9, 2021 at 7:06 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >>
>>> >>>
>>> >>>> That's a good idea. We do need to thank Github to give free
>>> resources to
>>> >>>> ASF projects, but it's better if we can make it a business: we allow
>>> >>>> individual projects to sign deals with Github to get dedicated
>>> >>>> resources.
>>> >>>> It's a bit wasteful to ask every project to set up its own dev ops,
>>> >>>> using Github Action is more convenient. Maybe we should raise it to
>>> >>>> Github?
>>> >>>>
>>> >>>
>>> >>> I do not think you can get per-project resources in GH - the most you
>>> >>> can do are self-hosted runners for your project.
>>> >>>
>>> >>> (BTW I am not from the INFRA team - just a humble "CI person" of
>>> Apache
>>> >>> Airflow but very much vested into Github Actions)
>>> >>> maybe the infra team can chime in here. We did raise it to GitHub, we
>>> >>> even had meeting with them
>>> >>> organized by Gavin and several topics were raised that could be
>>> >>> eventually addressed by Github:
>>> >>>
>>> >>> - observability (they could not give us per-project usage dashboard
>>> - we
>>> >>> built our own imperfect (with API limitations) one by Tobiasz from
>>> Airflow)
>>> >>> - security (limiting access to only project committers) - this we
>>> >>> handled by the Ash's fork of Runner (but it's also imperfect - even
>>> today I
>>> >>> had to fix a problem where we had list of committers desynchronised
>>> between
>>> >>> our infra/CI.yml)
>>> >>> - manageability (assigning resources per-project) - this works by
>>> having
>>> >>> self-hosted runners assigned per project (we needed infra JIRA
>>> ticket and
>>> >>> generation of a bunch of tokens for our runners and our own AWS
>>> account
>>> >>> with auto-scaling).
>>> >>>
>>> >>> It would be indeed great if it could be available from GitHub, but so
>>> >>> far we do not have any of those.
>>> >>>
>>> >>> J.
>>> >>>
>>> >>>
>>> >>>
>>> >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
>>> >>>> wrote:
>>> >>>>
>>> >>>> > Thanks Martin for your feedback.
>>> >>>> >
>>> >>>> > > What was your reason to migrate from Apache Jenkins to Github
>>> >>>> Actions ?
>>> >>>> >
>>> >>>> > I am sure there were more reasons for migrating from AMPLab
>>> Jenkins
>>> >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but
>>> as
>>> >>>> far as
>>> >>>> > I can remember:
>>> >>>> > - To reduce the maintenance cost of machines
>>> >>>> > - The Jenkins machines became unstable and slow causing CI jobs to
>>> >>>> fail or
>>> >>>> > be very flaky.
>>> >>>> > - Difficulty to manage the installed libraries.
>>> >>>> > - Intermittent unknown issues in the machines
>>> >>>> >
>>> >>>> > Yes, one option might be to consider other options to migrate
>>> again.
>>> >>>> > However, other projects will very likely suffer the
>>> >>>> > same problem. In addition, the migration in a large project is
>>> not an
>>> >>>> > easy work to do
>>> >>>> >
>>> >>>> > I would like to know the feasibility of having more resources in
>>> >>>> GitHub
>>> >>>> > Actions, or, for example, having sub-groups where
>>> >>>> > each group shares the resources - currently one GitHub
>>> organisation
>>> >>>> shares
>>> >>>> > all resources across the projects.
>>> >>>> >
>>> >>>> >
>>> >>>> > On Wed, Apr 7, 2021 at 10:04 PM, Martin Grigorov <mgrigorov@apache.org>
>>> >>>> > wrote:
>>> >>>> >
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gurwls223@gmail.com
>>> >
>>> >>>> wrote:
>>> >>>> >>
>>> >>>> >>> Hi Greg,
>>> >>>> >>>
>>> >>>> >>> I raised this thread to figure out a way that we can work
>>> together
>>> >>>> to
>>> >>>> >>> resolve this issue, gather feedback, and to understand how other
>>> >>>> projects
>>> >>>> >>> work around.
>>> >>>> >>> Several projects I observed, as far as I can tell, have made
>>> enough
>>> >>>> >>> efforts
>>> >>>> >>> to save the resources in GitHub Actions but still suffer from
>>> the
>>> >>>> lack of
>>> >>>> >>> resources.
>>> >>>> >>>
>>> >>>> >>
>>> >>>> >> And it will get even worse because:
>>> >>>> >> 1) more and more Apache projects migrate from TravisCI to Github
>>> >>>> Actions
>>> >>>> >> (GA)
>>> >>>> >> 2) new projects join ASF and many of them already use GA
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> What was your reason to migrate from Apache Jenkins to Github
>>> >>>> Actions ?
>>> >>>> >> If you want dedicated resources then you will need to manage the
>>> CI
>>> >>>> >> yourself.
>>> >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for
>>> your
>>> >>>> >> project.
>>> >>>> >> Or you could set up your own CI infrastructure with Jenkins,
>>> DroneIO,
>>> >>>> >> Concourse CI, ...
>>> >>>> >>
>>> >>>> >> Yet another option is to move to CircleCI or Cirrus. They are
>>> >>>> similar to
>>> >>>> >> TravisCI / GA and less crowded (for now).
>>> >>>> >>
>>> >>>> >> Martin
>>> >>>> >>
>>> >>>> >> I appreciate the resources provided to us but that does not
>>> resolve
>>> >>>> the
>>> >>>> >>> issue of the development being slowed down.
>>> >>>> >>>
>>> >>>> >>>
>>> >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM, Greg Stein <gs...@gmail.com> wrote:
>>> >>>> >>>
>>> >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <
>>> gurwls223@gmail.com
>>> >>>> >
>>> >>>> >>> wrote:
>>> >>>> >>> >
>>> >>>> >>> >> Hi all,
>>> >>>> >>> >>
>>> >>>> >>> >> I am an Apache Spark PMC,
>>> >>>> >>> >
>>> >>>> >>> >
>>> >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
>>> >>>> Please
>>> >>>> >>> stop
>>> >>>> >>> > with that terminology. The Foundation has about 200 PMCs, and
>>> you
>>> >>>> are a
>>> >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a
>>> person. A
>>> >>>> PMC
>>> >>>> >>> is a
>>> >>>> >>> > construct of the Foundation.
>>> >>>> >>> >
>>> >>>> >>> > >...
>>> >>>> >>> >
>>> >>>> >>> >> I am aware of the limited GitHub Actions resources that are
>>> >>>> shared
>>> >>>> >>> >> across all projects in ASF,
>>> >>>> >>> >> and many projects suffer from it. This issue significantly
>>> slows
>>> >>>> down
>>> >>>> >>> the
>>> >>>> >>> >> development cycle of
>>> >>>> >>> >>  other projects, at least Apache Spark.
>>> >>>> >>> >>
>>> >>>> >>> >
>>> >>>> >>> > And the Foundation gets those build minutes for GitHub Actions
>>> >>>> >>> provided to
>>> >>>> >>> > us from GitHub and Microsoft, and we are thankful that they
>>> >>>> provide
>>> >>>> >>> them to
>>> >>>> >>> > the Foundation. Maybe it isn't all the build minutes that
>>> every
>>> >>>> group
>>> >>>> >>> > wants, but that is what we have. So it is incumbent upon all
>>> of
>>> >>>> us to
>>> >>>> >>> > figure out how to build more, with fewer minutes.
>>> >>>> >>> >
>>> >>>> >>> > Say "thank you" to GitHub, please.
>>> >>>> >>> >
>>> >>>> >>> > Regards,
>>> >>>> >>> > -g
>>> >>>> >>> >
>>> >>>> >>> >
>>> >>>> >>>
>>> >>>> >>
>>> >>>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> +48 660 796 129
>>> >>>
>>> >>
>>>
>>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Martin Grigorov <mg...@apache.org>.
On Fri, Apr 16, 2021 at 3:44 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> I thought Jarek was pretty clear on that. I meant this:
>
> > So it all has to start with 'per-project' resource limitation and
> > self-budgeting. It would be GREAT if infra could provide self-hosted
> > GitHub Runners SERVICE per project, where a project could donate credits
> > or money for its own account; then the projects would have an incentive
> > to optimize their own usage. I imagine this would be the best thing
> > since sliced bread that INFRA could provide to all the projects.
>
> That is, maintaining and providing self-hosted runners for GitHub Actions,
> where the resources are managed at the project level and each project can
> donate credits.
>
> In addition, Jarek mentioned that Airflow already has a working version -
> is that correct, Jarek?
>
> If the infra team takes it and improves it for other ASF projects, that
> would permanently resolve this issue.
>
> This suggestion looks reasonable and realistic to me.
>
> What do you think about this?
>

I'll let the Infra team respond for themselves, but to me all these
improvements should be done by the GitHub Actions team, not by each and
every project out there.
But if your project wants to use Apache Airflow's modifications, then you
can do it - just follow what they did at
https://github.com/apache/airflow/blob/master/.github/workflows/ci.yml


>
>
> On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org> wrote:
>
>> Hi Hyukjin,
>>
>> On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > Is this the right place to expect feedback from the infra team or
>> > related people?
>> > It would be great to hear what the infra team thinks about Jarek's
>> > suggestion.
>> >
>>
>> Which suggestion exactly do you mean?
>> I've just re-read Jarek's email and I see 3 tasks for the GitHub Actions
>> team, but nothing specific for the Apache Infra team.
>>
>>
>> >
>> >
>> > On Tue, Apr 13, 2021 at 11:15 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
>> >
>> >> Hi all,
>> >>
>> >> Could we have any update and feedback from the INFRA team about Jarek's
>> >> suggestion please?
>> >>
>> >> On Fri, Apr 9, 2021 at 7:06 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
>> >>
>> >>>
>> >>>> That's a good idea. We do need to thank Github to give free
>> resources to
>> >>>> ASF projects, but it's better if we can make it a business: we allow
>> >>>> individual projects to sign deals with Github to get dedicated
>> >>>> resources.
>> >>>> It's a bit wasteful to ask every project to set up its own dev ops,
>> >>>> using Github Action is more convenient. Maybe we should raise it to
>> >>>> Github?
>> >>>>
>> >>>
>> >>> I do not think you can get per-project resources in GH - the most you
>> >>> can do are self-hosted runners for your project.
>> >>>
>> >>> (BTW I am not from the INFRA team - just a humble "CI person" of
>> Apache
>> >>> Airflow but very much vested into Github Actions)
>> >>> maybe the infra team can chime in here. We did raise it to GitHub, we
>> >>> even had meeting with them
>> >>> organized by Gavin and several topics were raised that could be
>> >>> eventually addressed by Github:
>> >>>
>> >>> - observability (they could not give us per-project usage dashboard -
>> we
>> >>> built our own imperfect (with API limitations) one by Tobiasz from
>> Airflow)
>> >>> - security (limiting access to only project committers) - this we
>> >>> handled by the Ash's fork of Runner (but it's also imperfect - even
>> today I
>> >>> had to fix a problem where we had list of committers desynchronised
>> between
>> >>> our infra/CI.yml)
>> >>> - manageability (assigning resources per-project) - this works by
>> having
>> >>> self-hosted runners assigned per project (we needed infra JIRA ticket
>> and
>> >>> generation of a bunch of tokens for our runners and our own AWS
>> account
>> >>> with auto-scaling).
>> >>>
>> >>> It would be indeed great if it could be available from GitHub, but so
>> >>> far we do not have any of those.
>> >>>
>> >>> J.
>> >>>
>> >>>
>> >>>
>> >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>> > Thanks Martin for your feedback.
>> >>>> >
>> >>>> > > What was your reason to migrate from Apache Jenkins to Github
>> >>>> Actions ?
>> >>>> >
>> >>>> > I am sure there were more reasons for migrating from AMPLab Jenkins
>> >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as
>> >>>> far as
>> >>>> > I can remember:
>> >>>> > - To reduce the maintenance cost of machines
>> >>>> > - The Jenkins machines became unstable and slow causing CI jobs to
>> >>>> fail or
>> >>>> > be very flaky.
>> >>>> > - Difficulty to manage the installed libraries.
>> >>>> > - Intermittent unknown issues in the machines
>> >>>> >
>> >>>> > Yes, one option might be to consider other options to migrate
>> again.
>> >>>> > However, other projects will very likely suffer the
>> >>>> > same problem. In addition, the migration in a large project is not
>> an
>> >>>> > easy work to do
>> >>>> >
>> >>>> > I would like to know the feasibility of having more resources in
>> >>>> GitHub
>> >>>> > Actions, or, for example, having sub-groups where
>> >>>> > each group shares the resources - currently one GitHub organisation
>> >>>> shares
>> >>>> > all resources across the projects.
>> >>>> >
>> >>>> >
>> >>>> > On Wed, Apr 7, 2021 at 10:04 PM, Martin Grigorov <mg...@apache.org> wrote:
>> >>>> >
>> >>>> >>
>> >>>> >>
>> >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
>> >>>> wrote:
>> >>>> >>
>> >>>> >>> Hi Greg,
>> >>>> >>>
>> >>>> >>> I raised this thread to figure out a way that we can work
>> together
>> >>>> to
>> >>>> >>> resolve this issue, gather feedback, and to understand how other
>> >>>> projects
>> >>>> >>> work around.
>> >>>> >>> Several projects I observed, as far as I can tell, have made
>> enough
>> >>>> >>> efforts
>> >>>> >>> to save the resources in GitHub Actions but still suffer from the
>> >>>> lack of
>> >>>> >>> resources.
>> >>>> >>>
>> >>>> >>
>> >>>> >> And it will get even worse because:
>> >>>> >> 1) more and more Apache projects migrate from TravisCI to Github
>> >>>> Actions
>> >>>> >> (GA)
>> >>>> >> 2) new projects join ASF and many of them already use GA
>> >>>> >>
>> >>>> >>
>> >>>> >> What was your reason to migrate from Apache Jenkins to Github
>> >>>> Actions ?
>> >>>> >> If you want dedicated resources then you will need to manage the
>> CI
>> >>>> >> yourself.
>> >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for
>> your
>> >>>> >> project.
>> >>>> >> Or you could set up your own CI infrastructure with Jenkins,
>> DroneIO,
>> >>>> >> Concourse CI, ...
>> >>>> >>
>> >>>> >> Yet another option is to move to CircleCI or Cirrus. They are
>> >>>> similar to
>> >>>> >> TravisCI / GA and less crowded (for now).
>> >>>> >>
>> >>>> >> Martin
>> >>>> >>
>> >>>> >> I appreciate the resources provided to us but that does not
>> resolve
>> >>>> the
>> >>>> >>> issue of the development being slowed down.
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM, Greg Stein <gs...@gmail.com> wrote:
>> >>>> >>>
>> >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <
>> gurwls223@gmail.com
>> >>>> >
>> >>>> >>> wrote:
>> >>>> >>> >
>> >>>> >>> >> Hi all,
>> >>>> >>> >>
>> >>>> >>> >> I am an Apache Spark PMC,
>> >>>> >>> >
>> >>>> >>> >
>> >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
>> >>>> Please
>> >>>> >>> stop
>> >>>> >>> > with that terminology. The Foundation has about 200 PMCs, and
>> you
>> >>>> are a
>> >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a person.
>> A
>> >>>> PMC
>> >>>> >>> is a
>> >>>> >>> > construct of the Foundation.
>> >>>> >>> >
>> >>>> >>> > >...
>> >>>> >>> >
>> >>>> >>> >> I am aware of the limited GitHub Actions resources that are
>> >>>> shared
>> >>>> >>> >> across all projects in ASF,
>> >>>> >>> >> and many projects suffer from it. This issue significantly
>> slows
>> >>>> down
>> >>>> >>> the
>> >>>> >>> >> development cycle of
>> >>>> >>> >>  other projects, at least Apache Spark.
>> >>>> >>> >>
>> >>>> >>> >
>> >>>> >>> > And the Foundation gets those build minutes for GitHub Actions
>> >>>> >>> provided to
>> >>>> >>> > us from GitHub and Microsoft, and we are thankful that they
>> >>>> provide
>> >>>> >>> them to
>> >>>> >>> > the Foundation. Maybe it isn't all the build minutes that every
>> >>>> group
>> >>>> >>> > wants, but that is what we have. So it is incumbent upon all of
>> >>>> us to
>> >>>> >>> > figure out how to build more, with fewer minutes.
>> >>>> >>> >
>> >>>> >>> > Say "thank you" to GitHub, please.
>> >>>> >>> >
>> >>>> >>> > Regards,
>> >>>> >>> > -g
>> >>>> >>> >
>> >>>> >>> >
>> >>>> >>>
>> >>>> >>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> +48 660 796 129
>> >>>
>> >>
>>
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
I thought Jarek was pretty clear on that. I meant this:

> So it all has to start with 'per-project' resource limitation and
> self-budgeting. It would be GREAT if infra could provide self-hosted
> GitHub Runners SERVICE per project, where a project could donate credits
> or money for its own account; then the projects would have an incentive
> to optimize their own usage. I imagine this would be the best thing
> since sliced bread that INFRA could provide to all the projects.

That is, maintaining and providing self-hosted runners for GitHub Actions,
where the resources are managed at the project level and each project can
donate credits.

In addition, Jarek mentioned that Airflow already has a working version -
is that correct, Jarek?

If the infra team takes it and improves it for other ASF projects, that would
permanently resolve this issue.

This suggestion looks reasonable and realistic to me.

What do you think about this?


On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mg...@apache.org> wrote:

> Hi Hyukjin,
>
> On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> > Hi all,
> >
> > Is this the right place to expect feedback from the infra team or related
> > people?
> > It would be great to hear what the infra team thinks about Jarek's
> > suggestion.
> >
>
> Which suggestion exactly do you mean?
> I've just re-read Jarek's email and I see 3 tasks for the GitHub Actions
> team, but nothing specific for the Apache Infra team.
>
>
> >
> >
> > On Tue, Apr 13, 2021 at 11:15 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> Could we have any update and feedback from the INFRA team about Jarek's
> >> suggestion please?
> >>
> >> On Fri, Apr 9, 2021 at 7:06 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
> >>
> >>>
> >>>> That's a good idea. We do need to thank Github to give free resources
> to
> >>>> ASF projects, but it's better if we can make it a business: we allow
> >>>> individual projects to sign deals with Github to get dedicated
> >>>> resources.
> >>>> It's a bit wasteful to ask every project to set up its own dev ops,
> >>>> using Github Action is more convenient. Maybe we should raise it to
> >>>> Github?
> >>>>
> >>>
> >>> I do not think you can get per-project resources in GH - the most you
> >>> can do are self-hosted runners for your project.
> >>>
> >>> (BTW I am not from the INFRA team - just a humble "CI person" of Apache
> >>> Airflow but very much vested into Github Actions)
> >>> maybe the infra team can chime in here. We did raise it to GitHub, we
> >>> even had meeting with them
> >>> organized by Gavin and several topics were raised that could be
> >>> eventually addressed by Github:
> >>>
> >>> - observability (they could not give us per-project usage dashboard -
> we
> >>> built our own imperfect (with API limitations) one by Tobiasz from
> Airflow)
> >>> - security (limiting access to only project committers) - this we
> >>> handled by the Ash's fork of Runner (but it's also imperfect - even
> today I
> >>> had to fix a problem where we had list of committers desynchronised
> between
> >>> our infra/CI.yml)
> >>> - manageability (assigning resources per-project) - this works by
> having
> >>> self-hosted runners assigned per project (we needed infra JIRA ticket
> and
> >>> generation of a bunch of tokens for our runners and our own AWS account
> >>> with auto-scaling).
> >>>
> >>> It would be indeed great if it could be available from GitHub, but so
> >>> far we do not have any of those.
> >>>
> >>> J.
> >>>
> >>>
> >>>
> >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
> >>>> wrote:
> >>>>
> >>>> > Thanks Martin for your feedback.
> >>>> >
> >>>> > > What was your reason to migrate from Apache Jenkins to Github
> >>>> Actions ?
> >>>> >
> >>>> > I am sure there were more reasons for migrating from AMPLab Jenkins
> >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as
> >>>> far as
> >>>> > I can remember:
> >>>> > - To reduce the maintenance cost of machines
> >>>> > - The Jenkins machines became unstable and slow causing CI jobs to
> >>>> fail or
> >>>> > be very flaky.
> >>>> > - Difficulty to manage the installed libraries.
> >>>> > - Intermittent unknown issues in the machines
> >>>> >
> >>>> > Yes, one option might be to consider other options to migrate again.
> >>>> > However, other projects will very likely suffer the
> >>>> > same problem. In addition, the migration in a large project is not
> an
> >>>> > easy work to do
> >>>> >
> >>>> > I would like to know the feasibility of having more resources in
> >>>> GitHub
> >>>> > Actions, or, for example, having sub-groups where
> >>>> > each group shares the resources - currently one GitHub organisation
> >>>> shares
> >>>> > all resources across the projects.
> >>>> >
> >>>> >
> >>>> > On Wed, Apr 7, 2021 at 10:04 PM, Martin Grigorov <mg...@apache.org> wrote:
> >>>> >
> >>>> >>
> >>>> >>
> >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
> >>>> wrote:
> >>>> >>
> >>>> >>> Hi Greg,
> >>>> >>>
> >>>> >>> I raised this thread to figure out a way that we can work together
> >>>> to
> >>>> >>> resolve this issue, gather feedback, and to understand how other
> >>>> projects
> >>>> >>> work around.
> >>>> >>> Several projects I observed, as far as I can tell, have made
> enough
> >>>> >>> efforts
> >>>> >>> to save the resources in GitHub Actions but still suffer from the
> >>>> lack of
> >>>> >>> resources.
> >>>> >>>
> >>>> >>
> >>>> >> And it will get even worse because:
> >>>> >> 1) more and more Apache projects migrate from TravisCI to Github
> >>>> Actions
> >>>> >> (GA)
> >>>> >> 2) new projects join ASF and many of them already use GA
> >>>> >>
> >>>> >>
> >>>> >> What was your reason to migrate from Apache Jenkins to Github
> >>>> Actions ?
> >>>> >> If you want dedicated resources then you will need to manage the CI
> >>>> >> yourself.
> >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for
> your
> >>>> >> project.
> >>>> >> Or you could set up your own CI infrastructure with Jenkins,
> DroneIO,
> >>>> >> Concourse CI, ...
> >>>> >>
> >>>> >> Yet another option is to move to CircleCI or Cirrus. They are
> >>>> similar to
> >>>> >> TravisCI / GA and less crowded (for now).
> >>>> >>
> >>>> >> Martin
> >>>> >>
> >>>> >> I appreciate the resources provided to us but that does not resolve
> >>>> the
> >>>> >>> issue of the development being slowed down.
> >>>> >>>
> >>>> >>>
> >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM, Greg Stein <gs...@gmail.com> wrote:
> >>>> >>>
> >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <
> gurwls223@gmail.com
> >>>> >
> >>>> >>> wrote:
> >>>> >>> >
> >>>> >>> >> Hi all,
> >>>> >>> >>
> >>>> >>> >> I am an Apache Spark PMC,
> >>>> >>> >
> >>>> >>> >
> >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
> >>>> Please
> >>>> >>> stop
> >>>> >>> > with that terminology. The Foundation has about 200 PMCs, and
> you
> >>>> are a
> >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A
> >>>> PMC
> >>>> >>> is a
> >>>> >>> > construct of the Foundation.
> >>>> >>> >
> >>>> >>> > >...
> >>>> >>> >
> >>>> >>> >> I am aware of the limited GitHub Actions resources that are
> >>>> shared
> >>>> >>> >> across all projects in ASF,
> >>>> >>> >> and many projects suffer from it. This issue significantly
> slows
> >>>> down
> >>>> >>> the
> >>>> >>> >> development cycle of
> >>>> >>> >>  other projects, at least Apache Spark.
> >>>> >>> >>
> >>>> >>> >
> >>>> >>> > And the Foundation gets those build minutes for GitHub Actions
> >>>> >>> provided to
> >>>> >>> > us from GitHub and Microsoft, and we are thankful that they
> >>>> provide
> >>>> >>> them to
> >>>> >>> > the Foundation. Maybe it isn't all the build minutes that every
> >>>> group
> >>>> >>> > wants, but that is what we have. So it is incumbent upon all of
> >>>> us to
> >>>> >>> > figure out how to build more, with fewer minutes.
> >>>> >>> >
> >>>> >>> > Say "thank you" to GitHub, please.
> >>>> >>> >
> >>>> >>> > Regards,
> >>>> >>> > -g
> >>>> >>> >
> >>>> >>> >
> >>>> >>>
> >>>> >>
> >>>>
> >>>
> >>>
> >>> --
> >>> +48 660 796 129
> >>>
> >>
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Martin Grigorov <mg...@apache.org>.
Hi Hyukjin,

On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> Is this the right place to expect feedback from the infra team or related
> people?
> It would be great to hear what the infra team thinks about Jarek's
> suggestion.
>

Which suggestion exactly do you mean?
I've just re-read Jarek's email and I see three tasks for the GitHub Actions
team, but nothing specific for the Apache Infra team.


>
>
> On Tue, Apr 13, 2021 at 11:15 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Hi all,
>>
>> Could we have any update and feedback from the INFRA team about Jarek's
>> suggestion please?
>>
>> On Fri, Apr 9, 2021 at 7:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>>>
>>>> That's a good idea. We do need to thank Github to give free resources to
>>>> ASF projects, but it's better if we can make it a business: we allow
>>>> individual projects to sign deals with Github to get dedicated
>>>> resources.
>>>> It's a bit wasteful to ask every project to set up its own dev ops,
>>>> using Github Action is more convenient. Maybe we should raise it to
>>>> Github?
>>>>
>>>
>>> I do not think you can get per-project resources in GH - the most you
>>> can do are self-hosted runners for your project.
>>>
>>> (BTW I am not from the INFRA team - just a humble "CI person" of Apache
>>> Airflow but very much vested into Github Actions)
>>> maybe the infra team can chime in here. We did raise it to GitHub, we
>>> even had meeting with them
>>> organized by Gavin and several topics were raised that could be
>>> eventually addressed by Github:
>>>
>>> - observability (they could not give us per-project usage dashboard - we
>>> built our own imperfect (with API limitations) one by Tobiasz from Airflow)
>>> - security (limiting access to only project committers) - this we
>>> handled by the Ash's fork of Runner (but it's also imperfect - even today I
>>> had to fix a problem where we had list of committers desynchronised between
>>> our infra/CI.yml)
>>> - manageability (assigning resources per-project) - this works by having
>>> self-hosted runners assigned per project (we needed infra JIRA ticket and
>>> generation of a bunch of tokens for our runners and our own AWS account
>>> with auto-scaling).
>>>
>>> It would be indeed great if it could be available from GitHub, but so
>>> far we do not have any of those.
>>>
>>> J.
>>>
>>>
>>>
>>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>>
>>>> > Thanks Martin for your feedback.
>>>> >
>>>> > > What was your reason to migrate from Apache Jenkins to Github
>>>> Actions ?
>>>> >
>>>> > I am sure there were more reasons for migrating from AMPLab Jenkins
>>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as
>>>> far as
>>>> > I can remember:
>>>> > - To reduce the maintenance cost of machines
>>>> > - The Jenkins machines became unstable and slow causing CI jobs to
>>>> fail or
>>>> > be very flaky.
>>>> > - Difficulty to manage the installed libraries.
>>>> > - Intermittent unknown issues in the machines
>>>> >
>>>> > Yes, one option might be to consider other options to migrate again.
>>>> > However, other projects will very likely suffer the
>>>> > same problem. In addition, the migration in a large project is not an
>>>> > easy work to do
>>>> >
>>>> > I would like to know the feasibility of having more resources in
>>>> GitHub
>>>> > Actions, or, for example, having sub-groups where
>>>> > each group shares the resources - currently one GitHub organisation
>>>> shares
>>>> > all resources across the projects.
>>>> >
>>>> >
>>>> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org>
>>>> wrote:
>>>> >
>>>> >>
>>>> >>
>>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >>> Hi Greg,
>>>> >>>
>>>> >>> I raised this thread to figure out a way that we can work together
>>>> to
>>>> >>> resolve this issue, gather feedback, and to understand how other
>>>> projects
>>>> >>> work around.
>>>> >>> Several projects I observed, as far as I can tell, have made enough
>>>> >>> efforts
>>>> >>> to save the resources in GitHub Actions but still suffer from the
>>>> lack of
>>>> >>> resources.
>>>> >>>
>>>> >>
>>>> >> And it will get even worse because:
>>>> >> 1) more and more Apache projects migrate from TravisCI to Github
>>>> Actions
>>>> >> (GA)
>>>> >> 2) new projects join ASF and many of them already use GA
>>>> >>
>>>> >>
>>>> >> What was your reason to migrate from Apache Jenkins to Github
>>>> Actions ?
>>>> >> If you want dedicated resources then you will need to manage the CI
>>>> >> yourself.
>>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for your
>>>> >> project.
>>>> >> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
>>>> >> ConcourseCI, ...
>>>> >>
>>>> >> Yet another option is to move to CircleCI or Cirrus. They are
>>>> similar to
>>>> >> TravisCI / GA and less crowded (for now).
>>>> >>
>>>> >> Martin
>>>> >>
>>>> >> I appreciate the resources provided to us but that does not resolve
>>>> the
>>>> >>> issue of the development being slowed down.
>>>> >>>
>>>> >>>
>>>> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>>>> >>>
>>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gurwls223@gmail.com
>>>> >
>>>> >>> wrote:
>>>> >>> >
>>>> >>> >> Hi all,
>>>> >>> >>
>>>> >>> >> I am an Apache Spark PMC,
>>>> >>> >
>>>> >>> >
>>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
>>>> Please
>>>> >>> stop
>>>> >>> > with that terminology. The Foundation has about 200 PMCs, and you
>>>> are a
>>>> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A
>>>> PMC
>>>> >>> is a
>>>> >>> > construct of the Foundation.
>>>> >>> >
>>>> >>> > >...
>>>> >>> >
>>>> >>> >> I am aware of the limited GitHub Actions resources that are
>>>> shared
>>>> >>> >> across all projects in ASF,
>>>> >>> >> and many projects suffer from it. This issue significantly slows
>>>> down
>>>> >>> the
>>>> >>> >> development cycle of
>>>> >>> >>  other projects, at least Apache Spark.
>>>> >>> >>
>>>> >>> >
>>>> >>> > And the Foundation gets those build minutes for GitHub Actions
>>>> >>> provided to
>>>> >>> > us from GitHub and Microsoft, and we are thankful that they
>>>> provide
>>>> >>> them to
>>>> >>> > the Foundation. Maybe it isn't all the build minutes that every
>>>> group
>>>> >>> > wants, but that is what we have. So it is incumbent upon all of
>>>> us to
>>>> >>> > figure out how to build more, with fewer minutes.
>>>> >>> >
>>>> >>> > Say "thank you" to GitHub, please.
>>>> >>> >
>>>> >>> > Regards,
>>>> >>> > -g
>>>> >>> >
>>>> >>> >
>>>> >>>
>>>> >>
>>>>
>>>
>>>
>>> --
>>> +48 660 796 129
>>>
>>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi all,

Is this the right place to ask for feedback from the infra team or related
people?
It would be great to hear what the infra team thinks about Jarek's
suggestion.


On Tue, Apr 13, 2021 at 11:15 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> Could we have any update and feedback from the INFRA team about Jarek's
> suggestion please?
>
> On Fri, Apr 9, 2021 at 7:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>>
>>> That's a good idea. We do need to thank Github to give free resources to
>>> ASF projects, but it's better if we can make it a business: we allow
>>> individual projects to sign deals with Github to get dedicated resources.
>>> It's a bit wasteful to ask every project to set up its own dev ops,
>>> using Github Action is more convenient. Maybe we should raise it to
>>> Github?
>>>
>>
>> I do not think you can get per-project resources in GH - the most you can
>> do are self-hosted runners for your project.
>>
>> (BTW I am not from the INFRA team - just a humble "CI person" of Apache
>> Airflow but very much vested into Github Actions)
>> maybe the infra team can chime in here. We did raise it to GitHub, we
>> even had meeting with them
>> organized by Gavin and several topics were raised that could be
>> eventually addressed by Github:
>>
>> - observability (they could not give us per-project usage dashboard - we
>> built our own imperfect (with API limitations) one by Tobiasz from Airflow)
>> - security (limiting access to only project committers) - this we handled
>> by the Ash's fork of Runner (but it's also imperfect - even today I had to
>> fix a problem where we had list of committers desynchronised between our
>> infra/CI.yml)
>> - manageability (assigning resources per-project) - this works by having
>> self-hosted runners assigned per project (we needed infra JIRA ticket and
>> generation of a bunch of tokens for our runners and our own AWS account
>> with auto-scaling).
>>
>> It would be indeed great if it could be available from GitHub, but so far
>> we do not have any of those.
>>
>> J.
>>
>>
>>
>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>>
>>> > Thanks Martin for your feedback.
>>> >
>>> > > What was your reason to migrate from Apache Jenkins to Github
>>> Actions ?
>>> >
>>> > I am sure there were more reasons for migrating from AMPLab Jenkins
>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as
>>> far as
>>> > I can remember:
>>> > - To reduce the maintenance cost of machines
>>> > - The Jenkins machines became unstable and slow causing CI jobs to
>>> fail or
>>> > be very flaky.
>>> > - Difficulty to manage the installed libraries.
>>> > - Intermittent unknown issues in the machines
>>> >
>>> > Yes, one option might be to consider other options to migrate again.
>>> > However, other projects will very likely suffer the
>>> > same problem. In addition, the migration in a large project is not an
>>> > easy work to do
>>> >
>>> > I would like to know the feasibility of having more resources in GitHub
>>> > Actions, or, for example, having sub-groups where
>>> > each group shares the resources - currently one GitHub organisation
>>> shares
>>> > all resources across the projects.
>>> >
>>> >
>>> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
>>> >
>>> >>
>>> >>
>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>> >>
>>> >>> Hi Greg,
>>> >>>
>>> >>> I raised this thread to figure out a way that we can work together to
>>> >>> resolve this issue, gather feedback, and to understand how other
>>> projects
>>> >>> work around.
>>> >>> Several projects I observed, as far as I can tell, have made enough
>>> >>> efforts
>>> >>> to save the resources in GitHub Actions but still suffer from the
>>> lack of
>>> >>> resources.
>>> >>>
>>> >>
>>> >> And it will get even worse because:
>>> >> 1) more and more Apache projects migrate from TravisCI to Github
>>> Actions
>>> >> (GA)
>>> >> 2) new projects join ASF and many of them already use GA
>>> >>
>>> >>
>>> >> What was your reason to migrate from Apache Jenkins to Github Actions
>>> ?
>>> >> If you want dedicated resources then you will need to manage the CI
>>> >> yourself.
>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for your
>>> >> project.
>>> >> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
>>> >> ConcourseCI, ...
>>> >>
>>> >> Yet another option is to move to CircleCI or Cirrus. They are similar
>>> to
>>> >> TravisCI / GA and less crowded (for now).
>>> >>
>>> >> Martin
>>> >>
>>> >> I appreciate the resources provided to us but that does not resolve
>>> the
>>> >>> issue of the development being slowed down.
>>> >>>
>>> >>>
>>> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>>> >>>
>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
>>> >>> wrote:
>>> >>> >
>>> >>> >> Hi all,
>>> >>> >>
>>> >>> >> I am an Apache Spark PMC,
>>> >>> >
>>> >>> >
>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
>>> Please
>>> >>> stop
>>> >>> > with that terminology. The Foundation has about 200 PMCs, and you
>>> are a
>>> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A
>>> PMC
>>> >>> is a
>>> >>> > construct of the Foundation.
>>> >>> >
>>> >>> > >...
>>> >>> >
>>> >>> >> I am aware of the limited GitHub Actions resources that are shared
>>> >>> >> across all projects in ASF,
>>> >>> >> and many projects suffer from it. This issue significantly slows
>>> down
>>> >>> the
>>> >>> >> development cycle of
>>> >>> >>  other projects, at least Apache Spark.
>>> >>> >>
>>> >>> >
>>> >>> > And the Foundation gets those build minutes for GitHub Actions
>>> >>> provided to
>>> >>> > us from GitHub and Microsoft, and we are thankful that they provide
>>> >>> them to
>>> >>> > the Foundation. Maybe it isn't all the build minutes that every
>>> group
>>> >>> > wants, but that is what we have. So it is incumbent upon all of us
>>> to
>>> >>> > figure out how to build more, with fewer minutes.
>>> >>> >
>>> >>> > Say "thank you" to GitHub, please.
>>> >>> >
>>> >>> > Regards,
>>> >>> > -g
>>> >>> >
>>> >>> >
>>> >>>
>>> >>
>>>
>>
>>
>> --
>> +48 660 796 129
>>
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi all,

Could we have any update and feedback from the INFRA team about Jarek's
suggestion please?

On Fri, Apr 9, 2021 at 7:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:

>
>> That's a good idea. We do need to thank Github to give free resources to
>> ASF projects, but it's better if we can make it a business: we allow
>> individual projects to sign deals with Github to get dedicated resources.
>> It's a bit wasteful to ask every project to set up its own dev ops,
>> using Github Action is more convenient. Maybe we should raise it to
>> Github?
>>
>
> I do not think you can get per-project resources in GH - the most you can
> do are self-hosted runners for your project.
>
> (BTW I am not from the INFRA team - just a humble "CI person" of Apache
> Airflow but very much vested into Github Actions)
> maybe the infra team can chime in here. We did raise it to GitHub, we even
> had meeting with them
> organized by Gavin and several topics were raised that could be eventually
> addressed by Github:
>
> - observability (they could not give us per-project usage dashboard - we
> built our own imperfect (with API limitations) one by Tobiasz from Airflow)
> - security (limiting access to only project committers) - this we handled
> by the Ash's fork of Runner (but it's also imperfect - even today I had to
> fix a problem where we had list of committers desynchronised between our
> infra/CI.yml)
> - manageability (assigning resources per-project) - this works by having
> self-hosted runners assigned per project (we needed infra JIRA ticket and
> generation of a bunch of tokens for our runners and our own AWS account
> with auto-scaling).
>
> It would be indeed great if it could be available from GitHub, but so far
> we do not have any of those.
>
> J.
>
>
>
>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>> > Thanks Martin for your feedback.
>> >
>> > > What was your reason to migrate from Apache Jenkins to Github Actions
>> ?
>> >
>> > I am sure there were more reasons for migrating from AMPLab Jenkins
>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far
>> as
>> > I can remember:
>> > - To reduce the maintenance cost of machines
>> > - The Jenkins machines became unstable and slow causing CI jobs to fail
>> or
>> > be very flaky.
>> > - Difficulty to manage the installed libraries.
>> > - Intermittent unknown issues in the machines
>> >
>> > Yes, one option might be to consider other options to migrate again.
>> > However, other projects will very likely suffer the
>> > same problem. In addition, the migration in a large project is not an
>> > easy work to do
>> >
>> > I would like to know the feasibility of having more resources in GitHub
>> > Actions, or, for example, having sub-groups where
>> > each group shares the resources - currently one GitHub organisation
>> shares
>> > all resources across the projects.
>> >
>> >
>> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
>> >
>> >>
>> >>
>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>> >>
>> >>> Hi Greg,
>> >>>
>> >>> I raised this thread to figure out a way that we can work together to
>> >>> resolve this issue, gather feedback, and to understand how other
>> projects
>> >>> work around.
>> >>> Several projects I observed, as far as I can tell, have made enough
>> >>> efforts
>> >>> to save the resources in GitHub Actions but still suffer from the
>> lack of
>> >>> resources.
>> >>>
>> >>
>> >> And it will get even worse because:
>> >> 1) more and more Apache projects migrate from TravisCI to Github
>> Actions
>> >> (GA)
>> >> 2) new projects join ASF and many of them already use GA
>> >>
>> >>
>> >> What was your reason to migrate from Apache Jenkins to Github Actions ?
>> >> If you want dedicated resources then you will need to manage the CI
>> >> yourself.
>> >> You could use Apache Jenkins/Buildbot with dedicated agents for your
>> >> project.
>> >> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
>> >> ConcourseCI, ...
>> >>
>> >> Yet another option is to move to CircleCI or Cirrus. They are similar
>> to
>> >> TravisCI / GA and less crowded (for now).
>> >>
>> >> Martin
>> >>
>> >> I appreciate the resources provided to us but that does not resolve the
>> >>> issue of the development being slowed down.
>> >>>
>> >>>
>> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>> >>>
>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
>> >>> wrote:
>> >>> >
>> >>> >> Hi all,
>> >>> >>
>> >>> >> I am an Apache Spark PMC,
>> >>> >
>> >>> >
>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC.
>> Please
>> >>> stop
>> >>> > with that terminology. The Foundation has about 200 PMCs, and you
>> are a
>> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
>> >>> is a
>> >>> > construct of the Foundation.
>> >>> >
>> >>> > >...
>> >>> >
>> >>> >> I am aware of the limited GitHub Actions resources that are shared
>> >>> >> across all projects in ASF,
>> >>> >> and many projects suffer from it. This issue significantly slows
>> down
>> >>> the
>> >>> >> development cycle of
>> >>> >>  other projects, at least Apache Spark.
>> >>> >>
>> >>> >
>> >>> > And the Foundation gets those build minutes for GitHub Actions
>> >>> provided to
>> >>> > us from GitHub and Microsoft, and we are thankful that they provide
>> >>> them to
>> >>> > the Foundation. Maybe it isn't all the build minutes that every
>> group
>> >>> > wants, but that is what we have. So it is incumbent upon all of us
>> to
>> >>> > figure out how to build more, with fewer minutes.
>> >>> >
>> >>> > Say "thank you" to GitHub, please.
>> >>> >
>> >>> > Regards,
>> >>> > -g
>> >>> >
>> >>> >
>> >>>
>> >>
>>
>
>
> --
> +48 660 796 129
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
>
>
> That's a good idea. We do need to thank Github to give free resources to
> ASF projects, but it's better if we can make it a business: we allow
> individual projects to sign deals with Github to get dedicated resources.
> It's a bit wasteful to ask every project to set up its own dev ops,
> using Github Action is more convenient. Maybe we should raise it to Github?
>

I do not think you can get per-project resources in GitHub - the most you
can do is set up self-hosted runners for your project.

(BTW, I am not from the INFRA team - just a humble "CI person" for Apache
Airflow, but very much invested in GitHub Actions.)
Maybe the infra team can chime in here. We did raise it with GitHub; we even
had a meeting with them, organized by Gavin, where several topics were raised
that GitHub could eventually address:

- observability: they could not give us a per-project usage dashboard, so we
built our own imperfect one (it has API limitations), written by Tobiasz
from Airflow
- security (limiting access to project committers only): we handled this with
Ash's fork of the Runner, but it is also imperfect - even today I had to
fix a problem where the list of committers had become desynchronised between
our infra/CI.yml
- manageability (assigning resources per project): this works by having
self-hosted runners assigned per project (we needed an INFRA JIRA ticket, a
bunch of generated tokens for our runners, and our own AWS account with
auto-scaling).
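
As an illustration of the observability point only, here is a minimal sketch
of how per-project usage might be tallied from workflow-run records. It is not
the dashboard Airflow built. It assumes each record is shaped like an item of
the `workflow_runs` array in the GitHub REST API (`repository.full_name`,
`run_started_at`, `updated_at`), and it treats the span between the two
timestamps as a rough proxy for build time, which includes queueing and
ignores jobs that run in parallel. The function name `minutes_per_project` and
the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def _parse(ts: str) -> datetime:
    # Assumed timestamp format, e.g. "2021-04-07T05:23:59Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def minutes_per_project(runs):
    """Sum approximate wall-clock minutes per repository.

    `runs` is an iterable of dicts with the keys `repository.full_name`,
    `run_started_at` and `updated_at` (an assumed record shape).
    """
    totals = defaultdict(float)
    for run in runs:
        started = _parse(run["run_started_at"])
        finished = _parse(run["updated_at"])
        # Guard against records whose timestamps are out of order.
        seconds = max((finished - started).total_seconds(), 0.0)
        totals[run["repository"]["full_name"]] += seconds / 60.0
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical sample records, not real usage data.
    sample = [
        {"repository": {"full_name": "apache/spark"},
         "run_started_at": "2021-04-07T05:00:00Z",
         "updated_at": "2021-04-07T05:30:00Z"},
        {"repository": {"full_name": "apache/airflow"},
         "run_started_at": "2021-04-07T05:00:00Z",
         "updated_at": "2021-04-07T05:20:00Z"},
    ]
    usage = minutes_per_project(sample)
    # Print the heaviest users first.
    for repo, minutes in sorted(usage.items(), key=lambda kv: -kv[1]):
        print(f"{repo}: {minutes:.1f} min")
```

A real dashboard would still have to page through the API for every
repository in the organisation, which is presumably where the API limitations
mentioned above come in.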

It would indeed be great if all of this were available from GitHub, but so
far we do not have any of it.

J.



> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> > Thanks Martin for your feedback.
> >
> > > What was your reason to migrate from Apache Jenkins to Github Actions ?
> >
> > I am sure there were more reasons for migrating from AMPLab Jenkins
> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far
> as
> > I can remember:
> > - To reduce the maintenance cost of machines
> > - The Jenkins machines became unstable and slow causing CI jobs to fail
> or
> > be very flaky.
> > - Difficulty to manage the installed libraries.
> > - Intermittent unknown issues in the machines
> >
> > Yes, one option might be to consider other options to migrate again.
> > However, other projects will very likely suffer the
> > same problem. In addition, the migration in a large project is not an
> > easy work to do
> >
> > I would like to know the feasibility of having more resources in GitHub
> > Actions, or, for example, having sub-groups where
> > each group shares the resources - currently one GitHub organisation
> shares
> > all resources across the projects.
> >
> >
> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
> >
> >>
> >>
> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> >>
> >>> Hi Greg,
> >>>
> >>> I raised this thread to figure out a way that we can work together to
> >>> resolve this issue, gather feedback, and to understand how other
> projects
> >>> work around.
> >>> Several projects I observed, as far as I can tell, have made enough
> >>> efforts
> >>> to save the resources in GitHub Actions but still suffer from the lack
> of
> >>> resources.
> >>>
> >>
> >> And it will get even worse because:
> >> 1) more and more Apache projects migrate from TravisCI to Github Actions
> >> (GA)
> >> 2) new projects join ASF and many of them already use GA
> >>
> >>
> >> What was your reason to migrate from Apache Jenkins to Github Actions ?
> >> If you want dedicated resources then you will need to manage the CI
> >> yourself.
> >> You could use Apache Jenkins/Buildbot with dedicated agents for your
> >> project.
> >> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
> >> ConcourseCI, ...
> >>
> >> Yet another option is to move to CircleCI or Cirrus. They are similar to
> >> TravisCI / GA and less crowded (for now).
> >>
> >> Martin
> >>
> >> I appreciate the resources provided to us but that does not resolve the
> >>> issue of the development being slowed down.
> >>>
> >>>
> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
> >>>
> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
> >>> wrote:
> >>> >
> >>> >> Hi all,
> >>> >>
> >>> >> I am an Apache Spark PMC,
> >>> >
> >>> >
> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
> >>> stop
> >>> > with that terminology. The Foundation has about 200 PMCs, and you
> are a
> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
> >>> is a
> >>> > construct of the Foundation.
> >>> >
> >>> > >...
> >>> >
> >>> >> I am aware of the limited GitHub Actions resources that are shared
> >>> >> across all projects in ASF,
> >>> >> and many projects suffer from it. This issue significantly slows
> down
> >>> the
> >>> >> development cycle of
> >>> >>  other projects, at least Apache Spark.
> >>> >>
> >>> >
> >>> > And the Foundation gets those build minutes for GitHub Actions
> >>> provided to
> >>> > us from GitHub and Microsoft, and we are thankful that they provide
> >>> them to
> >>> > the Foundation. Maybe it isn't all the build minutes that every group
> >>> > wants, but that is what we have. So it is incumbent upon all of us to
> >>> > figure out how to build more, with fewer minutes.
> >>> >
> >>> > Say "thank you" to GitHub, please.
> >>> >
> >>> > Regards,
> >>> > -g
> >>> >
> >>> >
> >>>
> >>
>


-- 
+48 660 796 129

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
- builds

FYI, the cc to Spark dev was dropped during the discussion. If you are not
subscribed to builds@apache.org, you have seen only part of the discussion.
Please subscribe to the builds@apache.org mailing list to participate in the
discussion further.


On Thu, Apr 8, 2021 at 1:50 PM Wenchen Fan <cl...@gmail.com> wrote:

> > for example, having sub-groups where each group shares the resources -
> currently one GitHub organisation shares all resources across the projects.
>
> That's a good idea. We do need to thank Github to give free resources to
> ASF projects, but it's better if we can make it a business: we allow
> individual projects to sign deals with Github to get dedicated resources.
> It's a bit wasteful to ask every project to set up its own dev ops,
> using Github Action is more convenient. Maybe we should raise it to Github?
>
> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Thanks Martin for your feedback.
>>
>> > What was your reason to migrate from Apache Jenkins to Github Actions ?
>>
>> I am sure there were more reasons for migrating from AMPLab Jenkins
>> <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far
>> as I can remember:
>> - To reduce the maintenance cost of machines
>> - The Jenkins machines became unstable and slow causing CI jobs to fail
>> or be very flaky.
>> - Difficulty to manage the installed libraries.
>> - Intermittent unknown issues in the machines
>>
>> Yes, one option might be to consider other options to migrate again.
>> However, other projects will very likely suffer the
>> same problem. In addition, the migration in a large project is not an
>> easy work to do
>>
>> I would like to know the feasibility of having more resources in GitHub
>> Actions, or, for example, having sub-groups where
>> each group shares the resources - currently one GitHub organisation
>> shares all resources across the projects.
>>
>>
>> On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
>>
>>>
>>>
>>> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>>
>>>> Hi Greg,
>>>>
>>>> I raised this thread to figure out a way that we can work together to
>>>> resolve this issue, gather feedback, and to understand how other
>>>> projects
>>>> work around.
>>>> Several projects I observed, as far as I can tell, have made enough
>>>> efforts
>>>> to save the resources in GitHub Actions but still suffer from the lack
>>>> of
>>>> resources.
>>>>
>>>
>>> And it will get even worse because:
>>> 1) more and more Apache projects migrate from TravisCI to Github Actions
>>> (GA)
>>> 2) new projects join ASF and many of them already use GA
>>>
>>>
>>> What was your reason to migrate from Apache Jenkins to Github Actions ?
>>> If you want dedicated resources then you will need to manage the CI
>>> yourself.
>>> You could use Apache Jenkins/Buildbot with dedicated agents for your
>>> project.
>>> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
>>> ConcourseCI, ...
>>>
>>> Yet another option is to move to CircleCI or Cirrus. They are similar to
>>> TravisCI / GA and less crowded (for now).
>>>
>>> Martin
>>>
>>> I appreciate the resources provided to us but that does not resolve the
>>>> issue of the development being slowed down.
>>>>
>>>>
>>>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>>>>
>>>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>> >
>>>> >> Hi all,
>>>> >>
>>>> >> I am an Apache Spark PMC,
>>>> >
>>>> >
>>>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
>>>> stop
>>>> > with that terminology. The Foundation has about 200 PMCs, and you are
>>>> a
>>>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
>>>> is a
>>>> > construct of the Foundation.
>>>> >
>>>> > >...
>>>> >
>>>> >> I am aware of the limited GitHub Actions resources that are shared
>>>> >> across all projects in ASF,
>>>> >> and many projects suffer from it. This issue significantly slows
>>>> down the
>>>> >> development cycle of
>>>> >>  other projects, at least Apache Spark.
>>>> >>
>>>> >
>>>> > And the Foundation gets those build minutes for GitHub Actions
>>>> provided to
>>>> > us from GitHub and Microsoft, and we are thankful that they provide
>>>> them to
>>>> > the Foundation. Maybe it isn't all the build minutes that every group
>>>> > wants, but that is what we have. So it is incumbent upon all of us to
>>>> > figure out how to build more, with fewer minutes.
>>>> >
>>>> > Say "thank you" to GitHub, please.
>>>> >
>>>> > Regards,
>>>> > -g
>>>> >
>>>> >
>>>>
>>>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
>
>
> That's a good idea. We do need to thank Github to give free resources to
> ASF projects, but it's better if we can make it a business: we allow
> individual projects to sign deals with Github to get dedicated resources.
> It's a bit wasteful to ask every project to set up its own dev ops,
> using Github Action is more convenient. Maybe we should raise it to Github?
>

I do not think you can get per-project resources in GitHub - the most you
can do is set up self-hosted runners for your project.

(BTW, I am not from the INFRA team - just a humble "CI person" for Apache
Airflow, but very much invested in GitHub Actions.)
Maybe the infra team can chime in here. We did raise it with GitHub; we even
had a meeting with them, organized by Gavin, where several topics were raised
that GitHub could eventually address:

- observability: they could not give us a per-project usage dashboard, so we
built our own imperfect one (it has API limitations), written by Tobiasz
from Airflow
- security (limiting access to project committers only): we handled this with
Ash's fork of the Runner, but it is also imperfect - even today I had to
fix a problem where the list of committers had become desynchronised between
our infra/CI.yml
- manageability (assigning resources per project): this works by having
self-hosted runners assigned per project (we needed an INFRA JIRA ticket, a
bunch of generated tokens for our runners, and our own AWS account with
auto-scaling).

It would indeed be great if all of this were available from GitHub, but so
far we do not have any of it.

J.



> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> > Thanks Martin for your feedback.
> >
> > > What was your reason to migrate from Apache Jenkins to Github Actions ?
> >
> > I am sure there were more reasons for migrating from AMPLab Jenkins
> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far
> as
> > I can remember:
> > - To reduce the maintenance cost of machines
> > - The Jenkins machines became unstable and slow causing CI jobs to fail
> or
> > be very flaky.
> > - Difficulty to manage the installed libraries.
> > - Intermittent unknown issues in the machines
> >
> > Yes, one option might be to consider other options to migrate again.
> > However, other projects will very likely suffer the
> > same problem. In addition, the migration in a large project is not an
> > easy work to do
> >
> > I would like to know the feasibility of having more resources in GitHub
> > Actions, or, for example, having sub-groups where
> > each group shares the resources - currently one GitHub organisation
> shares
> > all resources across the projects.
> >
> >
> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
> >
> >>
> >>
> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> >>
> >>> Hi Greg,
> >>>
> >>> I raised this thread to figure out a way that we can work together to
> >>> resolve this issue, gather feedback, and to understand how other
> projects
> >>> work around.
> >>> Several projects I observed, as far as I can tell, have made enough
> >>> efforts
> >>> to save the resources in GitHub Actions but still suffer from the lack
> of
> >>> resources.
> >>>
> >>
> >> And it will get even worse because:
> >> 1) more and more Apache projects migrate from TravisCI to Github Actions
> >> (GA)
> >> 2) new projects join ASF and many of them already use GA
> >>
> >>
> >> What was your reason to migrate from Apache Jenkins to Github Actions ?
> >> If you want dedicated resources then you will need to manage the CI
> >> yourself.
> >> You could use Apache Jenkins/Buildbot with dedicated agents for your
> >> project.
> >> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
> >> ConcourceCI, ...
> >>
> >> Yet another option is to move to CircleCI or Cirrus. They are similar to
> >> TravisCI / GA and less crowded (for now).
> >>
> >> Martin
> >>
> >> I appreciate the resources provided to us but that does not resolve the
> >>> issue of the development being slowed down.
> >>>
> >>>
> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
> >>>
> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
> >>> wrote:
> >>> >
> >>> >> Hi all,
> >>> >>
> >>> >> I am an Apache Spark PMC,
> >>> >
> >>> >
> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
> >>> stop
> >>> > with that terminology. The Foundation has about 200 PMCs, and you
> are a
> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
> >>> is a
> >>> > construct of the Foundation.
> >>> >
> >>> > >...
> >>> >
> >>> >> I am aware of the limited GitHub Actions resources that are shared
> >>> >> across all projects in ASF,
> >>> >> and many projects suffer from it. This issue significantly slows
> down
> >>> the
> >>> >> development cycle of
> >>> >>  other projects, at least Apache Spark.
> >>> >>
> >>> >
> >>> > And the Foundation gets those build minutes for GitHub Actions
> >>> provided to
> >>> > us from GitHub and Microsoft, and we are thankful that they provide
> >>> them to
> >>> > the Foundation. Maybe it isn't all the build minutes that every group
> >>> > wants, but that is what we have. So it is incumbent upon all of us to
> >>> > figure out how to build more, with fewer minutes.
> >>> >
> >>> > Say "thank you" to GitHub, please.
> >>> >
> >>> > Regards,
> >>> > -g
> >>> >
> >>> >
> >>>
> >>
>


-- 
+48 660 796 129

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Wenchen Fan <cl...@gmail.com>.
> for example, having sub-groups where each group shares the resources -
currently one GitHub organisation shares all resources across the projects.

That's a good idea. We do need to thank GitHub for giving free resources to
ASF projects, but it would be better if we could make it a business: allow
individual projects to sign deals with GitHub to get dedicated resources.
It's a bit wasteful to ask every project to set up its own DevOps;
using GitHub Actions is more convenient. Maybe we should raise it with GitHub?

On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> Thanks Martin for your feedback.
>
> > What was your reason to migrate from Apache Jenkins to Github Actions ?
>
> I am sure there were more reasons for migrating from Amplap Jenkins
> <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far as
> I can remember:
> - To reduce the maintenance cost of machines
> - The Jenkins machines became unstable and slow causing CI jobs to fail or
> be very flaky.
> - Difficulty to manage the installed libraries.
> - Intermittent unknown issues in the machines
>
> Yes, one option might be to consider other options to migrate again.
> However, other projects will very likely suffer the
> same problem. In addition, the migration in a large project is not an
> easy work to do
>
> I would like to know the feasibility of having more resources in GitHub
> Actions, or, for example, having sub-groups where
> each group shares the resources - currently one GitHub organisation shares
> all resources across the projects.
>
>
> On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:
>
>>
>>
>> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> Hi Greg,
>>>
>>> I raised this thread to figure out a way that we can work together to
>>> resolve this issue, gather feedback, and to understand how other projects
>>> work around.
>>> Several projects I observed, as far as I can tell, have made enough
>>> efforts
>>> to save the resources in GitHub Actions but still suffer from the lack of
>>> resources.
>>>
>>
>> And it will get even worse because:
>> 1) more and more Apache projects migrate from TravisCI to Github Actions
>> (GA)
>> 2) new projects join ASF and many of them already use GA
>>
>>
>> What was your reason to migrate from Apache Jenkins to Github Actions ?
>> If you want dedicated resources then you will need to manage the CI
>> yourself.
>> You could use Apache Jenkins/Buildbot with dedicated agents for your
>> project.
>> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
>> ConcourceCI, ...
>>
>> Yet another option is to move to CircleCI or Cirrus. They are similar to
>> TravisCI / GA and less crowded (for now).
>>
>> Martin
>>
>> I appreciate the resources provided to us but that does not resolve the
>>> issue of the development being slowed down.
>>>
>>>
>>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>>>
>>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>> >
>>> >> Hi all,
>>> >>
>>> >> I am an Apache Spark PMC,
>>> >
>>> >
>>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
>>> stop
>>> > with that terminology. The Foundation has about 200 PMCs, and you are a
>>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC
>>> is a
>>> > construct of the Foundation.
>>> >
>>> > >...
>>> >
>>> >> I am aware of the limited GitHub Actions resources that are shared
>>> >> across all projects in ASF,
>>> >> and many projects suffer from it. This issue significantly slows down
>>> the
>>> >> development cycle of
>>> >>  other projects, at least Apache Spark.
>>> >>
>>> >
>>> > And the Foundation gets those build minutes for GitHub Actions
>>> provided to
>>> > us from GitHub and Microsoft, and we are thankful that they provide
>>> them to
>>> > the Foundation. Maybe it isn't all the build minutes that every group
>>> > wants, but that is what we have. So it is incumbent upon all of us to
>>> > figure out how to build more, with fewer minutes.
>>> >
>>> > Say "thank you" to GitHub, please.
>>> >
>>> > Regards,
>>> > -g
>>> >
>>> >
>>>
>>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks Martin for your feedback.

> What was your reason to migrate from Apache Jenkins to Github Actions ?

I am sure there were more reasons for migrating from AMPLab Jenkins
<https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions, but as far as I
can remember:
- To reduce the maintenance cost of the machines.
- The Jenkins machines became unstable and slow, causing CI jobs to fail or
be very flaky.
- Difficulty managing the installed libraries.
- Intermittent unknown issues on the machines.

Yes, one option might be to migrate again to something else.
However, other projects will very likely suffer from the
same problem. In addition, migrating a large project is not
easy work.

I would like to know the feasibility of having more resources in GitHub
Actions, or, for example, having sub-groups where
each group shares the resources - currently one GitHub organisation shares
all resources across the projects.


On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mg...@apache.org> wrote:

>
>
> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Hi Greg,
>>
>> I raised this thread to figure out a way that we can work together to
>> resolve this issue, gather feedback, and to understand how other projects
>> work around.
>> Several projects I observed, as far as I can tell, have made enough
>> efforts
>> to save the resources in GitHub Actions but still suffer from the lack of
>> resources.
>>
>
> And it will get even worse because:
> 1) more and more Apache projects migrate from TravisCI to Github Actions
> (GA)
> 2) new projects join ASF and many of them already use GA
>
>
> What was your reason to migrate from Apache Jenkins to Github Actions ?
> If you want dedicated resources then you will need to manage the CI
> yourself.
> You could use Apache Jenkins/Buildbot with dedicated agents for your
> project.
> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
> ConcourceCI, ...
>
> Yet another option is to move to CircleCI or Cirrus. They are similar to
> TravisCI / GA and less crowded (for now).
>
> Martin
>
> I appreciate the resources provided to us but that does not resolve the
>> issue of the development being slowed down.
>>
>>
>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>>
>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>> >
>> >> Hi all,
>> >>
>> >> I am an Apache Spark PMC,
>> >
>> >
>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
>> stop
>> > with that terminology. The Foundation has about 200 PMCs, and you are a
>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC is
>> a
>> > construct of the Foundation.
>> >
>> > >...
>> >
>> >> I am aware of the limited GitHub Actions resources that are shared
>> >> across all projects in ASF,
>> >> and many projects suffer from it. This issue significantly slows down
>> the
>> >> development cycle of
>> >>  other projects, at least Apache Spark.
>> >>
>> >
>> > And the Foundation gets those build minutes for GitHub Actions provided
>> to
>> > us from GitHub and Microsoft, and we are thankful that they provide
>> them to
>> > the Foundation. Maybe it isn't all the build minutes that every group
>> > wants, but that is what we have. So it is incumbent upon all of us to
>> > figure out how to build more, with fewer minutes.
>> >
>> > Say "thank you" to GitHub, please.
>> >
>> > Regards,
>> > -g
>> >
>> >
>>
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Martin Grigorov <mg...@apache.org>.
On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi Greg,
>
> I raised this thread to figure out a way that we can work together to
> resolve this issue, gather feedback, and to understand how other projects
> work around.
> Several projects I observed, as far as I can tell, have made enough efforts
> to save the resources in GitHub Actions but still suffer from the lack of
> resources.
>

And it will get even worse because:
1) more and more Apache projects migrate from TravisCI to GitHub Actions
(GA)
2) new projects join ASF and many of them already use GA


What was your reason to migrate from Apache Jenkins to GitHub Actions?
If you want dedicated resources then you will need to manage the CI
yourself.
You could use Apache Jenkins/Buildbot with dedicated agents for your
project.
Or you could set up your own CI infrastructure with Jenkins, DroneIO,
Concourse CI, ...

Yet another option is to move to CircleCI or Cirrus. They are similar to
TravisCI / GA and less crowded (for now).

Martin

> I appreciate the resources provided to us but that does not resolve the
> issue of the development being slowed down.
>
>
> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:
>
> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> >
> >> Hi all,
> >>
> >> I am an Apache Spark PMC,
> >
> >
> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please
> stop
> > with that terminology. The Foundation has about 200 PMCs, and you are a
> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC is a
> > construct of the Foundation.
> >
> > >...
> >
> >> I am aware of the limited GitHub Actions resources that are shared
> >> across all projects in ASF,
> >> and many projects suffer from it. This issue significantly slows down
> the
> >> development cycle of
> >>  other projects, at least Apache Spark.
> >>
> >
> > And the Foundation gets those build minutes for GitHub Actions provided
> to
> > us from GitHub and Microsoft, and we are thankful that they provide them
> to
> > the Foundation. Maybe it isn't all the build minutes that every group
> > wants, but that is what we have. So it is incumbent upon all of us to
> > figure out how to build more, with fewer minutes.
> >
> > Say "thank you" to GitHub, please.
> >
> > Regards,
> > -g
> >
> >
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi Greg,

I raised this thread to figure out a way that we can work together to
resolve this issue, gather feedback, and understand how other projects
work around it.
Several projects I observed, as far as I can tell, have made sufficient effort
to save resources in GitHub Actions but still suffer from the lack of
resources.
I appreciate the resources provided to us, but that does not resolve the
issue of development being slowed down.


On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gs...@gmail.com> wrote:

> On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am an Apache Spark PMC,
>
>
> You are a member of the Apache Spark PMC. You are *not* a PMC. Please stop
> with that terminology. The Foundation has about 200 PMCs, and you are a
> member of one of them. You are NOT a "PMC" .. you're a person. A PMC is a
> construct of the Foundation.
>
> >...
>
>> I am aware of the limited GitHub Actions resources that are shared
>> across all projects in ASF,
>> and many projects suffer from it. This issue significantly slows down the
>> development cycle of
>>  other projects, at least Apache Spark.
>>
>
> And the Foundation gets those build minutes for GitHub Actions provided to
> us from GitHub and Microsoft, and we are thankful that they provide them to
> the Foundation. Maybe it isn't all the build minutes that every group
> wants, but that is what we have. So it is incumbent upon all of us to
> figure out how to build more, with fewer minutes.
>
> Say "thank you" to GitHub, please.
>
> Regards,
> -g
>
>

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Antoine Pitrou <an...@python.org>.
I am a member of the Arrow PMC ( ;-) ) and we would gladly welcome a way
to contribute - e.g. financially - towards larger CI resources on GitHub
Actions (or another similar online service that can build PRs from forks).

While "build more with fewer minutes" is definitely desirable, the
breadth of our project and its quality requirements make it unlikely that
we'll achieve breakthrough improvements in that dimension (we have
implementations for around 10 different languages, several of which see
a high contribution rate).

Regards

Antoine.


On 07/04/2021 at 10:51, Greg Stein wrote:
> On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com> wrote:
> 
>> Hi all,
>>
>> I am an Apache Spark PMC,
> 
> 
> You are a member of the Apache Spark PMC. You are *not* a PMC. Please stop
> with that terminology. The Foundation has about 200 PMCs, and you are a
> member of one of them. You are NOT a "PMC" .. you're a person. A PMC is a
> construct of the Foundation.
> 
>> ...
> 
>> I am aware of the limited GitHub Actions resources that are shared
>> across all projects in ASF,
>> and many projects suffer from it. This issue significantly slows down the
>> development cycle of
>>   other projects, at least Apache Spark.
>>
> 
> And the Foundation gets those build minutes for GitHub Actions provided to
> us from GitHub and Microsoft, and we are thankful that they provide them to
> the Foundation. Maybe it isn't all the build minutes that every group
> wants, but that is what we have. So it is incumbent upon all of us to
> figure out how to build more, with fewer minutes.
> 
> Say "thank you" to GitHub, please.
> 
> Regards,
> -g
> 

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Greg Stein <gs...@gmail.com>.
On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> I am an Apache Spark PMC,


You are a member of the Apache Spark PMC. You are *not* a PMC. Please stop
with that terminology. The Foundation has about 200 PMCs, and you are a
member of one of them. You are NOT a "PMC" .. you're a person. A PMC is a
construct of the Foundation.

>...

> I am aware of the limited GitHub Actions resources that are shared
> across all projects in ASF,
> and many projects suffer from it. This issue significantly slows down the
> development cycle of
>  other projects, at least Apache Spark.
>

And the Foundation gets those build minutes for GitHub Actions provided to
us from GitHub and Microsoft, and we are thankful that they provide them to
the Foundation. Maybe it isn't all the build minutes that every group
wants, but that is what we have. So it is incumbent upon all of us to
figure out how to build more, with fewer minutes.

Say "thank you" to GitHub, please.

Regards,
-g

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

Posted by Jarek Potiuk <ja...@potiuk.com>.
Just a comment here, as I also commented in the ticket:

The document
https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status
gives a complete overview of where GitHub Actions stands for ASF
projects.

And we have some nice experiences in Apache Airflow with running our own
self-hosted runners that we will likely be able to share soon. More in this
comment:
https://issues.apache.org/jira/browse/INFRA-21646?focusedCommentId=17316108&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17316108



J.




On Wed, Apr 7, 2021 at 7:24 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> I am an Apache Spark PMC, and would like to know the future plan about
> GitHub Actions in ASF.
> Please also see the INFRA ticket I filed:
> https://issues.apache.org/jira/browse/INFRA-21646.
>
> I am aware of the limited GitHub Actions resources that are shared
> across all projects in ASF,
> and many projects suffer from it. This issue significantly slows down the
> development cycle of
>  other projects, at least Apache Spark.
>
> How do we plan to increase the resources in GitHub Actions, and what are
> the blockers? I would appreciate any input and thoughts on this.
>
> Thank you so much.
>
> CC'ing Spark @dev <de...@spark.apache.org> for more visibility. Please take
> it out if considered inappropriate.
>


-- 
+48 660 796 129
