Posted to dev@beam.apache.org by Ismaël Mejía <ie...@gmail.com> on 2021/02/08 11:21:52 UTC

Re: Builds Meeting this Thursday

Just for reference, and related to this thread: it seems we may also end
up having this queue issue (even if we don't fully move to GitHub
Actions).
"For Apache projects, starting December 2020 we are experiencing a
high strain of GitHub Actions jobs. All Apache projects are sharing
180 jobs and as more projects are using GitHub Actions the job queue
becomes a serious bottleneck."

An interesting document shared recently on builds@ goes deeper on how
the Airflow project is dealing with this:
https://docs.google.com/document/d/1ZZeZ4BYMNX7ycGRUKAXv0s6etz1g-90Onn5nRQQHOfE/edit#

On Mon, Jan 18, 2021 at 1:28 PM Elliotte Rusty Harold
<el...@ibiblio.org> wrote:
>
> On Mon, Jan 18, 2021 at 10:49 AM Ismaël Mejía <ie...@gmail.com> wrote:
> >
> > Thanks for sharing this, Pablo. This looks super interesting. We should
> > see if it could make sense to migrate our Jenkins infra to GitHub
> > Actions, given that it is free and quickly becoming the new 'standard'.
> > Good points: it is 'free' because we will bring our machines and Google
> > pays :) Bad points: we will become 100% GitHub dependent.
> >
>
> GitHub Actions have a really big advantage over Jenkins: they run on
> forks, not just branches. This is very useful to non-committer
> contributors.
>
> On the minus side it's not clear if one can see the logs from the
> integration tests, which is blocking some work in the
> maven-site-plugin:
>
> https://github.com/apache/maven-site-plugin/pull/34#issuecomment-762207488
>
> --
> Elliotte Rusty Harold
> elharo@ibiblio.org

Re: Builds Meeting this Thursday

Posted by Tyson Hamilton <ty...@google.com>.
I suspect that Beam would suffer from the same problems that Airflow did,
specifically: "1) we won't be stuck in a queue behind other ASF projects
waiting for our "slot"". This is similar to why Beam migrated to using a
dedicated pool of Jenkins agents.

I'm not following the performance of our GH Actions and whether they've
been blocking people's PRs. They haven't been a problem for me. I believe
the urgency is unknown (but I suspect not high), and the first step would be
to perform a cost/benefit analysis of: 1) migrating existing GH Actions
to our own workers (maybe shared VMs with Jenkins agents), 2) migrating all
of Jenkins to GH Actions, which would require having our own GH Actions
workers anyway.

On Wed, Feb 10, 2021 at 1:59 PM Ahmet Altay <al...@google.com> wrote:

> Nice, thank you for sharing, Aizhamal. The change looks relatively
> straightforward.
>
> What is the urgency of this for Beam? Is this already impacting Beam's GH
> Actions?

Re: Builds Meeting this Thursday

Posted by Ahmet Altay <al...@google.com>.
Nice, thank you for sharing, Aizhamal. The change looks relatively
straightforward.

What is the urgency of this for Beam? Is this already impacting Beam's GH
Actions?

On Tue, Feb 9, 2021 at 7:15 PM Aizhamal Nurmamat kyzy <ai...@apache.org>
wrote:

> Hi all,
> In case you may find this interesting / valuable: Airflow has configured
> their own machines for GitHub Actions.
>
> Here's the PR https://github.com/apache/airflow/pull/13730
>
> And here's the thread:
> https://lists.apache.org/thread.html/r2e398f86479e4cbfca13c22e4499fb0becdbba20dd9d6d47e1ed30bd%40%3Cdev.airflow.apache.org%3E

Re: Builds Meeting this Thursday

Posted by Aizhamal Nurmamat kyzy <ai...@apache.org>.
Hi all,
In case you may find this interesting / valuable: Airflow has configured
their own machines for GitHub Actions.

Here's the PR https://github.com/apache/airflow/pull/13730

And here's the thread:
https://lists.apache.org/thread.html/r2e398f86479e4cbfca13c22e4499fb0becdbba20dd9d6d47e1ed30bd%40%3Cdev.airflow.apache.org%3E


On Mon, Feb 8, 2021 at 2:56 PM Ahmet Altay <al...@google.com> wrote:

> Thank you for sharing this, Ismaël.
>
> This 180-job limit across all Apache projects sounds like a problem for
> Beam, because we are already running quite a few GH Actions. Following
> the Airflow suggestions, we can add VMs to apache-beam-testing projects to
> provide Beam-specific private runners and address the issue. GitHub's
> recommendation against using private VMs in public projects [1] is related
> to the risk of unauthorized PRs running unexpected workloads on these VMs.
> As far as I remember, we did not have this problem with our Jenkins
> machines, despite anyone being able to run code with their PRs. Airflow
> also suggests using preemptible machines. We can do the same; these
> machines are always recycled after 24 hours, which limits the risk.
>
> /cc @Tyson Hamilton <ty...@google.com> @David Lu <lu...@google.com> @Alan
> Myrvold <am...@google.com>
>
> [1]
> https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories

Re: Builds Meeting this Thursday

Posted by Ahmet Altay <al...@google.com>.
Thank you for sharing this, Ismaël.

This 180-job limit across all Apache projects sounds like a problem for
Beam, because we are already running quite a few GH Actions. Following
the Airflow suggestions, we can add VMs to apache-beam-testing projects to
provide Beam-specific private runners and address the issue. GitHub's
recommendation against using private VMs in public projects [1] is related
to the risk of unauthorized PRs running unexpected workloads on these VMs.
As far as I remember, we did not have this problem with our Jenkins
machines, despite anyone being able to run code with their PRs. Airflow
also suggests using preemptible machines. We can do the same; these
machines are always recycled after 24 hours, which limits the risk.

/cc @Tyson Hamilton <ty...@google.com> @David Lu <lu...@google.com> @Alan
Myrvold <am...@google.com>

[1]
https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories

On Mon, Feb 8, 2021 at 3:30 AM JB Onofré <jb...@nanthrax.net> wrote:

> Hi Ismaël.
>
> Thanks for sharing. I started to evaluate GitHub actions on some other
> Apache projects and the doc is interesting.
>
> Regards
> JB

Re: Builds Meeting this Thursday

Posted by JB Onofré <jb...@nanthrax.net>.
Hi Ismaël. 

Thanks for sharing. I started to evaluate GitHub actions on some other Apache projects and the doc is interesting. 

Regards 
JB

> On Feb 8, 2021, at 12:22, Ismaël Mejía <ie...@gmail.com> wrote:
> 
> Just for reference and related to this thread. It seems we may end up
> also having this queue issue (even if we don't fully move to Github
> actions).
> "For Apache projects, starting December 2020 we are experiencing a
> high strain of GitHub Actions jobs. All Apache projects are sharing
> 180 jobs and as more projects are using GitHub Actions the job queue
> becomes a serious bottleneck."
> 
> An interesting document shared recently on builds@ goes deeper on how
> the Airflow project is dealing with this:
> https://docs.google.com/document/d/1ZZeZ4BYMNX7ycGRUKAXv0s6etz1g-90Onn5nRQQHOfE/edit#