Posted to dev@airflow.apache.org by Jarek Potiuk <Ja...@polidea.com> on 2019/07/10 20:55:57 UTC

Re: Travis builds in a queue for hours

Hello Everyone,

I have some really good news. I just had a call with the Google OSS team
(Gris, Aizhamal) and they are willing to donate VMs on Google Cloud Platform
to run CI for Airflow. To simplify the setup (and make sure it is OK
according to Apache regulations), we think we should follow exactly the same
route as the Apache Beam project (Google donated 16x 16-CPU machines to them).
Apache Beam uses those machines as workers for Apache Jenkins
(https://builds.apache.org/). Apache Jenkins is one of the CI solutions
encouraged by Apache, and if we can have workers connected to the existing
Apache Jenkins master, the maintenance overhead will be pretty minimal. And
since we can follow the Apache Beam setup, I do not expect any legal problems.

I also work very closely with the team that uses Apache Beam Jenkins
heavily so I have access to all the necessary experts to help with the
setup (and I am happy to help with that).

I really hope everyone in the community will be happy to go in that
direction. Please let me know if you have any concerns!

We do not need as many machines as Beam for sure (Beam uses the machines to
process a lot of data for tests including some load testing) but we need to
estimate the number/types of machines that we are going to need.
Fokko, Ash, others - do you have some recent numbers for the current usage
or should I open an Infrastructure ticket for it?
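
For a rough sense of scale, the figure from the INFRA message quoted further
down this thread (2600 build hours per month) can be converted into an
equivalent number of machines. This is only a minimal sketch: the 30-day
month, the one-job-per-machine model, the utilization parameter, and the
function name are all assumptions made for illustration, not numbers or
tooling from INFRA.

```python
import math


def machines_needed(build_hours_per_month, days_per_month=30, utilization=1.0):
    """Return how many machines are needed to absorb the monthly build hours.

    Assumes each machine runs one job at a time and is busy for the given
    fraction (utilization) of every hour in the month.
    """
    hours_per_machine = 24 * days_per_month * utilization
    return build_hours_per_month / hours_per_machine


# 2600 hours per month is the usage figure reported by INFRA in this thread.
non_stop = machines_needed(2600)
print(f"{non_stop:.2f} machines building non-stop")

# Hypothetical scenario: machines are busy only half of the time,
# which leaves headroom for peaks in the queue.
print(math.ceil(machines_needed(2600, utilization=0.5)), "machines at 50% utilization")
```

With a 30-day month the first number comes out between 3 and 4, which is
consistent with the "almost 4 machines" wording in the INFRA message. The
utilization parameter is just one simple way to account for peak load.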

J

On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <Ja...@polidea.com>
wrote:

> Thanks Aizhamal! I spoke already to Gris and she confirmed that as well
> and the 8th of July date is ok for us as we will have to evaluate and
> prepare as well. Have a nice trip.
>
> J.
>
> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> <ai...@google.com.invalid> wrote:
>
>> Hi all,
>>
>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <Ja...@polidea.com>
>> wrote:
>>
>> > Yeah. I also have a working version of a Cloud Build configuration, and
>> > we can run the tests on Cloud Build if we can get some credits from Google.
>>
>>
>> I can look into getting a small amount of credits approved for this, to see
>> if it’s useful to offload some tests to Cloud Build, or to provision some
>> VMs to run on Apache Infra.
>>
>> I am traveling at the moment, but I’ll be back in the office on July 8, and
>> I’ll try to get this done.
>>
>>
>> Thanks,
>> Aizhamal
>>
>> > And the changes from the upcoming CI image will make it much easier to
>> > run tests on any CI provider. Except for the Kubernetes tests, they are
>> > pretty much CI-agnostic. The Kubernetes tests will likely be fixed soon
>> > as well.
>> >
>> > Another idea: in the future we could run only a subset of the
>> > postgres/mysql/sqlite tests on all combinations. I think there are just a
>> > handful of tests that are specific to a backend (and we already know which
>> > ones they are, since they are the ones marked skip-if).
>> >
>> > J.
>> >
>> > Principal Software Engineer
>> > Phone: +48660796129
>> >
>> > On Thu, 27 Jun 2019 at 15:12, Philippe Gagnon <philgagnon1@gmail.com>
>> > wrote:
>> >
>> > > I think the combinations that you are proposing are sensible for
>> > > pre-merge checks.
>> > >
>> > > I am working on a proposal to offload extra combinations to another CI
>> > > provider (Azure DevOps specifically seems like a good candidate), either
>> > > pre- or post-merge. Ideally I'd like to run more combinations pre-merge,
>> > > but there is a trade-off to be conscious of here between development
>> > > velocity and quality assurance, which I think this issue highlights quite
>> > > well.
>> > >
>> > > Please let me know your thoughts
>> > >
>> > > Philippe
>> > >
>> > > On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > > wrote:
>> > >
>> > > > Agree that we should be thoughtful about others as well. In the latest
>> > > > push (a few minutes ago) of the upcoming official CI image, I
>> > > > implemented the change we discussed on GitHub, where we limit the
>> > > > number of combinations we test:
>> > > >
>> > > > You can see it yourself:
>> > > > https://travis-ci.org/apache/airflow/builds/551305240
>> > > >
>> > > > Those are the combinations I propose:
>> > > >
>> > > >  Python: 3.6
>> > > >  BACKEND=mysql ENV=docker
>> > > >
>> > > >  Python: 3.6
>> > > >  BACKEND=postgres ENV=docker
>> > > >
>> > > >  Python: 3.5
>> > > >  BACKEND=sqlite ENV=docker
>> > > >
>> > > >  Python: 3.6
>> > > >  BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
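
The four combinations proposed above can be compared with a full cross product
of the same axes. The sketch below is illustrative only: the "full" matrix is
built purely from the values that appear in the proposal, which is an
assumption and not necessarily the matrix Travis was running at the time, and
the variable and function names are hypothetical.

```python
from itertools import product

# The four jobs listed in the proposal above.
PROPOSED = [
    {"python": "3.6", "backend": "mysql", "env": "docker"},
    {"python": "3.6", "backend": "postgres", "env": "docker"},
    {"python": "3.5", "backend": "sqlite", "env": "docker"},
    {"python": "3.6", "backend": "postgres", "env": "kubernetes",
     "kubernetes_version": "v1.13.0"},
]

AXES = ("python", "backend", "env")


def axis_values(jobs, axis):
    """Return the sorted distinct values used on one axis of the matrix."""
    return sorted({job[axis] for job in jobs})


# Assumed full matrix: every combination of the values seen in the proposal.
full = list(product(*(axis_values(PROPOSED, axis) for axis in AXES)))

print(len(full), "jobs in the cross product")
print(len(PROPOSED), "jobs in the proposal")
for axis in AXES:
    print(axis, "values still covered:", axis_values(PROPOSED, axis))
```

The point of the reduced list is that every Python version, backend, and
environment is still exercised at least once while far fewer Travis slots are
occupied per build.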
>> > > >
>> > > > J,
>> > > >
>> > > >
>> > > > On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko <fokko@driesprong.frl>
>> > > > wrote:
>> > > >
>> > > > > We got this message last year:
>> > > > >
>> > > > > > Hello, Airflow PPMC.
>> > > > > > While going through the usage statistics for our Travis CI
>> > > > > > service, I have noticed that the Airflow project is using an
>> > > > > > abnormally large amount of resources, 2600 hours per month or the
>> > > > > > equivalent of having almost 4 machines building airflow non-stop
>> > > > > > 24/7. As this is not free, but rather costing us money, I'm
>> > > > > > contacting you with the intention of figuring out ways to reduce
>> > > > > > the use of Travis for the project.
>> > > > >
>> > > > > > We would greatly prefer that the project itself comes up with a
>> > > > > > solution to lower the usage of Travis, as we'd hate to simply turn
>> > > > > > it off for you, but the usage is at a rather severe level, totaling
>> > > > > > more than 21% of the total build time of all projects using Travis,
>> > > > > > so something actionable should be decided upon and (preferably)
>> > > > > > completed by the end of May that will reduce the consumption of
>> > > > > > Travis resources.
>> > > > >
>> > > > > > Alternately, if you are unable to lower the pressure on Travis, the
>> > > > > > podling and/or IPMC may ask the board of directors for a separate
>> > > > > > budget for additional build nodes to cope with the added load - I'll
>> > > > > > leave this for the podling and IPMC to decide on.
>> > > > >
>> > > > > > Please let us know when you have decided on a plan to remedy this
>> > > > > > situation.
>> > > > >
>> > > > > > With regards,
>> > > > > > Daniel on behalf of ASF Infrastructure.
>> > > > >
>> > > > > I think more and more projects are still migrating to the ASF Travis,
>> > > > > so I think it is natural that there is more load. However, this still
>> > > > > leaves the question of whether we have to run the full matrix.
>> > > > >
>> > > > > Cheers, Fokko
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Thu, 27 Jun 2019 at 10:56, Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > > > > wrote:
>> > > > >
>> > > > > > I think we should really involve infra to increase the number of
>> > > > > > slots, or maybe even somehow allocate slots per project.
>> > > > > > The problem is that we cannot control what other Apache projects are
>> > > > > > doing, so even if we decrease our runtime, it's the other projects
>> > > > > > that might hold us in the queue :(
>> > > > > >
>> > > > > > J.
>> > > > > >
>> > > > > > On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
>> > > > > > <fokko@driesprong.frl> wrote:
>> > > > > >
>> > > > > > > I've noticed this at other Apache projects as well; sometimes it
>> > > > > > > takes up to 7-8 hours. The only thing we can do is reduce the
>> > > > > > > runtime of the jobs so we take fewer slots :-)
>> > > > > > >
>> > > > > > > Cheers, Fokko
>> > > > > > >
>> > > > > > > On Wed, 26 Jun 2019 at 21:59, Jarek Potiuk
>> > > > > > > <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >
>> > > > > > > > Yep. That's what I suggested as the reason in the ticket. I
>> > > > > > > > guess INFRA are the only people who can do anything about it
>> > > > > > > > (increase concurrency? pay more for Travis :)?).
>> > > > > > > >
>> > > > > > > > On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor
>> > > > > > > > <ash@apache.org> wrote:
>> > > > > > > >
>> > > > > > > > > I asked Travis on Twitter and they said it was due to the
>> > > > > > > > > build queues of other Apache projects
>> > > > > > > > >
>> > > > > > > > > https://twitter.com/travisci/status/1143893051460526080
>> > > > > > > > >
>> > > > > > > > > -ash
>> > > > > > > > >
>> > > > > > > > > On 26 June 2019 20:48:33 BST, Jarek Potiuk
>> > > > > > > > > <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > > > >>
>> > > > > > > > >> Hello everyone,
>> > > > > > > > >>
>> > > > > > > > >> For the last few days the Travis builds for the apache/airflow
>> > > > > > > > >> project have been waiting in a queue for hours. This is not a
>> > > > > > > > >> normal situation. I've opened an INFRA ticket for that:
>> > > > > > > > >> https://issues.apache.org/jira/browse/INFRA-18657
>> > > > > > > > >>
>> > > > > > > > >> J.
>> > > > > > > > >>
>> > > > > > > > >>
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > >
>> > > > > > > > Jarek Potiuk
>> > > > > > > > Polidea <https://www.polidea.com/> | Principal Software
>> > Engineer
>> > > > > > > >
>> > > > > > > > M: +48 660 796 129 <+48660796129>
>> > > > > > > > [image: Polidea] <https://www.polidea.com/>
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > >
>> > > > > > Jarek Potiuk
>> > > > > > Polidea <https://www.polidea.com/> | Principal Software
>> Engineer
>> > > > > >
>> > > > > > M: +48 660 796 129 <+48660796129>
>> > > > > > [image: Polidea] <https://www.polidea.com/>
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > Jarek Potiuk
>> > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> > > >
>> > > > M: +48 660 796 129 <+48660796129>
>> > > > [image: Polidea] <https://www.polidea.com/>
>> > > >
>> > >
>> >
>>
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>
>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Travis builds in a queue for hours

Posted by Jarek Potiuk <Ja...@polidea.com>.
It does not need to be Jenkins (we will hopefully get VMs that we can put
whatever we want on). But from the "process" point of view, following in the
footsteps of Apache Beam and using the same setup might be one of the best
choices, if it works out of course, which is something we still need to
finalise. We would then not have to maintain our own CI service, just
connect to the one maintained by Apache Infrastructure. From our point of
view we would just need to have a pool of worker machines running; the
Jenkins master will manage everything for us.

Like many others, I have a "love-hate" relationship with Jenkins, but I
always loved the Dashboard/Job UI. My hate is mostly based on the old way
of managing builds, where you had to use the UI to configure them. With
Jenkins Pipelines <https://jenkins.io/doc/book/pipeline/getting-started/>
it's really close to any modern CI. Actually it is even better, as you
manage your build configuration as code (Groovy in this case) and keep it
in the repo, similar to .travis.yml.

We had a discussion about it recently, and I think the general conclusion
was that as long as you can read the logs and have a dashboard with job
status, nobody really cares which CI system is used.

On Wed, Jul 10, 2019 at 11:27 PM Bolke de Bruin <bd...@gmail.com> wrote:

> Awesome! But I hope you are not serious about using Jenkins right? If I
> need to start a Holy War it would be against Jenkins.
>
> B.
>
> Sent from my iPad


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Travis builds in a queue for hours

Posted by Aizhamal Nurmamat kyzy <ai...@apache.org>.
Hello everyone,

Great news! I've been able to provision some Google Cloud credits (~3k USD)
to start trying out a CI solution. They should last for a couple of months
(let's optimize usage so they last as long as possible :)) while Gris and I
work on a longer-term solution to provision the credits.

The name of the project is apache-airflow-testing. I will be glad to add any
PMC members as owners of the project. Please contact me directly with the
email account that I should add to the project.

Also, Jarek, please send your preferred email address so you'll have access
to the project as well. Thanks so much for working on this : )

Best,
Aizhamal

On Fri, Jul 12, 2019 at 12:48 PM Jarek Potiuk <Ja...@polidea.com>
wrote:

> Just FYI - I opened a ticket with infra to get stats on the machine usage
> for Travis: https://issues.apache.org/jira/browse/INFRA-18742
>
> On Thu, Jul 11, 2019 at 1:48 PM Jarek Potiuk <Ja...@polidea.com>
> wrote:
>
> > Absolutely! I thought about it today and a GKE cluster would be perfect for
> > us, especially since we can also use it to run the Kubernetes tests!
> > Having to set up minikube for the tests is still a major pain, and having
> > a GKE cluster that we can simply use would simplify this part a LOT.
> >
> > Principal Software Engineer
> > Phone: +48660796129
> >
> > On Thu, 11 Jul 2019 at 09:26, Driesprong, Fokko <fo...@driesprong.frl>
> > wrote:
> >
> >> Yes, GitLab works very well with GCP. A Kubernetes cluster with
> >> autoscaling for the runners would be perfect, and would also minimize the
> >> resources provided by Google.
> >>
> >> Cheers, Fokko
> >>
> >> On Thu, 11 Jul 2019 at 07:13, Jarek Potiuk <Jarek.Potiuk@polidea.com>
> >> wrote:
> >>
> >> > Since more than a few people (including myself) are in favour of GitLab
> >> > CI, and since Apache Infra is talking to GitLab CI, I will make sure to
> >> > check if we can combine the two approaches: workers from Google and a
> >> > managed, central GitLab CI interface to manage them (likely managed by
> >> > the Infra team). Airflow can easily be a "guinea pig" for the GitLab CI /
> >> > Apache integration. We also have quite a lot of expertise in managing
> >> > GitLab in my company (we use GitLab at Polidea for most of our mobile
> >> > project CI and all the cloud builds that we run internally).
> >> >
> >> > I will make an AIP for that soon and involve the right people :).
> >> >
> >> > J.
> >> >
> >> > On Thu, Jul 11, 2019 at 8:01 AM Driesprong, Fokko <fokko@driesprong.frl>
> >> > wrote:
> >> >
> >> > > Regarding the numbers, I believe that INFRA has an overview of the
> >> > > usage per project. I think they are happy to share these numbers with
> >> > > you. Also, it seems like there is a queue in Jenkins as well:
> >> > > https://status.apache.org/
> >> > >
> >> > > Talking about Jenkins: I'm not a big fan of it. For example, Spark uses
> >> > > it, and it is rather difficult to set up the environment yourself, in
> >> > > contrast with Travis. I also have good experiences with GitLab, since in
> >> > > my personal opinion it is the only Docker-native CI.
> >> > >
> >> > > > But we can try both of course. And even switch later.
> >> > > There is nothing as permanent as a temporary solution :-) However, I'm
> >> > > not against trying. I've checked the Beam project, and the integration
> >> > > with GitHub looks good.
> >> > >
> >> > > Thanks again Jarek and Aizhamal for all the work and effort.
> >> > >
> >> > > Cheers, Fokko
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Wed, 10 Jul 2019 at 23:11, Aizhamal Nurmamat kyzy
> >> > > <aizhamal@apache.org> wrote:
> >> > >
> >> > > > Hi all,
> >> > > >
> >> > > > I am still working on trying to get approvals for this, so this is
> >> > > > not yet a done deal. I'll keep y'all updated.
> >> > > >
> >> > > > As for the CI solution to use, we have no particular inclination, as
> >> > > > long as the community supports it and it is consistent with any Apache
> >> > > > guidelines for CI for their projects. Jenkins and GitLab CI both sound
> >> > > > sensible.
> >> > > >
> >> > > > The email from INFRA says that Airflow runs 2600 hours of tests per
> >> > > > month, or the equivalent of about 4 machines. Can the community help
> >> > > > with a reasonable estimate for this, so I can use it as a reference
> >> > > > for the request?
> >> > > >
> >> > > > Thanks!
> >> > > >
> >> > > > On Wed, Jul 10, 2019 at 2:43 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Yeah. GitLab CI is definitely what I would prefer as well from the
> >> > > > > "modernity" point of view (and one of my very close friends is a
> >> > > > > GitLab CI maintainer, and actually the person who introduced CI to
> >> > > > > the GitLab offering). I have also already catalysed a discussion
> >> > > > > between GitLab and Apache Infrastructure about introducing GitLab CI
> >> > > > > at the "Apache" level (they are talking about it now, I believe).
> >> > > > >
> >> > > > > But from the Google <> Apache/procedural point of view it might
> >> > > > > simply be easier to follow in the footsteps of Apache Beam. It might
> >> > > > > be just a few clicks away for Apache Infrastructure to add more
> >> > > > > machines and connect them to the Apache Jenkins for our project. If
> >> > > > > we have a path cleared by others, following it might simply be much
> >> > > > > faster.
> >> > > > >
> >> > > > > But we can try both of course. And even switch later. The Docker CI
> >> > > > > approach I am about to merge is designed to make it super-easy to
> >> > > > > switch between CI systems. Virtually ALL the build logic is in
> >> > > > > scripts in shared Docker images. There is basically one file per CI
> >> > > > > system to add, and we can support Travis/Jenkins/CloudBuild/CircleCI,
> >> > > > > whatever we imagine. We can even support all of them at the same
> >> > > > > time :)
> >> > > > >
> >> > > > > J.
> >> > > > >
> >> > > > > On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bdbruin@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > If you need an alternative why not use a couple of gitlab-ci
> >> > > > > > runners? Much easier to maintain, lightweight, and much closer to
> >> > > > > > what we use now.
> >> > > > > >
> >> > > > > > B.
> >> > > > > >
> >> > > > > > Sent from my iPad
> >> > > > > >
> >> > > > > > > On 10 Jul 2019, at 23:27, Bolke de Bruin <bdbruin@gmail.com>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > Awesome! But I hope you are not serious about using Jenkins,
> >> > > > > > > right? If I need to start a Holy War it would be against Jenkins.
> >> > > > > > >
> >> > > > > > > B.
> >> > > > > > >
> >> > > > > > > Sent from my iPad
> >> > > > > > >
> >> > > > > > >> On 10 Jul 2019, at 22:55, Jarek Potiuk <Jarek.Potiuk@polidea.com>
> >> > > > > > >> wrote:
> >> > > > > > >>
> >> > > > > > >> Hello Everyone,
> >> > > > > > >>
> >> > > > > > >> I have some really good news. I just had a call with the
> >> > > > > > >> Google OSS team (Gris, Aizhamal) and they are willing to
> >> > > > > > >> donate VMs on Google Cloud Platform to run CI for Airflow. In
> >> > > > > > >> order to simplify the setup (and make sure it is ok according
> >> > > > > > >> to Apache regulations) we think we should go exactly the same
> >> > > > > > >> route as the Apache Beam project (Google donated 16x 16CPU
> >> > > > > > >> machines for them). The route of Apache Beam is to use the
> >> > > > > > >> machines as workers for Apache Jenkins
> >> > > > > > >> (https://builds.apache.org/). Apache Jenkins is one of the CI
> >> > > > > > >> solutions encouraged by Apache and if we can have workers
> >> > > > > > >> connected to the existing Jenkins master of Apache, it means
> >> > > > > > >> that the maintenance overhead will be pretty minimal. And we
> >> > > > > > >> can follow the Apache Beam setup so I do not expect any legal
> >> > > > > > >> problems.
> >> > > > > > >>
> >> > > > > > >> I also work very closely with the team that uses Apache Beam
> >> > > > > > >> Jenkins heavily so I have access to all the necessary experts
> >> > > > > > >> to help with the setup (and I am happy to help with that).
> >> > > > > > >>
> >> > > > > > >> I really hope everyone in the community will be happy to go
> >> > > > > > >> in that direction. Please let me know if you have any
> >> > > > > > >> concerns!
> >> > > > > > >>
> >> > > > > > >> We do not need as many machines as Beam for sure (Beam uses
> >> > > > > > >> the machines to process a lot of data for tests including
> >> > > > > > >> some load testing) but we need to estimate the number/types
> >> > > > > > >> of machines that we are going to need.
> >> > > > > > >> Fokko, Ash, others - do you have some recent numbers for the
> >> > > > > > >> current usage or should I open an Infrastructure ticket for
> >> > > > > > >> it?
> >> > > > > > >>
> >> > > > > > >> J
> >> > > > > > >>
> >> > > > > > >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > > >>
> >> > > > > > >>> Thanks Aizhamal! I already spoke to Gris and she confirmed that as well,
> >> > > > > > >>> and the 8th of July date is OK for us as we will have to evaluate and
> >> > > > > > >>> prepare as well. Have a nice trip.
> >> > > > > > >>>
> >> > > > > > >>> J.
> >> > > > > > >>>
> >> > > > > > >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> >> > > > > > >>> <ai...@google.com.invalid> wrote:
> >> > > > > > >>>
> >> > > > > > >>>> Hi all,
> >> > > > > > >>>>
> >> > > > > > >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > > >>>>
> >> > > > > > >>>>> Yeah. I also have a working version of a Cloud Build configuration, and we
> >> > > > > > >>>>> can run the tests on Cloud Build if we can get some credits from Google.
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> I can look into getting a small amount of credits approved for this, to see
> >> > > > > > >>>> if it’s useful to offload some tests to Cloud Build, or to provision some
> >> > > > > > >>>> VMs to run on Apache Infra.
> >> > > > > > >>>>
> >> > > > > > >>>> I am traveling at the moment, but I’ll be back in the office on July 8, and
> >> > > > > > >>>> I’ll try to get this done.
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> Thanks,
> >> > > > > > >>>> Aizhamal
> >> > > > > > >>>>
> >> > > > > > >>>>> And the changes from the upcoming CI image will make it much easier to run
> >> > > > > > >>>>> tests on any CI provider. Except for the Kubernetes tests, they are pretty
> >> > > > > > >>>>> much CI-agnostic. The Kubernetes tests will likely also be fixed soon.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Another idea: I thought that in the future we could also run only a subset
> >> > > > > > >>>>> of the postgres/mysql/sqlite tests on all combinations. I think there are
> >> > > > > > >>>>> just a handful of tests that are specific to a backend (and we already know
> >> > > > > > >>>>> which ones they are - they are skipped-if).
> >> > > > > > >>>>>
> >> > > > > > >>>>> J.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Principal Software Engineer
> >> > > > > > >>>>> Phone: +48660796129
> >> > > > > > >>>>>
> >> > > > > > >>>>> On Thu, 27 Jun 2019, 15:12 Philippe Gagnon <philgagnon1@gmail.com> wrote:
> >> > > > > > >>>>>
> >> > > > > > >>>>>> I think the combinations that you are proposing are sensible for pre-merge
> >> > > > > > >>>>>> checks.
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> I am working on a proposal to offload extra combinations to another CI
> >> > > > > > >>>>>> provider (Azure DevOps specifically seems like a good candidate), either
> >> > > > > > >>>>>> pre or post merge. Ideally I'd like to run more combinations pre-merge, but
> >> > > > > > >>>>>> there is a trade-off to be conscious of here between development velocity
> >> > > > > > >>>>>> and quality assurance, which I think this issue highlights quite well.
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> Please let me know your thoughts
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> Philippe
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > > >>>>>>
> >> > > > > > >>>>>>> Agree that we should be thoughtful about others as well: in the latest
> >> > > > > > >>>>>>> push (a few minutes ago) of the upcoming official CI image I implemented
> >> > > > > > >>>>>>> the change we discussed on GitHub, where we limit the number of
> >> > > > > > >>>>>>> combinations we test:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> You can see it yourself:
> >> > > > > > >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Those are the combinations I propose:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Python: 3.6
> >> > > > > > >>>>>>> BACKEND=mysql ENV=docker
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Python: 3.6
> >> > > > > > >>>>>>> BACKEND=postgres ENV=docker
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Python: 3.5
> >> > > > > > >>>>>>> BACKEND=sqlite ENV=docker
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Python: 3.6
> >> > > > > > >>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> J,
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko <fokko@driesprong.frl> wrote:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> We got this message last year:
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> Hello, Airflow PPMC.
> >> > > > > > >>>>>>>>> While going through the usage statistics for our Travis CI service, I
> >> > > > > > >>>>>>>>> have noticed that the Airflow project is using an abnormally large
> >> > > > > > >>>>>>>>> amount of resources, 2600 hours per month or the equivalent of having
> >> > > > > > >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this is not free,
> >> > > > > > >>>>>>>>> but rather costing us money, I'm contacting you with the intention of
> >> > > > > > >>>>>>>>> figuring out ways to reduce the use of Travis for the project.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> We would greatly prefer that the project itself comes up with a solution
> >> > > > > > >>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn it off for
> >> > > > > > >>>>>>>>> you, but the usage is at a rather severe level, totaling more than 21%
> >> > > > > > >>>>>>>>> of the total build time of all projects using Travis, so something
> >> > > > > > >>>>>>>>> actionable should be decided upon and (preferably) completed by the end
> >> > > > > > >>>>>>>>> of May that will reduce the consumption of Travis resources.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> Alternately, if you are unable to lower the pressure on Travis, the
> >> > > > > > >>>>>>>>> podling and/or IPMC may ask the board of directors for a separate budget
> >> > > > > > >>>>>>>>> for additional build nodes to cope with the added load - I'll leave this
> >> > > > > > >>>>>>>>> for the podling and IPMC to decide on.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> Please let us know when you have decided on a plan to remedy this
> >> > > > > > >>>>>>>>> situation.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> With regards,
> >> > > > > > >>>>>>>>> Daniel on behalf of ASF Infrastructure.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> I think more and more projects are still migrating to the ASF Travis, so
> >> > > > > > >>>>>>>> I think it is natural that there is more load. However, this still leaves
> >> > > > > > >>>>>>>> the question of whether we have to run the full matrix.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> Cheers, Fokko
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> On Thu, 27 Jun 2019 at 10:56, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> I think we should really involve infra to increase the slot number or
> >> > > > > > >>>>>>>>> maybe even somehow allocate slots per project.
> >> > > > > > >>>>>>>>> The problem is that we cannot control what other Apache projects are
> >> > > > > > >>>>>>>>> doing, so even if we decrease our runtime, it's the other projects that
> >> > > > > > >>>>>>>>> might hold us in the queue :(
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> J.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko <fokko@driesprong.frl> wrote:
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>> I've noticed this at other Apache projects as well, sometimes it takes
> >> > > > > > >>>>>>>>>> up to 7-8 hours. The only thing we can do is reduce the runtime of the
> >> > > > > > >>>>>>>>>> jobs so we take fewer slots :-)
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> Cheers, Fokko
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> On Wed, 26 Jun 2019 at 21:59, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>> Yep. That's what I suggested as the reason in the ticket - I guess
> >> > > > > > >>>>>>>>>>> INFRA are the only people who can do anything about it (increase
> >> > > > > > >>>>>>>>>>> concurrency? pay more for Travis :)?).
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <ash@apache.org> wrote:
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> I asked Travis on Twitter and they said it was due to the build
> >> > > > > > >>>>>>>>>>>> queues of other Apache projects:
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> -ash
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> Hello everyone,
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> For the last few days the Travis builds for the apache/airflow
> >> > > > > > >>>>>>>>>>>>> project have been waiting in a queue for hours. This is not a
> >> > > > > > >>>>>>>>>>>>> normal situation. I've opened an INFRA ticket for that:
> >> > > > > > >>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> J.
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> --
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> Jarek Potiuk
> >> > > > > > >>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal
> >> > Software
> >> > > > > > >>>>> Engineer
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> M: +48 660 796 129 <+48660796129>
> >> > > > > > >>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>

Re: Travis builds in a queue for hours

Posted by Jarek Potiuk <Ja...@polidea.com>.
Just FYI - I opened a ticket with INFRA to get stats on our Travis machine usage:
https://issues.apache.org/jira/browse/INFRA-18742
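
For reference, the figure quoted further down this thread (2600 build hours
per month, described by INFRA as "almost 4 machines" building non-stop) can be
sanity-checked with a short calculation. This is only a rough sketch: the
30-day month and the function name are assumptions made for illustration, not
something INFRA stated.

```python
HOURS_PER_DAY = 24


def machine_equivalents(build_hours_per_month: float, days_in_month: int = 30) -> float:
    """Return how many machines running 24/7 would consume the given build hours.

    days_in_month defaults to 30, which is an assumption for illustration.
    """
    return build_hours_per_month / (HOURS_PER_DAY * days_in_month)


if __name__ == "__main__":
    # 2600 hours per month is the number quoted from the INFRA email.
    print(f"{machine_equivalents(2600):.2f} machines running non-stop")
```

The same helper can be reused once INFRA shares more recent usage numbers, to
turn them into a rough machine count for the request to Google.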

On Thu, Jul 11, 2019 at 1:48 PM Jarek Potiuk <Ja...@polidea.com>
wrote:

> Absolutely! I thought about it today and a GKE cluster would be perfect for
> us - especially since we can also use it to run the Kubernetes tests! Having
> to set up minikube for the tests is still a major pain, and having a GKE
> cluster that we can simply use would simplify this part a LOT.
>
> Principal Software Engineer
> Phone: +48660796129
>
> On Thu, 11 Jul 2019, 09:26 Driesprong, Fokko <fo...@driesprong.frl> wrote:
>
>> Yes, GitLab works very well with GCP. A Kubernetes cluster with autoscaling
>> for the runners would be perfect, and will also minimize the resources
>> provided by Google.
>>
>> Cheers, Fokko
>>
>> On Thu, 11 Jul 2019 at 07:13, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>>
>> > Since more than a few people (including myself) are in favour of GitLab CI,
>> > and since Apache Infra is talking to GitLab CI, I will make sure to check
>> > whether we can combine the two approaches - workers from Google and a
>> > managed, central GitLab CI interface to manage them (likely run by the
>> > Infra team).
>> > Airflow can easily be a "guinea pig" for the GitLab CI / Apache integration.
>> > We also have quite a lot of expertise in managing GitLab in my company (we
>> > use GitLab at Polidea for most of our mobile project CI and all the cloud
>> > builds that we run internally).
>> >
>> > I will make an AIP for that soon and involve the right people :).
>> >
>> > J.
>> >
>> > On Thu, Jul 11, 2019 at 8:01 AM Driesprong, Fokko <fokko@driesprong.frl> wrote:
>> >
>> > > Regarding the numbers, I believe that INFRA has an overview of the usage
>> > > per project. I think they will be happy to share these numbers with you.
>> > > Also, it seems like there is a queue in Jenkins as well:
>> > > https://status.apache.org/
>> > >
>> > > Talking about Jenkins: I'm not a big fan of it. For example, Spark uses it,
>> > > and it is rather difficult to set up the environment yourself, in contrast
>> > > with Travis. I also have good experiences with GitLab, since in my personal
>> > > opinion it is the only Docker-native CI.
>> > >
>> > > > But we can try both of course. And even switch later.
>> > > There is nothing as permanent as a temporary solution :-) However, I'm not
>> > > against trying. I've checked the Beam project, and the integration with
>> > > GitHub looks good.
>> > >
>> > > Thanks again Jarek and Aizhamal for all the work and effort.
>> > >
>> > > Cheers, Fokko
>> > >
>> > >
>> > >
>> > >
>> > > On Wed, 10 Jul 2019 at 23:11, Aizhamal Nurmamat kyzy <aizhamal@apache.org> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I am still working on trying to get approvals for this, so this is not yet
>> > > > a done deal. I'll keep y'all updated.
>> > > >
>> > > > As for the CI solution to use, we have no particular inclination, as long
>> > > > as the community supports it and it is consistent with any Apache
>> > > > guidelines for CI for their projects. Jenkins and GitLab CI both sound
>> > > > sensible.
>> > > >
>> > > > The email from INFRA says that Airflow runs 2600 hours of tests per month,
>> > > > or the equivalent of about 4 machines. Can the community help with a
>> > > > reasonable estimate for this, so I can use it as a reference for the
>> > > > request?
>> > > >
>> > > > Thanks!
>> > > >
>> > > > On Wed, Jul 10, 2019 at 2:43 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > >
>> > > > > Yeah. GitLab CI is definitely what I would prefer as well from the
>> > > > > "modernity" point of view (and one of my very close friends is a GitLab CI
>> > > > > maintainer and actually the person who introduced CI to the GitLab
>> > > > > offering). I have also already catalysed a discussion between GitLab and
>> > > > > Apache Infrastructure about introducing GitLab CI at the "Apache" level
>> > > > > (they are talking about it now, I believe).
>> > > > >
>> > > > > But from the Google <> Apache procedural point of view it might simply be
>> > > > > easier to follow in the footsteps of Apache Beam. It might be just a few
>> > > > > clicks for Apache Infrastructure to add more machines and connect them
>> > > > > to the Apache Jenkins for our project. If we have a path cleared by
>> > > > > others, following it might simply be much faster.
>> > > > >
>> > > > > But we can try both of course. And even switch later. The Docker CI
>> > > > > approach I am about to merge is designed to make it super-easy to switch
>> > > > > between CI systems. Virtually ALL the build logic is in scripts in shared
>> > > > > Docker images. There is basically one file per CI system to add, and we
>> > > > > can support Travis/Jenkins/CloudBuild/CircleCI - whatever we imagine. We
>> > > > > can even support all of them at the same time :)
>> > > > >
>> > > > > J.
>> > > > >
>> > > > > On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bdbruin@gmail.com> wrote:
>> > > > >
>> > > > > > If you need an alternative, why not use a couple of gitlab-ci runners?
>> > > > > > Much easier to maintain, lightweight, and much closer to what we use now.
>> > > > > >
>> > > > > > B.
>> > > > > >
>> > > > > > Sent from my iPad
>> > > > > >
>> > > > > > > On 10 Jul 2019, at 23:27, Bolke de Bruin <bdbruin@gmail.com> wrote:
>> > > > > > >
>> > > > > > > Awesome! But I hope you are not serious about using Jenkins, right?
>> > > > > > > If I need to start a Holy War it would be against Jenkins.
>> > > > > > >
>> > > > > > > B.
>> > > > > > >
>> > > > > > > Sent from my iPad
>> > > > > > >
>> > > > > > >> On 10 Jul 2019, at 22:55, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >>
>> > > > > > >> Hello Everyone,
>> > > > > > >>
>> > > > > > >> I have some really good news. I just had a call with the Google OSS team (Gris,
>> > > > > > >> Aizhamal) and they are willing to donate VMs on Google Cloud Platform to run CI
>> > > > > > >> for Airflow. In order to simplify the setup (and make sure it is OK according to
>> > > > > > >> Apache regulations) we think we should go exactly the same route as the Apache
>> > > > > > >> Beam project (Google donated 16x 16CPU machines for them). The route of Apache
>> > > > > > >> Beam is to use the machines as workers for Apache Jenkins
>> > > > > > >> (https://builds.apache.org/). Apache Jenkins is one of the CI solutions
>> > > > > > >> encouraged by Apache, and if we can have workers connected to the existing
>> > > > > > >> Jenkins master of Apache, it means that the maintenance overhead will be pretty
>> > > > > > >> minimal. And we can follow the Apache Beam setup, so I do not expect any legal
>> > > > > > >> problems.
>> > > > > > >>
>> > > > > > >> I also work very closely with the team that uses Apache Beam Jenkins heavily so
>> > > > > > >> I have access to all the necessary experts to help with the setup (and I am
>> > > > > > >> happy to help with that).
>> > > > > > >>
>> > > > > > >> I really hope everyone in the community will be really happy to go in that
>> > > > > > >> direction - it's. Please let me know if you have any concerns!
>> > > > > > >>
>> > > > > > >> We do not need as many machines as Beam for sure (Beam uses the machines to
>> > > > > > >> process a lot of data for tests including some load testing) but we need to
>> > > > > > >> estimate the number/types of machines that we are going to need.
>> > > > > > >> Fokko, Ash, others - do you have some recent numbers for the current usage or
>> > > > > > >> should I open an Infrastructure ticket for it?
>> > > > > > >>
>> > > > > > >> J
>> > > > > > >>
>> > > > > > >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >>
>> > > > > > >>> Thanks Aizhamal! I already spoke to Gris and she confirmed that as well,
>> > > > > > >>> and the 8th of July date is OK for us as we will have to evaluate and
>> > > > > > >>> prepare as well. Have a nice trip.
>> > > > > > >>>
>> > > > > > >>> J.
>> > > > > > >>>
>> > > > > > >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
>> > > > > > >>> <ai...@google.com.invalid> wrote:
>> > > > > > >>>
>> > > > > > >>>> Hi all,
>> > > > > > >>>>
>> > > > > > >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >>>>
>> > > > > > >>>>> Yeah. I also have a working version of a Cloud Build configuration, and we
>> > > > > > >>>>> can run the tests on Cloud Build if we can get some credits from Google.
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>> I can look into getting a small amount of credits approved for this, to see
>> > > > > > >>>> if it’s useful to offload some tests to Cloud Build, or to provision some
>> > > > > > >>>> VMs to run on Apache Infra.
>> > > > > > >>>>
>> > > > > > >>>> I am traveling at the moment, but I’ll be back in the office on July 8, and
>> > > > > > >>>> I’ll try to get this done.
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>> Thanks,
>> > > > > > >>>> Aizhamal
>> > > > > > >>>>
>> > > > > > >>>>> And the changes from the upcoming CI image will make it much easier to run
>> > > > > > >>>>> tests on any CI provider. Except for the Kubernetes tests, they are pretty
>> > > > > > >>>>> much CI-agnostic. The Kubernetes tests will likely also be fixed soon.
>> > > > > > >>>>>
>> > > > > > >>>>> Another idea: I thought that in the future we could also run only a subset
>> > > > > > >>>>> of the postgres/mysql/sqlite tests on all combinations. I think there are
>> > > > > > >>>>> just a handful of tests that are specific to a backend (and we already know
>> > > > > > >>>>> which ones they are - they are skipped-if).
>> > > > > > >>>>>
>> > > > > > >>>>> J.
>> > > > > > >>>>>
>> > > > > > >>>>> Principal Software Engineer
>> > > > > > >>>>> Phone: +48660796129
>> > > > > > >>>>>
>> > > > > > >>>>> On Thu, 27 Jun 2019, 15:12 Philippe Gagnon <philgagnon1@gmail.com> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>>> I think the combinations that you are proposing are sensible for pre-merge
>> > > > > > >>>>>> checks.
>> > > > > > >>>>>>
>> > > > > > >>>>>> I am working on a proposal to offload extra combinations to another CI
>> > > > > > >>>>>> provider (Azure DevOps specifically seems like a good candidate), either
>> > > > > > >>>>>> pre or post merge. Ideally I'd like to run more combinations pre-merge, but
>> > > > > > >>>>>> there is a trade-off to be conscious of here between development velocity
>> > > > > > >>>>>> and quality assurance, which I think this issue highlights quite well.
>> > > > > > >>>>>>
>> > > > > > >>>>>> Please let me know your thoughts
>> > > > > > >>>>>>
>> > > > > > >>>>>> Philippe
>> > > > > > >>>>>>
>> > > > > > >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >>>>>>
>> > > > > > >>>>>>> Agree that we should be thoughtful about others as well: in the latest
>> > > > > > >>>>>>> push (a few minutes ago) of the upcoming official CI image I implemented
>> > > > > > >>>>>>> the change we discussed on GitHub, where we limit the number of
>> > > > > > >>>>>>> combinations we test:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> You can see it yourself:
>> > > > > > >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Those are the combinations I propose:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Python: 3.6
>> > > > > > >>>>>>> BACKEND=mysql ENV=docker
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Python: 3.6
>> > > > > > >>>>>>> BACKEND=postgres ENV=docker
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Python: 3.5
>> > > > > > >>>>>>> BACKEND=sqlite ENV=docker
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Python: 3.6
>> > > > > > >>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> J,
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko <fokko@driesprong.frl> wrote:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> We got this message last year:
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> Hello, Airflow PPMC.
>> > > > > > >>>>>>>>> While going through the usage statistics for our Travis CI service, I
>> > > > > > >>>>>>>>> have noticed that the Airflow project is using an abnormally large
>> > > > > > >>>>>>>>> amount of resources, 2600 hours per month or the equivalent of having
>> > > > > > >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this is not free,
>> > > > > > >>>>>>>>> but rather costing us money, I'm contacting you with the intention of
>> > > > > > >>>>>>>>> figuring out ways to reduce the use of Travis for the project.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> We would greatly prefer that the project itself comes up with a solution
>> > > > > > >>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn it off for
>> > > > > > >>>>>>>>> you, but the usage is at a rather severe level, totaling more than 21%
>> > > > > > >>>>>>>>> of the total build time of all projects using Travis, so something
>> > > > > > >>>>>>>>> actionable should be decided upon and (preferably) completed by the end
>> > > > > > >>>>>>>>> of May that will reduce the consumption of Travis resources.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> Alternately, if you are unable to lower the pressure on Travis, the
>> > > > > > >>>>>>>>> podling and/or IPMC may ask the board of directors for a separate budget
>> > > > > > >>>>>>>>> for additional build nodes to cope with the added load - I'll leave this
>> > > > > > >>>>>>>>> for the podling and IPMC to decide on.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> Please let us know when you have decided on a plan to remedy this
>> > > > > > >>>>>>>>> situation.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> With regards,
>> > > > > > >>>>>>>>> Daniel on behalf of ASF Infrastructure.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> I think more and more projects are still migrating to the ASF Travis, so
>> > > > > > >>>>>>>> I think it is natural that there is more load. However, this still leaves
>> > > > > > >>>>>>>> the question of whether we have to run the full matrix.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> Cheers, Fokko
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> On Thu, Jun 27, 2019 at 10:56 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> I think we should really involve infra to increase the
>> > slot
>> > > > > > >>>> number
>> > > > > > >>>>> or
>> > > > > > >>>>>>>> maybe
>> > > > > > >>>>>>>>> even somehow allocate slots per project.
>> > > > > > >>>>>>>>> The problem is that we cannot control what other
>> apache
>> > > > > projects
>> > > > > > >>>>> are
>> > > > > > >>>>>>>> doing,
>> > > > > > >>>>>>>>> so even if we decrease our runtime, it's the other
>> > projects
>> > > > > that
>> > > > > > >>>>>> might
>> > > > > > >>>>>>>> hold
>> > > > > > >>>>>>>>> us in the queue :(
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> J.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
>> > > > > > >>>>>>> <fokko@driesprong.frl
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> wrote:
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>> I've noticed this at other Apache projects as well,
>> > > > sometimes
>> > > > > > >>>> it
>> > > > > > >>>>>>> takes
>> > > > > > >>>>>>>> up
>> > > > > > >>>>>>>>>> to 7-8 hours. The only thing we can do, is reduce the
>> > > > runtime
>> > > > > > >>>> of
>> > > > > > >>>>>> the
>> > > > > > >>>>>>>> jobs
>> > > > > > >>>>>>>>>> so we take less slots :-)
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> Cheers, Fokko
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> On Wed, Jun 26, 2019 at 21:59 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>> Yep. That's what I suggested as the reason in the
>> > ticket
>> > > -
>> > > > I
>> > > > > > >>>>>> guess
>> > > > > > >>>>>>>>> INFRA
>> > > > > > >>>>>>>>>>> are the only people who can do anything about it
>> > > (increase
>> > > > > > >>>>>>>> concurrency
>> > > > > > >>>>>>>>> ?
>> > > > > > >>>>>>>>>>> pay more for Travis :)? ).
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
>> > > > > > >>>>>> ash@apache.org>
>> > > > > > >>>>>>>>>> wrote:
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> I asked Travis on twitter and they said it was due
>> to
>> > > the
>> > > > > > >>>>>> Apache
>> > > > > > >>>>>>>>> other
>> > > > > > >>>>>>>>>>>> projects build queues
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>
>> > https://twitter.com/travisci/status/1143893051460526080
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> -ash
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
>> > > > > > >>>>>>>> Jarek.Potiuk@polidea.com
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>>> wrote:
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> Hello everyone,
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> For the last few days the Travis builds for
>> > > > > > >>>> apache/airflow
>> > > > > > >>>>>>> project
>> > > > > > >>>>>>>>> are
>> > > > > > >>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
>> > > > > > >>>>> situation.
>> > > > > > >>>>>>> I've
>> > > > > > >>>>>>>>>>> opened
>> > > > > > >>>>>>>>>>>>> INFRA ticket for that:
>> > > > > > >>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> J.
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>

Re: Travis builds in a queue for hours

Posted by Jarek Potiuk <Ja...@polidea.com>.
Absolutely! I thought about it today, and a GKE cluster would be perfect for
us, especially since we could also use it to run the Kubernetes tests.
Having to set up minikube for the tests is still a major pain, and a GKE
cluster that we can simply use would simplify this part a LOT.

Principal Software Engineer
Phone: +48660796129

On Thu, Jul 11, 2019, 09:26 Driesprong, Fokko <fo...@driesprong.frl>
wrote:

> Yes, GitLab works very well with GCP. A Kubernetes cluster with autoscaling
> for the runners would be perfect, and would also minimize the resources
> provided by Google.
>
> Cheers, Fokko
>
> On Thu, Jul 11, 2019 at 07:13 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>
> > Since more than a few people (including myself) are in favour of GitLab CI,
> > and since Apache Infra is talking to GitLab CI, I will make sure to check
> > if we can combine the two approaches: workers from Google and a central,
> > managed GitLab CI interface to manage them (likely run by the Infra team).
> > Airflow can easily be a "guinea pig" for GitLab CI / Apache integration.
> > We also have quite a lot of expertise in managing GitLab in my company (we
> > use GitLab at Polidea for most of our mobile project CI and all the cloud
> > builds that we run internally).
> >
> > I will make an AIP for that soon and involve the right people :).
> >
> > J.
> >
> > On Thu, Jul 11, 2019 at 8:01 AM Driesprong, Fokko <fo...@driesprong.frl>
> > wrote:
> >
> > > Regarding the numbers, I believe that INFRA has an overview of the usage
> > > per project. I think they are happy to share these numbers with you. Also,
> > > it seems like there is also a queue in Jenkins: https://status.apache.org/
> > >
> > > Talking about Jenkins: I'm not a big fan of it. For example, Spark uses it,
> > > and it is rather difficult to set up the environment yourself, in contrast
> > > with Travis. I also have good experiences with GitLab, since in my personal
> > > opinion it is the only Docker-native CI.
> > >
> > > > But we can try both of course. And even switch later.
> > > There is nothing as permanent as a temporary solution :-) However, I'm not
> > > against trying. I've checked the Beam project, and the integration with
> > > GitHub looks good.
> > >
> > > Thanks again Jarek and Aizhamal for all the work and effort.
> > >
> > > Cheers, Fokko
> > >
> > >
> > >
> > >
> > > On Wed, Jul 10, 2019 at 23:11 Aizhamal Nurmamat kyzy <
> > > aizhamal@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I am still working on trying to get approvals for this, so this is not
> > > > yet a done deal. I'll keep y'all updated.
> > > >
> > > > As for the CI solution to use, we have no particular inclination, as long
> > > > as the community supports it and it is consistent with any Apache
> > > > guidelines for CI for their projects. Jenkins and GitLab CI both sound
> > > > sensible.
> > > >
> > > > The email from INFRA says that Airflow runs 2600 hours of tests per
> > > > month, or the equivalent of about 4 machines. Can the community help with
> > > > a reasonable estimate for this, so I can use it as a reference for the
> > > > request?
> > > >
> > > > Thanks!
> > > >
> > > > On Wed, Jul 10, 2019 at 2:43 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > > wrote:
> > > >
> > > > > Yeah. GitLab CI is definitely what I would prefer as well from the
> > > > > "modernity" point of view (and one of my very close friends is a GitLab
> > > > > CI maintainer and actually the person who introduced CI to the GitLab
> > > > > offering). I also already catalysed a discussion between GitLab and
> > > > > Apache Infrastructure to introduce GitLab CI on the "Apache" level (they
> > > > > are talking about it now, I believe).
> > > > >
> > > > > But from the Google <> Apache/procedural point of view it might simply
> > > > > be easier to follow in the footsteps of Apache Beam. It might be just a
> > > > > few clicks away for Apache Infrastructure to add more machines and
> > > > > connect them to the Apache Jenkins for our project. If we have a path
> > > > > cleared by others, following it might simply be much faster.
> > > > >
> > > > > But we can try both of course. And even switch later. The Docker CI
> > > > > approach I am about to merge is designed to make it super-easy to switch
> > > > > between CI systems. Virtually ALL the build logic is in scripts in
> > > > > shared Docker images. There is basically one file per CI system to add,
> > > > > and we can support Travis/Jenkins/CloudBuild/CircleCI, or whatever we
> > > > > imagine. We can even support all of them at the same time :)
> > > > >
> > > > > J.
> > > > >
> > > > > On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bdbruin@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > If you need an alternative, why not use a couple of gitlab-ci runners?
> > > > > > Much easier to maintain, lightweight, and much closer to what we use
> > > > > > now.
> > > > > >
> > > > > > B.
> > > > > >
> > > > > > Sent from my iPad
> > > > > >
> > > > > > > On Jul 10, 2019, at 23:27, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > > > > >
> > > > > > > Awesome! But I hope you are not serious about using Jenkins, right?
> > > > > > > If I need to start a Holy War it would be against Jenkins.
> > > > > > >
> > > > > > > B.
> > > > > > >
> > > > > > > Sent from my iPad
> > > > > > >
> > > > > > >> On Jul 10, 2019, at 22:55, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> > > > > > >>
> > > > > > >> Hello Everyone,
> > > > > > >>
> > > > > > >> I have some really good news. I just had a call with Google
> OSS
> > > team
> > > > > > (Gris,
> > > > > > >> Aizhamal) and they are willing to donate VMs on Google Cloud
> > > > Platform
> > > > > to
> > > > > > >> run CI for Airflow. In order to simplify the setup (and make
> > sure
> > > it
> > > > > is
> > > > > > ok
> > > > > > >> according to Apache regulations) we think we should go exactly
> > the
> > > > > same
> > > > > > >> route as Apache Beam project (Google donated 16x 16CPU
> machines
> > > for
> > > > > > them).
> > > > > > >> The route of Apache Beam is to use the machines as workers for
> > > > Apache
> > > > > > >> Jenkins (https://builds.apache.org/). Apache Jenkins is one
> of
> > > the
> > > > > > >> encouraged CI solutions by Apache and if we can have workers
> > > > connected
> > > > > > to
> > > > > > >> the existing Jenkins master of Apache, it means that the
> > > maintenance
> > > > > > >> overhead will be pretty minimal. And we can follow Apache Beam
> > > setup
> > > > > so
> > > > > > I
> > > > > > >> do not expect any legal problems.
> > > > > > >>
> > > > > > >> I also work very closely with the team that uses Apache Beam
> > > Jenkins
> > > > > > >> heavily so I have access to all the necessary experts to help
> > with
> > > > the
> > > > > > >> setup (and I am happy to help with that).
> > > > > > >>
> > > > > > >> I really hope everyone in the community will be really happy
> to
> > go
> > > > in
> > > > > > that
> > > > > > >> direction - it's. Please let me know if you have any concerns
> !
> > > > > > >>
> > > > > > >> We do not need as many machines as Beam for sure (Beam uses
> the
> > > > > > machines to
> > > > > > >> process a lot of data for tests including some load testing)
> but
> > > we
> > > > > > need to
> > > > > > >> estimate the number/types of machines that we are going to
> need.
> > > > > > >> Fokko, Ash, others - do you have some recent numbers for the
> > > current
> > > > > > usage
> > > > > > >> or should I open an Infrastructure ticket for it?
> > > > > > >>
> > > > > > >> J
> > > > > > >>
> > > > > > >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <
> > > > > Jarek.Potiuk@polidea.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> Thanks Aizhamal! I spoke already to Gris and she confirmed
> that
> > > as
> > > > > well
> > > > > > >>> and the 8th of July date is ok for us as we will have to
> > evaluate
> > > > and
> > > > > > >>> prepare as well. Have a nice trip.
> > > > > > >>>
> > > > > > >>> J.
> > > > > > >>>
> > > > > > >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> > > > > > >>> <ai...@google.com.invalid> wrote:
> > > > > > >>>
> > > > > > >>>> Hi all,
> > > > > > >>>>
> > > > > > >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <
> > > > > Jarek.Potiuk@polidea.com>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>>> Yeah. I also have a working version of Cloud build
> > > configuration
> > > > > and
> > > > > > we
> > > > > > >>>> can
> > > > > > >>>>> run the tests on cloud build if we can get some credits
> from
> > > > > Google.
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> I can look into getting a small amount of credits approved
> for
> > > > this,
> > > > > > to
> > > > > > >>>> see
> > > > > > >>>> if it’s useful to offload some tests to Cloud Build, or to
> > > > provision
> > > > > > some
> > > > > > >>>> VMs to run on Apache Infra.
> > > > > > >>>>
> > > > > > >>>> I am traveling at the moment, but I’ll be back in the office
> > on
> > > > July
> > > > > > 8,
> > > > > > >>>> and
> > > > > > >>>> I’ll try to get this done.
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> Aizhamal
> > > > > > >>>>
> > > > > > >>>> And
> > > > > > >>>>> the changes from the upcoming CI image will make it much
> > easier
> > > > to
> > > > > > run
> > > > > > >>>>> tests on any CI provider. Except Kubernetes tests they are
> > > pretty
> > > > > > much
> > > > > > >>>>> CI-agnostic. Kubernetes tests will likely be also fixed
> soon.
> > > > > > >>>>>
> > > > > > >>>>> Another idea: I thought that in the future we can also run
> > only
> > > > > > subset
> > > > > > >>>> of
> > > > > > >>>>> postgres/mysql/sqlite tests on all combinations. I think
> > there
> > > > are
> > > > > > just
> > > > > > >>>>> handful of tests that are specific for backend (and we
> > already
> > > > know
> > > > > > >>>> which
> > > > > > >>>>> ones they are - they are skipped-if).
> > > > > > >>>>>
> > > > > > >>>>> J.
> > > > > > >>>>>
> > > > > > >>>>> Principal Software Engineer
> > > > > > >>>>> Phone: +48660796129
> > > > > > >>>>>
> > > > > > > >>>>> On Thu, Jun 27, 2019, 15:12 Philippe Gagnon <philgagnon1@gmail.com> wrote:
> > > > > > >>>>>
> > > > > > >>>>>> I think the combinations that you are proposing are
> sensible
> > > for
> > > > > > >>>>> pre-merge
> > > > > > >>>>>> checks.
> > > > > > >>>>>>
> > > > > > >>>>>> I am working on a proposal to offload extra combinations
> to
> > > > > another
> > > > > > CI
> > > > > > >>>>>> provider (Azure DevOps specifically seems like a good
> > > > candidate),
> > > > > > >>>> either
> > > > > > >>>>>> pre or post merge. Ideally I'd like to run more
> combinations
> > > > > > pre-merge
> > > > > > >>>>> but
> > > > > > >>>>>> there is a trade-off to be conscious of here between
> > > development
> > > > > > >>>> velocity
> > > > > > >>>>>> and quality assurance, which I think this issue highlights
> > > quite
> > > > > > well.
> > > > > > >>>>>>
> > > > > > >>>>>> Please let me know your thoughts
> > > > > > >>>>>>
> > > > > > >>>>>> Philippe
> > > > > > >>>>>>
> > > > > > >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
> > > > > > >>>> Jarek.Potiuk@polidea.com>
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>> Agree that we should be thoughtful about others as well:
> In
> > > the
> > > > > > >>>> latest
> > > > > > >>>>>> push
> > > > > > >>>>>>> (few minutes ago) of the upcoming official CI image i
> > > > implemented
> > > > > > >>>> the
> > > > > > >>>>>>> change we discussed in the Github where we limit the
> number
> > > of
> > > > > > >>>>>> combinations
> > > > > > >>>>>>> we test:
> > > > > > >>>>>>>
> > > > > > >>>>>>> You can see it yourself:
> > > > > > >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
> > > > > > >>>>>>>
> > > > > > >>>>>>> Those are the combinations I propose:
> > > > > > >>>>>>>
> > > > > > >>>>>>> Python: 3.6
> > > > > > >>>>>>> BACKEND=mysql ENV=docker
> > > > > > >>>>>>>
> > > > > > >>>>>>> Python: 3.6
> > > > > > >>>>>>> BACKEND=postgres ENV=docker
> > > > > > >>>>>>>
> > > > > > >>>>>>> Python: 3.5
> > > > > > >>>>>>> BACKEND=sqlite ENV=docker
> > > > > > >>>>>>>
> > > > > > >>>>>>> Python: 3.6
> > > > > > >>>>>>> BACKEND=postgres ENV=kubernetes
> KUBERNETES_VERSION=v1.13.0
> > > > > > >>>>>>>
> > > > > > >>>>>>> J,
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
> > > > > > >>>>> <fokko@driesprong.frl
> > > > > > >>>>>>>
> > > > > > >>>>>>> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>> We got this message last year:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> Hello, Airflow PPMC.
> > > > > > >>>>>>>>> While going through the usage statistics for our Travis
> > CI
> > > > > > >>>>> service, I
> > > > > > >>>>>>>>> have noticed that the Airflow project is using an
> > > abnormally
> > > > > > >>>> large
> > > > > > >>>>>>>>> amount of resources, 2600 hours per month or the
> > equivalent
> > > > of
> > > > > > >>>>> having
> > > > > > >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As
> this
> > > is
> > > > > not
> > > > > > >>>>>> free,
> > > > > > >>>>>>>>> but rather costing us money, I'm contacting you with
> the
> > > > > > >>>> intention
> > > > > > >>>>> of
> > > > > > >>>>>>>>> figuring out ways to reduce the use of Travis for the
> > > > project.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> We would greatly prefer that the project itself comes
> up
> > > > with a
> > > > > > >>>>>>> solution
> > > > > > >>>>>>>>> to lower the usage of Travis, as we'd hate to simply
> turn
> > > it
> > > > > off
> > > > > > >>>>> for
> > > > > > >>>>>>>>> you, but the usage is at a rather severe level,
> totaling
> > > more
> > > > > > >>>> than
> > > > > > >>>>>> 21%
> > > > > > >>>>>>>>> of the total build time of all projects using Travis,
> so
> > > > > > >>>> something
> > > > > > >>>>>>>>> actionable should be decided upon and (preferably)
> > > completed
> > > > by
> > > > > > >>>> the
> > > > > > >>>>>> end
> > > > > > >>>>>>>>> of May that will reduce the consumption of Travis
> > > resources.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> Alternately, if you are unable to lower the pressure on
> > > > Travis,
> > > > > > >>>> the
> > > > > > >>>>>>>>> podling and/or IPMC may ask the board of directors for
> a
> > > > > > >>>> separate
> > > > > > >>>>>>> budget
> > > > > > >>>>>>>>> for additional build nodes to cope with the added load
> -
> > > I'll
> > > > > > >>>> leave
> > > > > > >>>>>>> this
> > > > > > >>>>>>>>> for the podling and IPMC to decide on.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> Please let us know when you have decided on a plan to
> > > remedy
> > > > > > >>>> this
> > > > > > >>>>>>>> situation.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> With regards,
> > > > > > >>>>>>>>> Daniel on behalf of ASF Infrastructure.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> I think more and more projects are still migrating to
> the
> > > ASF
> > > > > > >>>> Travis,
> > > > > > >>>>>> so
> > > > > > >>>>>>> I
> > > > > > >>>>>>>> think natural that there is more load. However, this
> still
> > > > > leaves
> > > > > > >>>> the
> > > > > > >>>>>>>> question if we have to run the full matrix.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Cheers, Fokko
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > > >>>>>>>> On Thu, Jun 27, 2019 at 10:56 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> I think we should really involve infra to increase the
> > slot
> > > > > > >>>> number
> > > > > > >>>>> or
> > > > > > >>>>>>>> maybe
> > > > > > >>>>>>>>> even somehow allocate slots per project.
> > > > > > >>>>>>>>> The problem is that we cannot control what other apache
> > > > > projects
> > > > > > >>>>> are
> > > > > > >>>>>>>> doing,
> > > > > > >>>>>>>>> so even if we decrease our runtime, it's the other
> > projects
> > > > > that
> > > > > > >>>>>> might
> > > > > > >>>>>>>> hold
> > > > > > >>>>>>>>> us in the queue :(
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> J.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
> > > > > > >>>>>>> <fokko@driesprong.frl
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>> I've noticed this at other Apache projects as well,
> > > > sometimes
> > > > > > >>>> it
> > > > > > >>>>>>> takes
> > > > > > >>>>>>>> up
> > > > > > >>>>>>>>>> to 7-8 hours. The only thing we can do, is reduce the
> > > > runtime
> > > > > > >>>> of
> > > > > > >>>>>> the
> > > > > > >>>>>>>> jobs
> > > > > > >>>>>>>>>> so we take less slots :-)
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Cheers, Fokko
> > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> On Wed, Jun 26, 2019 at 21:59 Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>> Yep. That's what I suggested as the reason in the
> > ticket
> > > -
> > > > I
> > > > > > >>>>>> guess
> > > > > > >>>>>>>>> INFRA
> > > > > > >>>>>>>>>>> are the only people who can do anything about it
> > > (increase
> > > > > > >>>>>>>> concurrency
> > > > > > >>>>>>>>> ?
> > > > > > >>>>>>>>>>> pay more for Travis :)? ).
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
> > > > > > >>>>>> ash@apache.org>
> > > > > > >>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> I asked Travis on twitter and they said it was due
> to
> > > the
> > > > > > >>>>>> Apache
> > > > > > >>>>>>>>> other
> > > > > > >>>>>>>>>>>> projects build queues
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>
> > https://twitter.com/travisci/status/1143893051460526080
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> -ash
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
> > > > > > >>>>>>>> Jarek.Potiuk@polidea.com
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Hello everyone,
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> For the last few days the Travis builds for
> > > > > > >>>> apache/airflow
> > > > > > >>>>>>> project
> > > > > > >>>>>>>>> are
> > > > > > >>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
> > > > > > >>>>> situation.
> > > > > > >>>>>>> I've
> > > > > > >>>>>>>>>>> opened
> > > > > > >>>>>>>>>>>>> INFRA ticket for that:
> > > > > > >>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> J.
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>

Re: Travis builds in a queue for hours

Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
Yes, GitLab works very well with GCP. A Kubernetes cluster with autoscaling
for the runners would be perfect, and would also minimize the resources
provided by Google.
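
For a rough sense of scale, here is a back-of-the-envelope sketch in Python.
It takes the 2600 build hours per month that the INFRA email quoted further
down in this thread attributes to Airflow and converts it into always-on
machines. The 30-day month, the utilization parameter and the function name
are assumptions made for illustration; none of them come from the thread.

```python
import math


def machines_for_load(build_hours_per_month, days_per_month=30, utilization=1.0):
    """Return how many always-on machines would absorb a monthly build load.

    utilization is the assumed fraction of each machine-hour spent building
    (1.0 means a machine never idles). It is a hypothetical knob for this
    sketch, not a number taken from the thread.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    machine_hours_available = 24 * days_per_month * utilization
    return build_hours_per_month / machine_hours_available


# 2600 hours/month is the figure quoted from ASF Infrastructure.
exact = machines_for_load(2600)
print(f"{exact:.2f} machines non-stop, i.e. {math.ceil(exact)} whole machines")
```

Under these assumptions the result is consistent with the "almost 4 machines"
wording in the INFRA email. An autoscaling pool would not change this
average; it would mainly change how much capacity sits idle between bursts.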

Cheers, Fokko

On Thu, Jul 11, 2019 at 07:13 Jarek Potiuk <Ja...@polidea.com> wrote:

> Since more than a few people (including myself) are in favour of GitLab CI,
> and since Apache Infra is talking to GitLab CI, I will make sure to check
> if we can combine the two approaches: workers from Google and a central,
> managed GitLab CI interface to manage them (likely run by the Infra team).
> Airflow can easily be a "guinea pig" for GitLab CI / Apache integration.
> We also have quite a lot of expertise in managing GitLab in my company (we
> use GitLab at Polidea for most of our mobile project CI and all the cloud
> builds that we run internally).
>
> I will make an AIP for that soon and involve the right people :).
>
> J.
>
> On Thu, Jul 11, 2019 at 8:01 AM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
>
> > Regarding the numbers, I believe that INFRA has an overview of the usage
> > per project. I think they are happy to share these numbers with you. Also,
> > it seems like there is also a queue in Jenkins: https://status.apache.org/
> >
> > Talking about Jenkins: I'm not a big fan of it. For example, Spark uses it,
> > and it is rather difficult to set up the environment yourself, in contrast
> > with Travis. I also have good experiences with GitLab, since in my personal
> > opinion it is the only Docker-native CI.
> >
> > > But we can try both of course. And even switch later.
> > There is nothing as permanent as a temporary solution :-) However, I'm not
> > against trying. I've checked the Beam project, and the integration with
> > GitHub looks good.
> >
> > Thanks again Jarek and Aizhamal for all the work and effort.
> >
> > Cheers, Fokko
> >
> >
> >
> >
> > On Wed, Jul 10, 2019 at 23:11 Aizhamal Nurmamat kyzy <
> > aizhamal@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I am still working on trying to get approvals for this, so this is not yet
> > > a done deal. I'll keep y'all updated.
> > >
> > > As for the CI solution to use, we have no particular inclination, as long
> > > as the community supports it and it is consistent with any Apache
> > > guidelines for CI for their projects. Jenkins and GitLab CI both sound
> > > sensible.
> > >
> > > The email from INFRA says that Airflow runs 2600 hours of tests per month,
> > > or the equivalent of about 4 machines. Can the community help with a
> > > reasonable estimate for this, so I can use it as a reference for the
> > > request?
> > >
> > > Thanks!
> > >
> > > On Wed, Jul 10, 2019 at 2:43 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > wrote:
> > >
> > > > Yeah. GitLab CI is definitely what I would prefer as well from the
> > > > "modernity" point of view (and one of my very close friends is a GitLab
> > > > CI maintainer and actually the person who introduced CI to the GitLab
> > > > offering). I also already catalysed a discussion between GitLab and
> > > > Apache Infrastructure to introduce GitLab CI on the "Apache" level (they
> > > > are talking about it now, I believe).
> > > >
> > > > But from the Google <> Apache/procedural point of view it might simply
> > > > be easier to follow in the footsteps of Apache Beam. It might be just a
> > > > few clicks away for Apache Infrastructure to add more machines and
> > > > connect them to the Apache Jenkins for our project. If we have a path
> > > > cleared by others, following it might simply be much faster.
> > > >
> > > > But we can try both of course. And even switch later. The Docker CI
> > > > approach I am about to merge is designed to make it super-easy to switch
> > > > between CI systems. Virtually ALL the build logic is in scripts in
> > > > shared Docker images. There is basically one file per CI system to add,
> > > > and we can support Travis/Jenkins/CloudBuild/CircleCI, or whatever we
> > > > imagine. We can even support all of them at the same time :)
> > > >
> > > > J.
> > > >
> > > > On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bd...@gmail.com>
> > > wrote:
> > > >
> > > > > If you need an alternative why not use a couple of gitlab-ci runners?
> > > > > Much easier to maintain, lightweight, and much closer to what we use
> > > > > now.
> > > > >
> > > > > B.
> > > > >
> > > > > Sent from my iPad
> > > > >
> > > > > > On 10 Jul 2019 at 23:27, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > > > >
> > > > > > Awesome! But I hope you are not serious about using Jenkins, right?
> > > > > > If I need to start a Holy War it would be against Jenkins.
> > > > > >
> > > > > > B.
> > > > > >
> > > > > > Sent from my iPad
> > > > > >
> > > > > >> On 10 Jul 2019 at 22:55, Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> Hello Everyone,
> > > > > >>
> > > > > >> I have some really good news. I just had a call with Google OSS
> > team
> > > > > (Gris,
> > > > > >> Aizhamal) and they are willing to donate VMs on Google Cloud
> > > Platform
> > > > to
> > > > > >> run CI for Airflow. In order to simplify the setup (and make
> sure
> > it
> > > > is
> > > > > ok
> > > > > >> according to Apache regulations) we think we should go exactly
> the
> > > > same
> > > > > >> route as Apache Beam project (Google donated 16x 16CPU machines
> > for
> > > > > them).
> > > > > >> The route of Apache Beam is to use the machines as workers for
> > > Apache
> > > > > >> Jenkins (https://builds.apache.org/). Apache Jenkins is one of
> > the
> > > > > >> encouraged CI solutions by Apache and if we can have workers
> > > connected
> > > > > to
> > > > > >> the existing Jenkins master of Apache, it means that the
> > maintenance
> > > > > >> overhead will be pretty minimal. And we can follow Apache Beam
> > setup
> > > > so
> > > > > I
> > > > > >> do not expect any legal problems.
> > > > > >>
> > > > > >> I also work very closely with the team that uses Apache Beam
> > Jenkins
> > > > > >> heavily so I have access to all the necessary experts to help
> with
> > > the
> > > > > >> setup (and I am happy to help with that).
> > > > > >>
> > > > > >> I really hope everyone in the community will be really happy to
> go
> > > in
> > > > > that
> > > > > >> direction - it's. Please let me know if you have any concerns !
> > > > > >>
> > > > > >> We do not need as many machines as Beam for sure (Beam uses the
> > > > > machines to
> > > > > >> process a lot of data for tests including some load testing) but
> > we
> > > > > need to
> > > > > >> estimate the number/types of machines that we are going to need.
> > > > > >> Fokko, Ash, others - do you have some recent numbers for the
> > current
> > > > > usage
> > > > > >> or should I open an Infrastructure ticket for it?
> > > > > >>
> > > > > >> J
> > > > > >>
> > > > > >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <
> > > > Jarek.Potiuk@polidea.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Thanks Aizhamal! I spoke already to Gris and she confirmed that
> > as
> > > > well
> > > > > >>> and the 8th of July date is ok for us as we will have to
> evaluate
> > > and
> > > > > >>> prepare as well. Have a nice trip.
> > > > > >>>
> > > > > >>> J.
> > > > > >>>
> > > > > >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> > > > > >>> <ai...@google.com.invalid> wrote:
> > > > > >>>
> > > > > >>>> Hi all,
> > > > > >>>>
> > > > > >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <
> > > > Jarek.Potiuk@polidea.com>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> Yeah. I also have a working version of Cloud build
> > configuration
> > > > and
> > > > > we
> > > > > >>>> can
> > > > > >>>>> run the tests on cloud build if we can get some credits from
> > > > Google.
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> I can look into getting a small amount of credits approved for
> > > this,
> > > > > to
> > > > > >>>> see
> > > > > >>>> if it’s useful to offload some tests to Cloud Build, or to
> > > provision
> > > > > some
> > > > > >>>> VMs to run on Apache Infra.
> > > > > >>>>
> > > > > >>>> I am traveling at the moment, but I’ll be back in the office
> on
> > > July
> > > > > 8,
> > > > > >>>> and
> > > > > >>>> I’ll try to get this done.
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> Aizhamal
> > > > > >>>>
> > > > > >>>> And
> > > > > >>>>> the changes from the upcoming CI image will make it much
> easier
> > > to
> > > > > run
> > > > > >>>>> tests on any CI provider. Except Kubernetes tests they are
> > pretty
> > > > > much
> > > > > >>>>> CI-agnostic. Kubernetes tests will likely be also fixed soon.
> > > > > >>>>>
> > > > > >>>>> Another idea: I thought that in the future we can also run
> only
> > > > > subset
> > > > > >>>> of
> > > > > >>>>> postgres/mysql/sqlite tests on all combinations. I think
> there
> > > are
> > > > > just
> > > > > >>>>> handful of tests that are specific for backend (and we
> already
> > > know
> > > > > >>>> which
> > > > > >>>>> ones they are - they are skipped-if).
> > > > > >>>>>
> > > > > >>>>> J.
> > > > > >>>>>
> > > > > >>>>> Principal Software Engineer
> > > > > >>>>> Phone: +48660796129
> > > > > >>>>>
> > > > > >>>>> czw., 27 cze 2019, 15:12 użytkownik Philippe Gagnon <
> > > > > >>>> philgagnon1@gmail.com
> > > > > >>>>>>
> > > > > >>>>> napisał:
> > > > > >>>>>
> > > > > >>>>>> I think the combinations that you are proposing are sensible
> > for
> > > > > >>>>> pre-merge
> > > > > >>>>>> checks.
> > > > > >>>>>>
> > > > > >>>>>> I am working on a proposal to offload extra combinations to
> > > > another
> > > > > CI
> > > > > >>>>>> provider (Azure DevOps specifically seems like a good
> > > candidate),
> > > > > >>>> either
> > > > > >>>>>> pre or post merge. Ideally I'd like to run more combinations
> > > > > pre-merge
> > > > > >>>>> but
> > > > > >>>>>> there is a trade-off to be conscious of here between
> > development
> > > > > >>>> velocity
> > > > > >>>>>> and quality assurance, which I think this issue highlights
> > quite
> > > > > well.
> > > > > >>>>>>
> > > > > >>>>>> Please let me know your thoughts
> > > > > >>>>>>
> > > > > >>>>>> Philippe
> > > > > >>>>>>
> > > > > >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
> > > > > >>>> Jarek.Potiuk@polidea.com>
> > > > > >>>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Agree that we should be thoughtful about others as well: In
> > the
> > > > > >>>> latest
> > > > > >>>>>> push
> > > > > >>>>>>> (few minutes ago) of the upcoming official CI image i
> > > implemented
> > > > > >>>> the
> > > > > >>>>>>> change we discussed in the Github where we limit the number
> > of
> > > > > >>>>>> combinations
> > > > > >>>>>>> we test:
> > > > > >>>>>>>
> > > > > >>>>>>> You can see it yourself:
> > > > > >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
> > > > > >>>>>>>
> > > > > >>>>>>> Those are the combinations I propose:
> > > > > >>>>>>>
> > > > > >>>>>>> Python: 3.6
> > > > > >>>>>>> BACKEND=mysql ENV=docker
> > > > > >>>>>>>
> > > > > >>>>>>> Python: 3.6
> > > > > >>>>>>> BACKEND=postgres ENV=docker
> > > > > >>>>>>>
> > > > > >>>>>>> Python: 3.5
> > > > > >>>>>>> BACKEND=sqlite ENV=docker
> > > > > >>>>>>>
> > > > > >>>>>>> Python: 3.6
> > > > > >>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
> > > > > >>>>>>>
> > > > > >>>>>>> J,
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
> > > > > >>>>> <fokko@driesprong.frl
> > > > > >>>>>>>
> > > > > >>>>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> We got this message last year:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Hello, Airflow PPMC.
> > > > > >>>>>>>>> While going through the usage statistics for our Travis
> CI
> > > > > >>>>> service, I
> > > > > >>>>>>>>> have noticed that the Airflow project is using an
> > abnormally
> > > > > >>>> large
> > > > > >>>>>>>>> amount of resources, 2600 hours per month or the
> equivalent
> > > of
> > > > > >>>>> having
> > > > > >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this
> > is
> > > > not
> > > > > >>>>>> free,
> > > > > >>>>>>>>> but rather costing us money, I'm contacting you with the
> > > > > >>>> intention
> > > > > >>>>> of
> > > > > >>>>>>>>> figuring out ways to reduce the use of Travis for the
> > > project.
> > > > > >>>>>>>>
> > > > > >>>>>>>>> We would greatly prefer that the project itself comes up
> > > with a
> > > > > >>>>>>> solution
> > > > > >>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn
> > it
> > > > off
> > > > > >>>>> for
> > > > > >>>>>>>>> you, but the usage is at a rather severe level, totaling
> > more
> > > > > >>>> than
> > > > > >>>>>> 21%
> > > > > >>>>>>>>> of the total build time of all projects using Travis, so
> > > > > >>>> something
> > > > > >>>>>>>>> actionable should be decided upon and (preferably)
> > completed
> > > by
> > > > > >>>> the
> > > > > >>>>>> end
> > > > > >>>>>>>>> of May that will reduce the consumption of Travis
> > resources.
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Alternately, if you are unable to lower the pressure on
> > > Travis,
> > > > > >>>> the
> > > > > >>>>>>>>> podling and/or IPMC may ask the board of directors for a
> > > > > >>>> separate
> > > > > >>>>>>> budget
> > > > > >>>>>>>>> for additional build nodes to cope with the added load -
> > I'll
> > > > > >>>> leave
> > > > > >>>>>>> this
> > > > > >>>>>>>>> for the podling and IPMC to decide on.
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Please let us know when you have decided on a plan to
> > remedy
> > > > > >>>> this
> > > > > >>>>>>>> situation.
> > > > > >>>>>>>>
> > > > > >>>>>>>>> With regards,
> > > > > >>>>>>>>> Daniel on behalf of ASF Infrastructure.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I think more and more projects are still migrating to the ASF
> > > > > >>>>>>>> Travis, so I think it is natural that there is more load.
> > > > > >>>>>>>> However, this still leaves the question if we have to run the
> > > > > >>>>>>>> full matrix.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Cheers, Fokko
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> Op do 27 jun. 2019 om 10:56 schreef Jarek Potiuk <
> > > > > >>>>>>> Jarek.Potiuk@polidea.com
> > > > > >>>>>>>>> :
> > > > > >>>>>>>>
> > > > > >>>>>>>>> I think we should really involve infra to increase the
> slot
> > > > > >>>> number
> > > > > >>>>> or
> > > > > >>>>>>>> maybe
> > > > > >>>>>>>>> even somehow allocate slots per project.
> > > > > >>>>>>>>> The problem is that we cannot control what other apache
> > > > projects
> > > > > >>>>> are
> > > > > >>>>>>>> doing,
> > > > > >>>>>>>>> so even if we decrease our runtime, it's the other
> projects
> > > > that
> > > > > >>>>>> might
> > > > > >>>>>>>> hold
> > > > > >>>>>>>>> us in the queue :(
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> J.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
> > > > > >>>>>>> <fokko@driesprong.frl
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>> I've noticed this at other Apache projects as well,
> > > sometimes
> > > > > >>>> it
> > > > > >>>>>>> takes
> > > > > >>>>>>>> up
> > > > > >>>>>>>>>> to 7-8 hours. The only thing we can do, is reduce the
> > > runtime
> > > > > >>>> of
> > > > > >>>>>> the
> > > > > >>>>>>>> jobs
> > > > > >>>>>>>>>> so we take less slots :-)
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Cheers, Fokko
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Op wo 26 jun. 2019 om 21:59 schreef Jarek Potiuk <
> > > > > >>>>>>>>> Jarek.Potiuk@polidea.com
> > > > > >>>>>>>>>>> :
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> Yep. That's what I suggested as the reason in the
> ticket
> > -
> > > I
> > > > > >>>>>> guess
> > > > > >>>>>>>>> INFRA
> > > > > >>>>>>>>>>> are the only people who can do anything about it
> > (increase
> > > > > >>>>>>>> concurrency
> > > > > >>>>>>>>> ?
> > > > > >>>>>>>>>>> pay more for Travis :)? ).
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
> > > > > >>>>>> ash@apache.org>
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> I asked Travis on twitter and they said it was due to
> > the
> > > > > >>>>>> Apache
> > > > > >>>>>>>>> other
> > > > > >>>>>>>>>>>> projects build queues
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> -ash
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
> > > > > >>>>>>>> Jarek.Potiuk@polidea.com
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Hello everyone,
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> For the last few days the Travis builds for
> > > > > >>>> apache/airflow
> > > > > >>>>>>> project
> > > > > >>>>>>>>> are
> > > > > >>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
> > > > > >>>>> situation.
> > > > > >>>>>>> I've
> > > > > >>>>>>>>>>> opened
> > > > > >>>>>>>>>>>>> INFRA ticket for that:
> > > > > >>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> J.
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> --
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Jarek Potiuk
> > > > > >>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal
> Software
> > > > > >>>>> Engineer
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> M: +48 660 796 129 <+48660796129>
> > > > > >>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> --
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Jarek Potiuk
> > > > > >>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > > > > >>>> Engineer
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> M: +48 660 796 129 <+48660796129>
> > > > > >>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> --
> > > > > >>>>>>>
> > > > > >>>>>>> Jarek Potiuk
> > > > > >>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > > Engineer
> > > > > >>>>>>>
> > > > > >>>>>>> M: +48 660 796 129 <+48660796129>
> > > > > >>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>>
> > > > > >>> --
> > > > > >>>
> > > > > >>> Jarek Potiuk
> > > > > >>> Polidea <https://www.polidea.com/> | Principal Software
> Engineer
> > > > > >>>
> > > > > >>> M: +48 660 796 129 <+48660796129>
> > > > > >>> [image: Polidea] <https://www.polidea.com/>
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > > >> --
> > > > > >>
> > > > > >> Jarek Potiuk
> > > > > >> Polidea <https://www.polidea.com/> | Principal Software
> Engineer
> > > > > >>
> > > > > >> M: +48 660 796 129 <+48660796129>
> > > > > >> [image: Polidea] <https://www.polidea.com/>
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129 <+48660796129>
> > > > [image: Polidea] <https://www.polidea.com/>
> > > >
> > >
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>

Re: Travis builds in a queue for hours

Posted by Jarek Potiuk <Ja...@polidea.com>.
Since more than a few people (including myself) are in favour of GitLab CI,
and since Apache Infra is talking to GitLab about it, I will make sure to
check if we can combine the two approaches - workers from Google and a
central, managed GitLab CI interface to manage them (likely managed by the
Infra team). Airflow can easily be a "guinea pig" for GitLab CI / Apache
integration. We also have quite a lot of expertise in managing GitLab in my
company (we use GitLab at Polidea for most of our mobile project CI and all
the cloud builds that we run internally).

I will make an AIP for that soon and involve the right people :).

J.
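
A rough sanity check of the INFRA figure quoted further down in this thread
(2600 build hours per month, "almost 4 machines") may help with the machine
estimate. This is only a minimal sketch: the 30-day month, the assumption that
a machine builds around the clock, and the function name machines_needed are
illustrative assumptions, not numbers or tooling from INFRA.

```python
import math

HOURS_PER_DAY = 24


def machines_needed(build_hours_per_month: float, days_in_month: int = 30) -> float:
    """Return how many always-on machines the monthly build hours equal.

    Assumes each machine runs one build at a time, 24 hours a day
    (an illustrative simplification, not a measured value).
    """
    return build_hours_per_month / (HOURS_PER_DAY * days_in_month)


# 2600 hours/month is the figure from the quoted INFRA email.
equivalent = machines_needed(2600)
print(f"{equivalent:.2f} machines busy 24/7")
print(f"rounded up: {math.ceil(equivalent)} machines")
```

Note that this only describes the average load. Builds tend to cluster during
working hours, so keeping queue times short would likely need some headroom
on top of the average.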

On Thu, Jul 11, 2019 at 8:01 AM Driesprong, Fokko <fo...@driesprong.frl>
wrote:

> Regarding the numbers, I believe that INFRA has an overview of the usage
> per project. I think they are happy to share these numbers with you. Also,
> it seems like there is also a queue in Jenkins: https://status.apache.org/
>
> Talking about Jenkins. I'm not a big fan of it. For example, Spark uses it,
> and it is rather difficult to set up the environment yourself, in contrast
> with Travis. I also have good experiences with Gitlab since that is the
> only Docker native CI in my personal opinion.
>
> > But we can try both of course. And even switch later.
> There is nothing as permanent as a temporary solution :-) However, I'm not
> against trying. I've checked the beam project, and the integration with
> Github looks good.
>
> Thanks again Jarek and Aizhamal for all the work and effort.
>
> Cheers, Fokko
>
>
>
>
> On Wed, 10 Jul 2019 at 23:11, Aizhamal Nurmamat kyzy <
> aizhamal@apache.org> wrote:
>
> > Hi all,
> >
> > I am still working on trying to get approvals for this, so this is not
> > yet a done deal. I'll keep y'all updated.
> >
> > As for the CI solution to use, we have no particular inclination, as long
> > as the community supports it and it is consistent with any Apache
> > guidelines for CI from their projects. Jenkins and GitLab CI both sound
> > sensible.
> >
> > The email from INFRA says that Airflow runs 2600 hours of tests per
> > month, or the equivalent of about 4 machines. Can the community help with
> > a reasonable estimate for this, so I can use it as a reference for the
> > request?
> >
> > Thanks!
> >
> > On Wed, Jul 10, 2019 at 2:43 PM Jarek Potiuk <Ja...@polidea.com>
> > wrote:
> >
> > > Yeah. GitLab CI is definitely what I would prefer as well from the
> > > "modernity" point of view (and one of my very close friends is a GitLab
> > > CI maintainer and actually the person who introduced CI to the GitLab
> > > offering). I also actually already catalysed a discussion between GitLab
> > > and Apache infrastructure to introduce GitLab CI on the "Apache" level
> > > (they are talking about it now I believe).
> > >
> > > But from the Google <> Apache/procedural point of view it might simply
> > > be easier to follow in the footsteps of Apache Beam. It might be just a
> > > few clicks for Apache Infrastructure to add more machines and connect
> > > them to the Apache Jenkins for our project. If we have a path cleared by
> > > others, following it might simply be much faster.
> > >
> > > But we can try both of course. And even switch later. The Docker CI
> > > approach I am about to merge is designed to be super-easy to switch
> > > between CI systems. Virtually ALL the build logic is in scripts in
> > > shared Docker images. There is basically one file per CI system to add
> > > and we can support Travis/Jenkins/CloudBuild/CircleCI - whatever we
> > > imagine. We can even support all of them at the same time :)
> > >
> > > J.
> > >
> > > On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bd...@gmail.com>
> > wrote:
> > >
> > > > If you need an alternative why not use a couple of gitlab-ci runners?
> > > > Much easier to maintain, lightweight, and much closer to what we use
> > > > now.
> > > >
> > > > B.
> > > >
> > > > Sent from my iPad
> > > >
> > > > > On 10 Jul 2019 at 23:27, Bolke de Bruin <bd...@gmail.com> wrote:
> > > > >
> > > > > Awesome! But I hope you are not serious about using Jenkins, right?
> > > > > If I need to start a Holy War it would be against Jenkins.
> > > > >
> > > > > B.
> > > > >
> > > > > Sent from my iPad
> > > > >
> > > > >> On 10 Jul 2019 at 22:55, Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > > >> wrote:
> > > > >>
> > > > >> Hello Everyone,
> > > > >>
> > > > >> I have some really good news. I just had a call with Google OSS
> team
> > > > (Gris,
> > > > >> Aizhamal) and they are willing to donate VMs on Google Cloud
> > Platform
> > > to
> > > > >> run CI for Airflow. In order to simplify the setup (and make sure
> it
> > > is
> > > > ok
> > > > >> according to Apache regulations) we think we should go exactly the
> > > same
> > > > >> route as Apache Beam project (Google donated 16x 16CPU machines
> for
> > > > them).
> > > > >> The route of Apache Beam is to use the machines as workers for
> > Apache
> > > > >> Jenkins (https://builds.apache.org/). Apache Jenkins is one of
> the
> > > > >> encouraged CI solutions by Apache and if we can have workers
> > connected
> > > > to
> > > > >> the existing Jenkins master of Apache, it means that the
> maintenance
> > > > >> overhead will be pretty minimal. And we can follow Apache Beam
> setup
> > > so
> > > > I
> > > > >> do not expect any legal problems.
> > > > >>
> > > > >> I also work very closely with the team that uses Apache Beam
> Jenkins
> > > > >> heavily so I have access to all the necessary experts to help with
> > the
> > > > >> setup (and I am happy to help with that).
> > > > >>
> > > > >> I really hope everyone in the community will be really happy to go
> > in
> > > > that
> > > > >> direction - it's. Please let me know if you have any concerns !
> > > > >>
> > > > >> We do not need as many machines as Beam for sure (Beam uses the
> > > > machines to
> > > > >> process a lot of data for tests including some load testing) but
> we
> > > > need to
> > > > >> estimate the number/types of machines that we are going to need.
> > > > >> Fokko, Ash, others - do you have some recent numbers for the
> current
> > > > usage
> > > > >> or should I open an Infrastructure ticket for it?
> > > > >>
> > > > >> J
> > > > >>
> > > > >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <
> > > Jarek.Potiuk@polidea.com>
> > > > >> wrote:
> > > > >>
> > > > >>> Thanks Aizhamal! I spoke already to Gris and she confirmed that
> as
> > > well
> > > > >>> and the 8th of July date is ok for us as we will have to evaluate
> > and
> > > > >>> prepare as well. Have a nice trip.
> > > > >>>
> > > > >>> J.
> > > > >>>
> > > > >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> > > > >>> <ai...@google.com.invalid> wrote:
> > > > >>>
> > > > >>>> Hi all,
> > > > >>>>
> > > > >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <
> > > Jarek.Potiuk@polidea.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Yeah. I also have a working version of Cloud build
> configuration
> > > and
> > > > we
> > > > >>>> can
> > > > >>>>> run the tests on cloud build if we can get some credits from
> > > Google.
> > > > >>>>
> > > > >>>>
> > > > >>>> I can look into getting a small amount of credits approved for
> > this,
> > > > to
> > > > >>>> see
> > > > >>>> if it’s useful to offload some tests to Cloud Build, or to
> > provision
> > > > some
> > > > >>>> VMs to run on Apache Infra.
> > > > >>>>
> > > > >>>> I am traveling at the moment, but I’ll be back in the office on
> > July
> > > > 8,
> > > > >>>> and
> > > > >>>> I’ll try to get this done.
> > > > >>>>
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> Aizhamal
> > > > >>>>
> > > > >>>> And
> > > > >>>>> the changes from the upcoming CI image will make it much easier
> > to
> > > > run
> > > > >>>>> tests on any CI provider. Except Kubernetes tests they are
> pretty
> > > > much
> > > > >>>>> CI-agnostic. Kubernetes tests will likely be also fixed soon.
> > > > >>>>>
> > > > >>>>> Another idea: I thought that in the future we can also run only
> > > > subset
> > > > >>>> of
> > > > >>>>> postgres/mysql/sqlite tests on all combinations. I think there
> > are
> > > > just
> > > > >>>>> handful of tests that are specific for backend (and we already
> > know
> > > > >>>> which
> > > > >>>>> ones they are - they are skipped-if).
> > > > >>>>>
> > > > >>>>> J.
> > > > >>>>>
> > > > >>>>> Principal Software Engineer
> > > > >>>>> Phone: +48660796129
> > > > >>>>>
> > > > >>>>> Thu, 27 Jun 2019, 15:12 Philippe Gagnon <philgagnon1@gmail.com>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> I think the combinations that you are proposing are sensible
> for
> > > > >>>>> pre-merge
> > > > >>>>>> checks.
> > > > >>>>>>
> > > > >>>>>> I am working on a proposal to offload extra combinations to
> > > another
> > > > CI
> > > > >>>>>> provider (Azure DevOps specifically seems like a good
> > candidate),
> > > > >>>> either
> > > > >>>>>> pre or post merge. Ideally I'd like to run more combinations
> > > > pre-merge
> > > > >>>>> but
> > > > >>>>>> there is a trade-off to be conscious of here between
> development
> > > > >>>> velocity
> > > > >>>>>> and quality assurance, which I think this issue highlights
> quite
> > > > well.
> > > > >>>>>>
> > > > >>>>>> Please let me know your thoughts
> > > > >>>>>>
> > > > >>>>>> Philippe
> > > > >>>>>>
> > > > >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
> > > > >>>> Jarek.Potiuk@polidea.com>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Agree that we should be thoughtful about others as well: In
> the
> > > > >>>> latest
> > > > >>>>>> push
> > > > >>>>>>> (few minutes ago) of the upcoming official CI image i
> > implemented
> > > > >>>> the
> > > > >>>>>>> change we discussed in the Github where we limit the number
> of
> > > > >>>>>> combinations
> > > > >>>>>>> we test:
> > > > >>>>>>>
> > > > >>>>>>> You can see it yourself:
> > > > >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
> > > > >>>>>>>
> > > > >>>>>>> Those are the combinations I propose:
> > > > >>>>>>>
> > > > >>>>>>> Python: 3.6
> > > > >>>>>>> BACKEND=mysql ENV=docker
> > > > >>>>>>>
> > > > >>>>>>> Python: 3.6
> > > > >>>>>>> BACKEND=postgres ENV=docker
> > > > >>>>>>>
> > > > >>>>>>> Python: 3.5
> > > > >>>>>>> BACKEND=sqlite ENV=docker
> > > > >>>>>>>
> > > > >>>>>>> Python: 3.6
> > > > >>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
> > > > >>>>>>>
> > > > >>>>>>> J,
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
> > > > >>>>> <fokko@driesprong.frl
> > > > >>>>>>>
> > > > >>>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> We got this message last year:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hello, Airflow PPMC.
> > > > >>>>>>>>> While going through the usage statistics for our Travis CI
> > > > >>>>> service, I
> > > > >>>>>>>>> have noticed that the Airflow project is using an
> abnormally
> > > > >>>> large
> > > > >>>>>>>>> amount of resources, 2600 hours per month or the equivalent
> > of
> > > > >>>>> having
> > > > >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this
> is
> > > not
> > > > >>>>>> free,
> > > > >>>>>>>>> but rather costing us money, I'm contacting you with the
> > > > >>>> intention
> > > > >>>>> of
> > > > >>>>>>>>> figuring out ways to reduce the use of Travis for the
> > project.
> > > > >>>>>>>>
> > > > >>>>>>>>> We would greatly prefer that the project itself comes up
> > with a
> > > > >>>>>>> solution
> > > > >>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn
> it
> > > off
> > > > >>>>> for
> > > > >>>>>>>>> you, but the usage is at a rather severe level, totaling
> more
> > > > >>>> than
> > > > >>>>>> 21%
> > > > >>>>>>>>> of the total build time of all projects using Travis, so
> > > > >>>> something
> > > > >>>>>>>>> actionable should be decided upon and (preferably)
> completed
> > by
> > > > >>>> the
> > > > >>>>>> end
> > > > >>>>>>>>> of May that will reduce the consumption of Travis
> resources.
> > > > >>>>>>>>
> > > > >>>>>>>>> Alternately, if you are unable to lower the pressure on
> > Travis,
> > > > >>>> the
> > > > >>>>>>>>> podling and/or IPMC may ask the board of directors for a
> > > > >>>> separate
> > > > >>>>>>> budget
> > > > >>>>>>>>> for additional build nodes to cope with the added load -
> I'll
> > > > >>>> leave
> > > > >>>>>>> this
> > > > >>>>>>>>> for the podling and IPMC to decide on.
> > > > >>>>>>>>
> > > > >>>>>>>>> Please let us know when you have decided on a plan to
> remedy
> > > > >>>> this
> > > > >>>>>>>> situation.
> > > > >>>>>>>>
> > > > >>>>>>>>> With regards,
> > > > >>>>>>>>> Daniel on behalf of ASF Infrastructure.
> > > > >>>>>>>>
> > > > >>>>>>>> I think more and more projects are still migrating to the ASF
> > > > >>>>>>>> Travis, so I think it is natural that there is more load.
> > > > >>>>>>>> However, this still leaves the question if we have to run the
> > > > >>>>>>>> full matrix.
> > > > >>>>>>>>
> > > > >>>>>>>> Cheers, Fokko
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Thu, 27 Jun 2019 at 10:56, Jarek Potiuk <
> > > > >>>>>>>> Jarek.Potiuk@polidea.com> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> I think we should really involve infra to increase the slot
> > > > >>>> number
> > > > >>>>> or
> > > > >>>>>>>> maybe
> > > > >>>>>>>>> even somehow allocate slots per project.
> > > > >>>>>>>>> The problem is that we cannot control what other apache
> > > projects
> > > > >>>>> are
> > > > >>>>>>>> doing,
> > > > >>>>>>>>> so even if we decrease our runtime, it's the other projects
> > > that
> > > > >>>>>> might
> > > > >>>>>>>> hold
> > > > >>>>>>>>> us in the queue :(
> > > > >>>>>>>>>
> > > > >>>>>>>>> J.
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
> > > > >>>>>>> <fokko@driesprong.frl
> > > > >>>>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> I've noticed this at other Apache projects as well,
> > sometimes
> > > > >>>> it
> > > > >>>>>>> takes
> > > > >>>>>>>> up
> > > > >>>>>>>>>> to 7-8 hours. The only thing we can do, is reduce the
> > runtime
> > > > >>>> of
> > > > >>>>>> the
> > > > >>>>>>>> jobs
> > > > >>>>>>>>>> so we take less slots :-)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Cheers, Fokko
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, 26 Jun 2019 at 21:59, Jarek Potiuk <
> > > > >>>>>>>>>> Jarek.Potiuk@polidea.com> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Yep. That's what I suggested as the reason in the ticket
> -
> > I
> > > > >>>>>> guess
> > > > >>>>>>>>> INFRA
> > > > >>>>>>>>>>> are the only people who can do anything about it
> (increase
> > > > >>>>>>>> concurrency
> > > > >>>>>>>>> ?
> > > > >>>>>>>>>>> pay more for Travis :)? ).
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
> > > > >>>>>> ash@apache.org>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> I asked Travis on twitter and they said it was due to
> the
> > > > >>>>>> Apache
> > > > >>>>>>>>> other
> > > > >>>>>>>>>>>> projects build queues
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> -ash
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
> > > > >>>>>>>> Jarek.Potiuk@polidea.com
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Hello everyone,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> For the last few days the Travis builds for
> > > > >>>> apache/airflow
> > > > >>>>>>> project
> > > > >>>>>>>>> are
> > > > >>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
> > > > >>>>> situation.
> > > > >>>>>>> I've
> > > > >>>>>>>>>>> opened
> > > > >>>>>>>>>>>>> INFRA ticket for that:
> > > > >>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> J.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> --
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Jarek Potiuk
> > > > >>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > > > >>>>> Engineer
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> M: +48 660 796 129 <+48660796129>
> > > > >>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>>
> > > > >>>>>>>>> Jarek Potiuk
> > > > >>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > > > >>>> Engineer
> > > > >>>>>>>>>
> > > > >>>>>>>>> M: +48 660 796 129 <+48660796129>
> > > > >>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> --
> > > > >>>>>>>
> > > > >>>>>>> Jarek Potiuk
> > > > >>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > Engineer
> > > > >>>>>>>
> > > > >>>>>>> M: +48 660 796 129 <+48660796129>
> > > > >>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>>
> > > > >>> --
> > > > >>>
> > > > >>> Jarek Potiuk
> > > > >>> Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > > >>>
> > > > >>> M: +48 660 796 129 <+48660796129>
> > > > >>> [image: Polidea] <https://www.polidea.com/>
> > > > >>>
> > > > >>>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Jarek Potiuk
> > > > >> Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > > >>
> > > > >> M: +48 660 796 129 <+48660796129>
> > > > >> [image: Polidea] <https://www.polidea.com/>
> > > >
> > >
> > >
> > > --
> > >
> > > Jarek Potiuk
> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >
> > > M: +48 660 796 129 <+48660796129>
> > > [image: Polidea] <https://www.polidea.com/>
> > >
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Travis builds in a queue for hours

Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
Regarding the numbers, I believe that INFRA has an overview of the usage
per project, and I think they would be happy to share these numbers with
you. It also looks like there is a queue in Jenkins: https://status.apache.org/

Speaking of Jenkins: I'm not a big fan of it. For example, Spark uses it,
and it is rather difficult to set up the environment yourself, in contrast
with Travis. I also have good experience with GitLab, since in my personal
opinion it is the only Docker-native CI.

> But we can try both of course. And even switch later.
There is nothing as permanent as a temporary solution :-) However, I'm not
against trying. I've checked the Beam project, and the integration with
GitHub looks good.

Thanks again Jarek and Aizhamal for all the work and effort.

Cheers, Fokko




On Wed, 10 Jul 2019 at 23:11, Aizhamal Nurmamat kyzy <aizhamal@apache.org>
wrote:

> Hi all,
>
> I am still working on trying to get approvals for this, so this is not yet
> a done deal. I'll keep y'all updated.
>
> As for the CI solution to use, we have no particular inclination. As long
> as the community supports it, and it is consistent with any Apache
> guidelines for CI from their projects. Jenkins and GitLab CI both sound
> sensible.
>
> The email from INFRA says that Airflow runs 2600 hours of tests per month,
> or the equivalent of about 4 machines. Can the community help with a
> reasonable estimate for this, so I can use it as a reference for the
> request?
>
> Thanks!
>
> On Wed, Jul 10, 2019 at 2:43 PM Jarek Potiuk <Ja...@polidea.com>
> wrote:
>
> > Yeah. GitLab CI is definitely what I would prefer as well from the
> > "modernity" point of view (one of my very close friends is a GitLab CI
> > maintainer and actually the person who introduced CI to the GitLab
> > offering). I have also already catalysed a discussion between GitLab and
> > Apache Infrastructure about introducing GitLab CI at the "Apache" level
> > (they are talking about it now, I believe).
> >
> > But from the Google <> Apache procedural point of view it might simply be
> > easier to follow in the footsteps of Apache Beam. It might be just a few
> > clicks for Apache Infrastructure to add more machines and connect them to
> > the Apache Jenkins for our project. If we have a path cleared by others,
> > following it might simply be much faster.
> >
> > But we can try both of course. And even switch later. The Docker CI
> > approach I am about to merge is designed to make it super-easy to switch
> > between CI systems. Virtually ALL the build logic is in scripts in shared
> > Docker images. There is basically one file per CI system to add, and we
> > can support Travis/Jenkins/CloudBuild/CircleCI - whatever we imagine. We
> > can even support all of them at the same time :)
> >
> > J.
> >
> > On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bd...@gmail.com>
> wrote:
> >
> > > > If you need an alternative why not use a couple of gitlab-ci runners?
> > > > Much easier to maintain, light weight, and much closer to what we use
> > > > now.
> > > >
> > > > B.
> > > >
> > > > Sent from my iPad
> > > >
> > > > > On 10 Jul 2019, at 23:27, Bolke de Bruin <bd...@gmail.com> wrote:
> > > > >
> > > > > Awesome! But I hope you are not serious about using Jenkins, right?
> > > > > If I need to start a Holy War it would be against Jenkins.
> > > > >
> > > > > B.
> > > > >
> > > > > Sent from my iPad
> > > > >
> > > > >> On 10 Jul 2019, at 22:55, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> > > >>
> > > >> Hello Everyone,
> > > >>
> > > >> I have some really good news. I just had a call with Google OSS team
> > > (Gris,
> > > >> Aizhamal) and they are willing to donate VMs on Google Cloud
> Platform
> > to
> > > >> run CI for Airflow. In order to simplify the setup (and make sure it
> > is
> > > ok
> > > >> according to Apache regulations) we think we should go exactly the
> > same
> > > >> route as Apache Beam project (Google donated 16x 16CPU machines for
> > > them).
> > > >> The route of Apache Beam is to use the machines as workers for
> Apache
> > > >> Jenkins (https://builds.apache.org/). Apache Jenkins is one of the
> > > >> encouraged CI solutions by Apache and if we can have workers
> connected
> > > to
> > > >> the existing Jenkins master of Apache, it means that the maintenance
> > > >> overhead will be pretty minimal. And we can follow Apache Beam setup
> > so
> > > I
> > > >> do not expect any legal problems.
> > > >>
> > > >> I also work very closely with the team that uses Apache Beam Jenkins
> > > >> heavily so I have access to all the necessary experts to help with
> the
> > > >> setup (and I am happy to help with that).
> > > >>
> > > >> I really hope everyone in the community will be really happy to go
> in
> > > that
> > > >> direction - it's. Please let me know if you have any concerns !
> > > >>
> > > >> We do not need as many machines as Beam for sure (Beam uses the
> > > machines to
> > > >> process a lot of data for tests including some load testing) but we
> > > need to
> > > >> estimate the number/types of machines that we are going to need.
> > > >> Fokko, Ash, others - do you have some recent numbers for the current
> > > usage
> > > >> or should I open an Infrastructure ticket for it?
> > > >>
> > > >> J
> > > >>
> > > >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <
> > Jarek.Potiuk@polidea.com>
> > > >> wrote:
> > > >>
> > > >>> Thanks Aizhamal! I spoke already to Gris and she confirmed that as
> > well
> > > >>> and the 8th of July date is ok for us as we will have to evaluate
> and
> > > >>> prepare as well. Have a nice trip.
> > > >>>
> > > >>> J.
> > > >>>
> > > >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> > > >>> <ai...@google.com.invalid> wrote:
> > > >>>
> > > >>>> Hi all,
> > > >>>>
> > > >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <
> > Jarek.Potiuk@polidea.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Yeah. I also have a working version of Cloud build configuration
> > and
> > > we
> > > >>>> can
> > > >>>>> run the tests on cloud build if we can get some credits from
> > Google.
> > > >>>>
> > > >>>>
> > > >>>> I can look into getting a small amount of credits approved for
> this,
> > > to
> > > >>>> see
> > > >>>> if it’s useful to offload some tests to Cloud Build, or to
> provision
> > > some
> > > >>>> VMs to run on Apache Infra.
> > > >>>>
> > > >>>> I am traveling at the moment, but I’ll be back in the office on
> July
> > > 8,
> > > >>>> and
> > > >>>> I’ll try to get this done.
> > > >>>>
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Aizhamal
> > > >>>>
> > > >>>> And
> > > >>>>> the changes from the upcoming CI image will make it much easier
> to
> > > run
> > > >>>>> tests on any CI provider. Except Kubernetes tests they are pretty
> > > much
> > > >>>>> CI-agnostic. Kubernetes tests will likely be also fixed soon.
> > > >>>>>
> > > >>>>> Another idea: I thought that in the future we can also run only
> > > subset
> > > >>>> of
> > > >>>>> postgres/mysql/sqlite tests on all combinations. I think there
> are
> > > just
> > > >>>>> handful of tests that are specific for backend (and we already
> know
> > > >>>> which
> > > >>>>> ones they are - they are skipped-if).
> > > >>>>>
> > > >>>>> J.
> > > >>>>>
> > > >>>>> Principal Software Engineer
> > > >>>>> Phone: +48660796129
> > > >>>>>
> > > > >>>>> On Thu, 27 Jun 2019 at 15:12, Philippe Gagnon <philgagnon1@gmail.com> wrote:
> > > >>>>>
> > > >>>>>> I think the combinations that you are proposing are sensible for
> > > >>>>> pre-merge
> > > >>>>>> checks.
> > > >>>>>>
> > > >>>>>> I am working on a proposal to offload extra combinations to
> > another
> > > CI
> > > >>>>>> provider (Azure DevOps specifically seems like a good
> candidate),
> > > >>>> either
> > > >>>>>> pre or post merge. Ideally I'd like to run more combinations
> > > pre-merge
> > > >>>>> but
> > > >>>>>> there is a trade-off to be conscious of here between development
> > > >>>> velocity
> > > >>>>>> and quality assurance, which I think this issue highlights quite
> > > well.
> > > >>>>>>
> > > >>>>>> Please let me know your thoughts
> > > >>>>>>
> > > >>>>>> Philippe
> > > >>>>>>
> > > >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
> > > >>>> Jarek.Potiuk@polidea.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Agree that we should be thoughtful about others as well: In the
> > > >>>> latest
> > > >>>>>> push
> > > >>>>>>> (few minutes ago) of the upcoming official CI image i
> implemented
> > > >>>> the
> > > >>>>>>> change we discussed in the Github where we limit the number of
> > > >>>>>> combinations
> > > >>>>>>> we test:
> > > >>>>>>>
> > > >>>>>>> You can see it yourself:
> > > >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
> > > >>>>>>>
> > > >>>>>>> Those are the combinations I propose:
> > > >>>>>>>
> > > >>>>>>> Python: 3.6
> > > >>>>>>> BACKEND=mysql ENV=docker
> > > >>>>>>>
> > > >>>>>>> Python: 3.6
> > > >>>>>>> BACKEND=postgres ENV=docker
> > > >>>>>>>
> > > >>>>>>> Python: 3.5
> > > >>>>>>> BACKEND=sqlite ENV=docker
> > > >>>>>>>
> > > >>>>>>> Python: 3.6
> > > >>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
> > > >>>>>>>
> > > >>>>>>> J,
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
> > > >>>>> <fokko@driesprong.frl
> > > >>>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> We got this message last year:
> > > >>>>>>>>
> > > >>>>>>>>> Hello, Airflow PPMC.
> > > >>>>>>>>> While going through the usage statistics for our Travis CI
> > > >>>>> service, I
> > > >>>>>>>>> have noticed that the Airflow project is using an abnormally
> > > >>>> large
> > > >>>>>>>>> amount of resources, 2600 hours per month or the equivalent
> of
> > > >>>>> having
> > > >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this is
> > not
> > > >>>>>> free,
> > > >>>>>>>>> but rather costing us money, I'm contacting you with the
> > > >>>> intention
> > > >>>>> of
> > > >>>>>>>>> figuring out ways to reduce the use of Travis for the
> project.
> > > >>>>>>>>
> > > >>>>>>>>> We would greatly prefer that the project itself comes up
> with a
> > > >>>>>>> solution
> > > >>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn it
> > off
> > > >>>>> for
> > > >>>>>>>>> you, but the usage is at a rather severe level, totaling more
> > > >>>> than
> > > >>>>>> 21%
> > > >>>>>>>>> of the total build time of all projects using Travis, so
> > > >>>> something
> > > >>>>>>>>> actionable should be decided upon and (preferably) completed
> by
> > > >>>> the
> > > >>>>>> end
> > > >>>>>>>>> of May that will reduce the consumption of Travis resources.
> > > >>>>>>>>
> > > >>>>>>>>> Alternately, if you are unable to lower the pressure on
> Travis,
> > > >>>> the
> > > >>>>>>>>> podling and/or IPMC may ask the board of directors for a
> > > >>>> separate
> > > >>>>>>> budget
> > > >>>>>>>>> for additional build nodes to cope with the added load - I'll
> > > >>>> leave
> > > >>>>>>> this
> > > >>>>>>>>> for the podling and IPMC to decide on.
> > > >>>>>>>>
> > > >>>>>>>>> Please let us know when you have decided on a plan to remedy
> > > >>>> this
> > > >>>>>>>> situation.
> > > >>>>>>>>
> > > >>>>>>>>> With regards,
> > > >>>>>>>>> Daniel on behalf of ASF Infrastructure.
> > > >>>>>>>>
> > > >>>>>>>> I think more and more projects are still migrating to the ASF
> > > >>>> Travis,
> > > >>>>>> so
> > > >>>>>>> I
> > > >>>>>>>> think natural that there is more load. However, this still
> > leaves
> > > >>>> the
> > > >>>>>>>> question if we have to run the full matrix.
> > > >>>>>>>>
> > > >>>>>>>> Cheers, Fokko
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > > >>>>>>>> On Thu, 27 Jun 2019 at 10:56, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> I think we should really involve infra to increase the slot
> > > >>>> number
> > > >>>>> or
> > > >>>>>>>> maybe
> > > >>>>>>>>> even somehow allocate slots per project.
> > > >>>>>>>>> The problem is that we cannot control what other apache
> > projects
> > > >>>>> are
> > > >>>>>>>> doing,
> > > >>>>>>>>> so even if we decrease our runtime, it's the other projects
> > that
> > > >>>>>> might
> > > >>>>>>>> hold
> > > >>>>>>>>> us in the queue :(
> > > >>>>>>>>>
> > > >>>>>>>>> J.
> > > >>>>>>>>>
> > > >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
> > > >>>>>>> <fokko@driesprong.frl
> > > >>>>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> I've noticed this at other Apache projects as well,
> sometimes
> > > >>>> it
> > > >>>>>>> takes
> > > >>>>>>>> up
> > > >>>>>>>>>> to 7-8 hours. The only thing we can do, is reduce the
> runtime
> > > >>>> of
> > > >>>>>> the
> > > >>>>>>>> jobs
> > > >>>>>>>>>> so we take less slots :-)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Cheers, Fokko
> > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, 26 Jun 2019 at 21:59, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Yep. That's what I suggested as the reason in the ticket -
> I
> > > >>>>>> guess
> > > >>>>>>>>> INFRA
> > > >>>>>>>>>>> are the only people who can do anything about it (increase
> > > >>>>>>>> concurrency
> > > >>>>>>>>> ?
> > > >>>>>>>>>>> pay more for Travis :)? ).
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
> > > >>>>>> ash@apache.org>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> I asked Travis on twitter and they said it was due to the
> > > >>>>>> Apache
> > > >>>>>>>>> other
> > > >>>>>>>>>>>> projects build queues
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> -ash
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
> > > >>>>>>>> Jarek.Potiuk@polidea.com
> > > >>>>>>>>>>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hello everyone,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> For the last few days the Travis builds for
> > > >>>> apache/airflow
> > > >>>>>>> project
> > > >>>>>>>>> are
> > > >>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
> > > >>>>> situation.
> > > >>>>>>> I've
> > > >>>>>>>>>>> opened
> > > >>>>>>>>>>>>> INFRA ticket for that:
> > > >>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> J.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> --
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Jarek Potiuk
> > > >>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > > >>>>> Engineer
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> M: +48 660 796 129 <+48660796129>
> > > >>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>>
> > > >>>>>>>>> Jarek Potiuk
> > > >>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> > > >>>> Engineer
> > > >>>>>>>>>
> > > >>>>>>>>> M: +48 660 796 129 <+48660796129>
> > > >>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> --
> > > >>>>>>>
> > > >>>>>>> Jarek Potiuk
> > > >>>>>>> Polidea <https://www.polidea.com/> | Principal Software
> Engineer
> > > >>>>>>>
> > > >>>>>>> M: +48 660 796 129 <+48660796129>
> > > >>>>>>> [image: Polidea] <https://www.polidea.com/>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>>
> > > >>> Jarek Potiuk
> > > >>> Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >>>
> > > >>> M: +48 660 796 129 <+48660796129>
> > > >>> [image: Polidea] <https://www.polidea.com/>
> > > >>>
> > > >>>
> > > >>
> > > >> --
> > > >>
> > > >> Jarek Potiuk
> > > >> Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >>
> > > >> M: +48 660 796 129 <+48660796129>
> > > >> [image: Polidea] <https://www.polidea.com/>
> > >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> > [image: Polidea] <https://www.polidea.com/>
> >
>

Re: Travis builds in a queue for hours

Posted by Aizhamal Nurmamat kyzy <ai...@apache.org>.
Hi all,

I am still working on trying to get approvals for this, so this is not yet
a done deal. I'll keep y'all updated.

As for which CI solution to use, we have no particular preference, as long
as the community supports it and it is consistent with any Apache
guidelines for CI in their projects. Jenkins and GitLab CI both sound
sensible.

The email from INFRA says that Airflow runs 2600 hours of tests per month,
or the equivalent of about 4 machines. Can the community help with a
reasonable estimate for this, so I can use it as a reference for the
request?
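
As a rough sanity check of the INFRA figure, here is a minimal sketch in
Python. It assumes a 30-day month and that one machine provides one build
slot running around the clock; both are assumptions for illustration, not
something INFRA stated, and the function name is hypothetical.

```python
HOURS_PER_MONTH = 2600  # build hours per month quoted in the INFRA email
DAYS_PER_MONTH = 30     # assumption: average month length


def machine_equivalents(build_hours, days=DAYS_PER_MONTH):
    """Return how many machines running 24/7 would absorb build_hours in one month."""
    return build_hours / (days * 24)


print(round(machine_equivalents(HOURS_PER_MONTH), 2))  # 3.61
```

That is consistent with the "almost 4 machines" wording; a real estimate
would also need headroom for peak hours, which this average does not capture.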

Thanks!


Re: Travis builds in a queue for hours

Posted by Jarek Potiuk <Ja...@polidea.com>.
Yeah. GitLab CI is definitely what I would prefer as well from the
"modernity" point of view (and one of my very close friends is a GitLab CI
maintainer and actually the person who introduced CI to the GitLab
offering). I have also already catalysed a discussion between GitLab and
Apache Infrastructure about introducing GitLab CI at the "Apache" level
(they are talking about it now, I believe).

But from the Google <> Apache/procedural point of view it might simply be
easier to follow in the footsteps of Apache Beam. It might take Apache
Infrastructure just a few clicks to add more machines and connect them to
the Apache Jenkins for our project. If we have a path cleared by others,
following it might simply be much faster.

But we can try both, of course, and even switch later. The Docker CI
approach I am about to merge is designed to make switching between CI
systems super easy. Virtually ALL the build logic is in scripts in shared
Docker images. There is basically one file per CI system to add, and we can
support Travis/Jenkins/Cloud Build/CircleCI - whatever we can imagine. We
can even support all of them at the same time :)
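
To illustrate the "one file per CI system" idea, here is a minimal sketch
(not the actual scripts from the upcoming change). It assumes a shared image
and a single test entry point; the image name, the script path and the
PYTHON_VERSION variable name are hypothetical, while BACKEND, ENV and
KUBERNETES_VERSION are the variables from the combinations proposed in this
thread.

```python
import shlex

# Hypothetical names for illustration; neither appears in this thread.
CI_IMAGE = "example/airflow-ci:latest"
SHARED_TEST_SCRIPT = "/opt/ci/run-tests.sh"

# The four combinations proposed earlier in this thread.
# PYTHON_VERSION is an assumed variable name for the "Python: x.y" entries.
PROPOSED_MATRIX = [
    {"PYTHON_VERSION": "3.6", "BACKEND": "mysql", "ENV": "docker"},
    {"PYTHON_VERSION": "3.6", "BACKEND": "postgres", "ENV": "docker"},
    {"PYTHON_VERSION": "3.5", "BACKEND": "sqlite", "ENV": "docker"},
    {"PYTHON_VERSION": "3.6", "BACKEND": "postgres", "ENV": "kubernetes",
     "KUBERNETES_VERSION": "v1.13.0"},
]


def docker_test_command(job):
    """Return the `docker run` argument list for one matrix entry."""
    command = ["docker", "run", "--rm"]
    # Pass every matrix variable into the container as an environment variable.
    for name, value in sorted(job.items()):
        command += ["-e", f"{name}={value}"]
    # All build and test logic lives inside the image, behind one script.
    return command + [CI_IMAGE, SHARED_TEST_SCRIPT]


if __name__ == "__main__":
    # A per-CI config file would only need to run one of these lines per job.
    for job in PROPOSED_MATRIX:
        print(shlex.join(docker_test_command(job)))
```

With this shape, the Travis, Jenkins or Cloud Build configuration only has to
declare the matrix and run the printed command, so switching CI systems does
not touch the test logic itself.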

J.

On Wed, Jul 10, 2019 at 11:32 PM Bolke de Bruin <bd...@gmail.com> wrote:

> If you need an alternative, why not use a couple of gitlab-ci runners? Much
> easier to maintain, lightweight, and much closer to what we use now.
>
> B.
>
> Sent from my iPad
>
> > On 10 Jul 2019, at 23:27, Bolke de Bruin <bd...@gmail.com> wrote:
> >
> > Awesome! But I hope you are not serious about using Jenkins, right? If I
> > need to start a Holy War, it would be against Jenkins.
> >
> > B.
> >
> > Sent from my iPad
> >
> >> On 10 Jul 2019, at 22:55, Jarek Potiuk <Ja...@polidea.com> wrote:
> >>
> >> Hello Everyone,
> >>
> >> I have some really good news. I just had a call with Google OSS team
> (Gris,
> >> Aizhamal) and they are willing to donate VMs on Google Cloud Platform to
> >> run CI for Airflow. In order to simplify the setup (and make sure it is
> ok
> >> according to Apache regulations) we think we should go exactly the same
> >> route as Apache Beam project (Google donated 16x 16CPU machines for
> them).
> >> The route of Apache Beam is to use the machines as workers for Apache
> >> Jenkins (https://builds.apache.org/). Apache Jenkins is one of the
> >> encouraged CI solutions by Apache and if we can have workers connected
> to
> >> the existing Jenkins master of Apache, it means that the maintenance
> >> overhead will be pretty minimal. And we can follow Apache Beam setup so
> I
> >> do not expect any legal problems.
> >>
> >> I also work very closely with the team that uses Apache Beam Jenkins
> >> heavily so I have access to all the necessary experts to help with the
> >> setup (and I am happy to help with that).
> >>
> >> I really hope everyone in the community will be really happy to go in
> that
> >> direction - it's. Please let me know if you have any concerns !
> >>
> >> We do not need as many machines as Beam for sure (Beam uses the
> machines to
> >> process a lot of data for tests including some load testing) but we
> need to
> >> estimate the number/types of machines that we are going to need.
> >> Fokko, Ash, others - do you have some recent numbers for the current
> usage
> >> or should I open an Infrastructure ticket for it?
> >>
> >> J
> >>
> >> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <Ja...@polidea.com>
> >> wrote:
> >>
> >>> Thanks Aizhamal! I spoke already to Gris and she confirmed that as well
> >>> and the 8th of July date is ok for us as we will have to evaluate and
> >>> prepare as well. Have a nice trip.
> >>>
> >>> J.
> >>>
> >>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
> >>> <ai...@google.com.invalid> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <Ja...@polidea.com>
> >>>> wrote:
> >>>>
> >>>>> Yeah. I also have a working version of Cloud build configuration and
> we
> >>>> can
> >>>>> run the tests on cloud build if we can get some credits from Google.
> >>>>
> >>>>
> >>>> I can look into getting a small amount of credits approved for this,
> to
> >>>> see
> >>>> if it’s useful to offload some tests to Cloud Build, or to provision
> some
> >>>> VMs to run on Apache Infra.
> >>>>
> >>>> I am traveling at the moment, but I’ll be back in the office on July
> 8,
> >>>> and
> >>>> I’ll try to get this done.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Aizhamal
> >>>>
> >>>> And
> >>>>> the changes from the upcoming CI image will make it much easier to
> run
> >>>>> tests on any CI provider. Except Kubernetes tests they are pretty
> much
> >>>>> CI-agnostic. Kubernetes tests will likely be also fixed soon.
> >>>>>
> >>>>> Another idea: I thought that in the future we can also run only
> subset
> >>>> of
> >>>>> postgres/mysql/sqlite tests on all combinations. I think there are
> just
> >>>>> handful of tests that are specific for backend (and we already know
> >>>> which
> >>>>> ones they are - they are skipped-if).
> >>>>>
> >>>>> J.
> >>>>>
> >>>>> Principal Software Engineer
> >>>>> Phone: +48660796129
> >>>>>
> >>>>> On Thu, 27 Jun 2019 at 15:12, Philippe Gagnon <philgagnon1@gmail.com> wrote:
> >>>>>
> >>>>>> I think the combinations that you are proposing are sensible for
> >>>>> pre-merge
> >>>>>> checks.
> >>>>>>
> >>>>>> I am working on a proposal to offload extra combinations to another
> CI
> >>>>>> provider (Azure DevOps specifically seems like a good candidate),
> >>>> either
> >>>>>> pre or post merge. Ideally I'd like to run more combinations
> pre-merge
> >>>>> but
> >>>>>> there is a trade-off to be conscious of here between development
> >>>> velocity
> >>>>>> and quality assurance, which I think this issue highlights quite
> well.
> >>>>>>
> >>>>>> Please let me know your thoughts
> >>>>>>
> >>>>>> Philippe
> >>>>>>
> >>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
> >>>> Jarek.Potiuk@polidea.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Agree that we should be thoughtful about others as well: In the
> >>>> latest
> >>>>>> push
> >>>>>>> (few minutes ago) of the upcoming official CI image i implemented
> >>>> the
> >>>>>>> change we discussed in the Github where we limit the number of
> >>>>>> combinations
> >>>>>>> we test:
> >>>>>>>
> >>>>>>> You can see it yourself:
> >>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
> >>>>>>>
> >>>>>>> Those are the combinations I propose:
> >>>>>>>
> >>>>>>> Python: 3.6
> >>>>>>> BACKEND=mysql ENV=docker
> >>>>>>>
> >>>>>>> Python: 3.6
> >>>>>>> BACKEND=postgres ENV=docker
> >>>>>>>
> >>>>>>> Python: 3.5
> >>>>>>> BACKEND=sqlite ENV=docker
> >>>>>>>
> >>>>>>> Python: 3.6
> >>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
> >>>>>>>
> >>>>>>> J,
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
> >>>>> <fokko@driesprong.frl
> >>>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> We got this message last year:
> >>>>>>>>
> >>>>>>>>> Hello, Airflow PPMC.
> >>>>>>>>> While going through the usage statistics for our Travis CI
> >>>>> service, I
> >>>>>>>>> have noticed that the Airflow project is using an abnormally
> >>>> large
> >>>>>>>>> amount of resources, 2600 hours per month or the equivalent of
> >>>>> having
> >>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this is not
> >>>>>> free,
> >>>>>>>>> but rather costing us money, I'm contacting you with the
> >>>> intention
> >>>>> of
> >>>>>>>>> figuring out ways to reduce the use of Travis for the project.
> >>>>>>>>
> >>>>>>>>> We would greatly prefer that the project itself comes up with a
> >>>>>>> solution
> >>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn it off
> >>>>> for
> >>>>>>>>> you, but the usage is at a rather severe level, totaling more
> >>>> than
> >>>>>> 21%
> >>>>>>>>> of the total build time of all projects using Travis, so
> >>>> something
> >>>>>>>>> actionable should be decided upon and (preferably) completed by
> >>>> the
> >>>>>> end
> >>>>>>>>> of May that will reduce the consumption of Travis resources.
> >>>>>>>>
> >>>>>>>>> Alternately, if you are unable to lower the pressure on Travis,
> >>>> the
> >>>>>>>>> podling and/or IPMC may ask the board of directors for a
> >>>> separate
> >>>>>>> budget
> >>>>>>>>> for additional build nodes to cope with the added load - I'll
> >>>> leave
> >>>>>>> this
> >>>>>>>>> for the podling and IPMC to decide on.
> >>>>>>>>
> >>>>>>>>> Please let us know when you have decided on a plan to remedy
> >>>> this
> >>>>>>>> situation.
> >>>>>>>>
> >>>>>>>>> With regards,
> >>>>>>>>> Daniel on behalf of ASF Infrastructure.
> >>>>>>>>
> >>>>>>>> I think more and more projects are still migrating to the ASF Travis,
> >>>>>>>> so I think it is natural that there is more load. However, this still
> >>>>>>>> leaves the question of whether we have to run the full matrix.
> >>>>>>>>
> >>>>>>>> Cheers, Fokko
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, 27 Jun 2019 at 10:56, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >>>>>>>>
> >>>>>>>>> I think we should really involve infra to increase the slot
> >>>> number
> >>>>> or
> >>>>>>>> maybe
> >>>>>>>>> even somehow allocate slots per project.
> >>>>>>>>> The problem is that we cannot control what other apache projects
> >>>>> are
> >>>>>>>> doing,
> >>>>>>>>> so even if we decrease our runtime, it's the other projects that
> >>>>>> might
> >>>>>>>> hold
> >>>>>>>>> us in the queue :(
> >>>>>>>>>
> >>>>>>>>> J.
> >>>>>>>>>
> >>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
> >>>>>>> <fokko@driesprong.frl
> >>>>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> I've noticed this at other Apache projects as well; sometimes it
> >>>>>>>>>> takes up to 7-8 hours. The only thing we can do is reduce the
> >>>>>>>>>> runtime of the jobs so we take fewer slots :-)
> >>>>>>>>>>
> >>>>>>>>>> Cheers, Fokko
> >>>>>>>>>>
> >>>>>>>>>> On Wed, 26 Jun 2019 at 21:59, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Yep. That's what I suggested as the reason in the ticket - I
> >>>>>> guess
> >>>>>>>>> INFRA
> >>>>>>>>>>> are the only people who can do anything about it (increase
> >>>>>>>> concurrency
> >>>>>>>>> ?
> >>>>>>>>>>> pay more for Travis :)? ).
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
> >>>>>> ash@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I asked Travis on twitter and they said it was due to the
> >>>>>> Apache
> >>>>>>>>> other
> >>>>>>>>>>>> projects build queues
> >>>>>>>>>>>>
> >>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
> >>>>>>>>>>>>
> >>>>>>>>>>>> -ash
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
> >>>>>>>> Jarek.Potiuk@polidea.com
> >>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hello everyone,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> For the last few days the Travis builds for
> >>>> apache/airflow
> >>>>>>> project
> >>>>>>>>> are
> >>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
> >>>>> situation.
> >>>>>>> I've
> >>>>>>>>>>> opened
> >>>>>>>>>>>>> INFRA ticket for that:
> >>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> J.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Travis builds in a queue for hours

Posted by Bolke de Bruin <bd...@gmail.com>.
If you need an alternative, why not use a couple of gitlab-ci runners? Much easier to maintain, lightweight, and much closer to what we use now.

B.

Sent from my iPad

> On 10 Jul 2019, at 23:27, Bolke de Bruin <bd...@gmail.com> wrote:
> 
> Awesome! But I hope you are not serious about using Jenkins, right? If I need to start a Holy War, it would be against Jenkins.
> 
> B.
> 
> Sent from my iPad
> 
>> On 10 Jul 2019, at 22:55, Jarek Potiuk <Ja...@polidea.com> wrote:
>> 
>> Hello Everyone,
>> 
>> I have some really good news. I just had a call with Google OSS team (Gris,
>> Aizhamal) and they are willing to donate VMs on Google Cloud Platform to
>> run CI for Airflow. In order to simplify the setup (and make sure it is ok
>> according to Apache regulations) we think we should go exactly the same
>> route as Apache Beam project (Google donated 16x 16CPU machines for them).
>> The route of Apache Beam is to use the machines as workers for Apache
>> Jenkins (https://builds.apache.org/). Apache Jenkins is one of the
>> encouraged CI solutions by Apache and if we can have workers connected to
>> the existing Jenkins master of Apache, it means that the maintenance
>> overhead will be pretty minimal. And we can follow Apache Beam setup so I
>> do not expect any legal problems.
>> 
>> I also work very closely with the team that uses Apache Beam Jenkins
>> heavily so I have access to all the necessary experts to help with the
>> setup (and I am happy to help with that).
>> 
>> I really hope everyone in the community will be really happy to go in that
>> direction - it's. Please let me know if you have any concerns !
>> 
>> We do not need as many machines as Beam for sure (Beam uses the machines to
>> process a lot of data for tests including some load testing) but we need to
>> estimate the number/types of machines that we are going to need.
>> Fokko, Ash, others - do you have some recent numbers for the current usage
>> or should I open an Infrastructure ticket for it?
>> 
>> J
>> 
>> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <Ja...@polidea.com>
>> wrote:
>> 
>>> Thanks Aizhamal! I spoke already to Gris and she confirmed that as well
>>> and the 8th of July date is ok for us as we will have to evaluate and
>>> prepare as well. Have a nice trip.
>>> 
>>> J.
>>> 
>>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
>>> <ai...@google.com.invalid> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <Ja...@polidea.com>
>>>> wrote:
>>>> 
>>>>> Yeah. I also have a working version of Cloud build configuration and we
>>>> can
>>>>> run the tests on cloud build if we can get some credits from Google.
>>>> 
>>>> 
>>>> I can look into getting a small amount of credits approved for this, to
>>>> see
>>>> if it’s useful to offload some tests to Cloud Build, or to provision some
>>>> VMs to run on Apache Infra.
>>>> 
>>>> I am traveling at the moment, but I’ll be back in the office on July 8,
>>>> and
>>>> I’ll try to get this done.
>>>> 
>>>> 
>>>> Thanks,
>>>> Aizhamal
>>>> 
>>>> And
>>>>> the changes from the upcoming CI image will make it much easier to run
>>>>> tests on any CI provider. Except Kubernetes tests they are pretty much
>>>>> CI-agnostic. Kubernetes tests will likely be also fixed soon.
>>>>> 
>>>>> Another idea: I thought that in the future we can also run only subset
>>>> of
>>>>> postgres/mysql/sqlite tests on all combinations. I think there are just
>>>>> handful of tests that are specific for backend (and we already know
>>>> which
>>>>> ones they are - they are skipped-if).
>>>>> 
>>>>> J.
>>>>> 
>>>>> Principal Software Engineer
>>>>> Phone: +48660796129
>>>>> 
>>>>> On Thu, 27 Jun 2019 at 15:12, Philippe Gagnon <philgagnon1@gmail.com> wrote:
>>>>> 
>>>>>> I think the combinations that you are proposing are sensible for
>>>>> pre-merge
>>>>>> checks.
>>>>>> 
>>>>>> I am working on a proposal to offload extra combinations to another CI
>>>>>> provider (Azure DevOps specifically seems like a good candidate),
>>>> either
>>>>>> pre or post merge. Ideally I'd like to run more combinations pre-merge
>>>>> but
>>>>>> there is a trade-off to be conscious of here between development
>>>> velocity
>>>>>> and quality assurance, which I think this issue highlights quite well.
>>>>>> 
>>>>>> Please let me know your thoughts
>>>>>> 
>>>>>> Philippe
>>>>>> 
>>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
>>>> Jarek.Potiuk@polidea.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Agree that we should be thoughtful about others as well: In the
>>>> latest
>>>>>> push
>>>>>>> (few minutes ago) of the upcoming official CI image i implemented
>>>> the
>>>>>>> change we discussed in the Github where we limit the number of
>>>>>> combinations
>>>>>>> we test:
>>>>>>> 
>>>>>>> You can see it yourself:
>>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
>>>>>>> 
>>>>>>> Those are the combinations I propose:
>>>>>>> 
>>>>>>> Python: 3.6
>>>>>>> BACKEND=mysql ENV=docker
>>>>>>> 
>>>>>>> Python: 3.6
>>>>>>> BACKEND=postgres ENV=docker
>>>>>>> 
>>>>>>> Python: 3.5
>>>>>>> BACKEND=sqlite ENV=docker
>>>>>>> 
>>>>>>> Python: 3.6
>>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
>>>>>>> 
>>>>>>> J,
>>>>>>> 
>>>>>>> 
>>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
>>>>> <fokko@driesprong.frl
>>>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> We got this message last year:
>>>>>>>> 
>>>>>>>>> Hello, Airflow PPMC.
>>>>>>>>> While going through the usage statistics for our Travis CI
>>>>> service, I
>>>>>>>>> have noticed that the Airflow project is using an abnormally
>>>> large
>>>>>>>>> amount of resources, 2600 hours per month or the equivalent of
>>>>> having
>>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this is not
>>>>>> free,
>>>>>>>>> but rather costing us money, I'm contacting you with the
>>>> intention
>>>>> of
>>>>>>>>> figuring out ways to reduce the use of Travis for the project.
>>>>>>>> 
>>>>>>>>> We would greatly prefer that the project itself comes up with a
>>>>>>> solution
>>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn it off
>>>>> for
>>>>>>>>> you, but the usage is at a rather severe level, totaling more
>>>> than
>>>>>> 21%
>>>>>>>>> of the total build time of all projects using Travis, so
>>>> something
>>>>>>>>> actionable should be decided upon and (preferably) completed by
>>>> the
>>>>>> end
>>>>>>>>> of May that will reduce the consumption of Travis resources.
>>>>>>>> 
>>>>>>>>> Alternately, if you are unable to lower the pressure on Travis,
>>>> the
>>>>>>>>> podling and/or IPMC may ask the board of directors for a
>>>> separate
>>>>>>> budget
>>>>>>>>> for additional build nodes to cope with the added load - I'll
>>>> leave
>>>>>>> this
>>>>>>>>> for the podling and IPMC to decide on.
>>>>>>>> 
>>>>>>>>> Please let us know when you have decided on a plan to remedy
>>>> this
>>>>>>>> situation.
>>>>>>>> 
>>>>>>>>> With regards,
>>>>>>>>> Daniel on behalf of ASF Infrastructure.
>>>>>>>> 
>>>>>>>> I think more and more projects are still migrating to the ASF Travis,
>>>>>>>> so I think it is natural that there is more load. However, this still
>>>>>>>> leaves the question of whether we have to run the full matrix.
>>>>>>>> 
>>>>>>>> Cheers, Fokko
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, 27 Jun 2019 at 10:56, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>>>>>>>> 
>>>>>>>>> I think we should really involve infra to increase the slot
>>>> number
>>>>> or
>>>>>>>> maybe
>>>>>>>>> even somehow allocate slots per project.
>>>>>>>>> The problem is that we cannot control what other apache projects
>>>>> are
>>>>>>>> doing,
>>>>>>>>> so even if we decrease our runtime, it's the other projects that
>>>>>> might
>>>>>>>> hold
>>>>>>>>> us in the queue :(
>>>>>>>>> 
>>>>>>>>> J.
>>>>>>>>> 
>>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko
>>>>>>> <fokko@driesprong.frl
>>>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> I've noticed this at other Apache projects as well; sometimes it
>>>>>>>>>> takes up to 7-8 hours. The only thing we can do is reduce the
>>>>>>>>>> runtime of the jobs so we take fewer slots :-)
>>>>>>>>>> 
>>>>>>>>>> Cheers, Fokko
>>>>>>>>>> 
>>>>>>>>>> On Wed, 26 Jun 2019 at 21:59, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Yep. That's what I suggested as the reason in the ticket - I
>>>>>> guess
>>>>>>>>> INFRA
>>>>>>>>>>> are the only people who can do anything about it (increase
>>>>>>>> concurrency
>>>>>>>>> ?
>>>>>>>>>>> pay more for Travis :)? ).
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <
>>>>>> ash@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> I asked Travis on twitter and they said it was due to the
>>>>>> Apache
>>>>>>>>> other
>>>>>>>>>>>> projects build queues
>>>>>>>>>>>> 
>>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
>>>>>>>>>>>> 
>>>>>>>>>>>> -ash
>>>>>>>>>>>> 
>>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <
>>>>>>>> Jarek.Potiuk@polidea.com
>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> For the last few days the Travis builds for
>>>> apache/airflow
>>>>>>> project
>>>>>>>>> are
>>>>>>>>>>>>> waiting in a queue for hours. This is not a normal
>>>>> situation.
>>>>>>> I've
>>>>>>>>>>> opened
>>>>>>>>>>>>> INFRA ticket for that:
>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
>>>>>>>>>>>>> 
>>>>>>>>>>>>> J.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 

Re: Travis builds in a queue for hours

Posted by Bolke de Bruin <bd...@gmail.com>.
Awesome! But I hope you are not serious about using Jenkins, right? If I need to start a Holy War, it would be against Jenkins.

B.

Sent from my iPad

> On 10 Jul 2019, at 22:55, Jarek Potiuk <Ja...@polidea.com> wrote:
> 
> Hello Everyone,
> 
> I have some really good news. I just had a call with Google OSS team (Gris,
> Aizhamal) and they are willing to donate VMs on Google Cloud Platform to
> run CI for Airflow. In order to simplify the setup (and make sure it is ok
> according to Apache regulations) we think we should go exactly the same
> route as Apache Beam project (Google donated 16x 16CPU machines for them).
> The route of Apache Beam is to use the machines as workers for Apache
> Jenkins (https://builds.apache.org/). Apache Jenkins is one of the
> encouraged CI solutions by Apache and if we can have workers connected to
> the existing Jenkins master of Apache, it means that the maintenance
> overhead will be pretty minimal. And we can follow Apache Beam setup so I
> do not expect any legal problems.
> 
> I also work very closely with the team that uses Apache Beam Jenkins
> heavily so I have access to all the necessary experts to help with the
> setup (and I am happy to help with that).
> 
> I really hope everyone in the community will be really happy to go in that
> direction - it's. Please let me know if you have any concerns !
> 
> We do not need as many machines as Beam for sure (Beam uses the machines to
> process a lot of data for tests including some load testing) but we need to
> estimate the number/types of machines that we are going to need.
> Fokko, Ash, others - do you have some recent numbers for the current usage
> or should I open an Infrastructure ticket for it?
> 
> J
> 
> On Fri, Jun 28, 2019 at 4:50 PM Jarek Potiuk <Ja...@polidea.com>
> wrote:
> 
>> Thanks Aizhamal! I spoke already to Gris and she confirmed that as well
>> and the 8th of July date is ok for us as we will have to evaluate and
>> prepare as well. Have a nice trip.
>> 
>> J.
>> 
>> On Fri, Jun 28, 2019 at 4:25 PM Aizhamal Nurmamat kyzy
>> <ai...@google.com.invalid> wrote:
>> 
>>> Hi all,
>>> 
>>> On Thu, Jun 27, 2019 at 15:28 Jarek Potiuk <Ja...@polidea.com>
>>> wrote:
>>> 
>>>> Yeah. I also have a working version of Cloud build configuration and we
>>> can
>>>> run the tests on cloud build if we can get some credits from Google.
>>> 
>>> 
>>> I can look into getting a small amount of credits approved for this, to
>>> see
>>> if it’s useful to offload some tests to Cloud Build, or to provision some
>>> VMs to run on Apache Infra.
>>> 
>>> I am traveling at the moment, but I’ll be back in the office on July 8,
>>> and
>>> I’ll try to get this done.
>>> 
>>> 
>>> Thanks,
>>> Aizhamal
>>> 
>>> And
>>>> the changes from the upcoming CI image will make it much easier to run
>>>> tests on any CI provider. Except Kubernetes tests they are pretty much
>>>> CI-agnostic. Kubernetes tests will likely be also fixed soon.
>>>> 
>>>> Another idea: I thought that in the future we can also run only subset
>>> of
>>>> postgres/mysql/sqlite tests on all combinations. I think there are just
>>>> handful of tests that are specific for backend (and we already know
>>> which
>>>> ones they are - they are skipped-if).
>>>> 
>>>> J.
>>>> 
>>>> Principal Software Engineer
>>>> Phone: +48660796129
>>>> 
>>>> On Thu, 27 Jun 2019 at 15:12, Philippe Gagnon <philgagnon1@gmail.com> wrote:
>>>> 
>>>>> I think the combinations that you are proposing are sensible for
>>>> pre-merge
>>>>> checks.
>>>>> 
>>>>> I am working on a proposal to offload extra combinations to another CI
>>>>> provider (Azure DevOps specifically seems like a good candidate),
>>> either
>>>>> pre or post merge. Ideally I'd like to run more combinations pre-merge
>>>> but
>>>>> there is a trade-off to be conscious of here between development
>>> velocity
>>>>> and quality assurance, which I think this issue highlights quite well.
>>>>> 
>>>>> Please let me know your thoughts
>>>>> 
>>>>> Philippe
>>>>> 
>>>>> On Thu, Jun 27, 2019 at 9:05 AM Jarek Potiuk <
>>> Jarek.Potiuk@polidea.com>
>>>>> wrote:
>>>>> 
>>>>>> Agree that we should be thoughtful about others as well: In the
>>> latest
>>>>> push
>>>>>> (few minutes ago) of the upcoming official CI image i implemented
>>> the
>>>>>> change we discussed in the Github where we limit the number of
>>>>> combinations
>>>>>> we test:
>>>>>> 
>>>>>> You can see it yourself:
>>>>>> https://travis-ci.org/apache/airflow/builds/551305240
>>>>>> 
>>>>>> Those are the combinations I propose:
>>>>>> 
>>>>>> Python: 3.6
>>>>>> BACKEND=mysql ENV=docker
>>>>>> 
>>>>>> Python: 3.6
>>>>>> BACKEND=postgres ENV=docker
>>>>>> 
>>>>>> Python: 3.5
>>>>>> BACKEND=sqlite ENV=docker
>>>>>> 
>>>>>> Python: 3.6
>>>>>> BACKEND=postgres ENV=kubernetes KUBERNETES_VERSION=v1.13.0
>>>>>> 
>>>>>> J,
>>>>>> 
>>>>>> 
>>>>>> On Thu, Jun 27, 2019 at 11:00 AM Driesprong, Fokko
>>>> <fokko@driesprong.frl
>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> We got this message last year:
>>>>>>> 
>>>>>>>> Hello, Airflow PPMC.
>>>>>>>> While going through the usage statistics for our Travis CI
>>>> service, I
>>>>>>>> have noticed that the Airflow project is using an abnormally
>>> large
>>>>>>>> amount of resources, 2600 hours per month or the equivalent of
>>>> having
>>>>>>>> almost 4 machines building airflow non-stop 24/7. As this is not
>>>>> free,
>>>>>>>> but rather costing us money, I'm contacting you with the
>>> intention
>>>> of
>>>>>>>> figuring out ways to reduce the use of Travis for the project.
>>>>>>> 
>>>>>>>> We would greatly prefer that the project itself comes up with a
>>>>>> solution
>>>>>>>> to lower the usage of Travis, as we'd hate to simply turn it off
>>>> for
>>>>>>>> you, but the usage is at a rather severe level, totaling more
>>> than
>>>>> 21%
>>>>>>>> of the total build time of all projects using Travis, so
>>> something
>>>>>>>> actionable should be decided upon and (preferably) completed by
>>> the
>>>>> end
>>>>>>>> of May that will reduce the consumption of Travis resources.
>>>>>>> 
>>>>>>>> Alternately, if you are unable to lower the pressure on Travis,
>>> the
>>>>>>>> podling and/or IPMC may ask the board of directors for a
>>> separate
>>>>>> budget
>>>>>>>> for additional build nodes to cope with the added load - I'll
>>> leave
>>>>>> this
>>>>>>>> for the podling and IPMC to decide on.
>>>>>>> 
>>>>>>>> Please let us know when you have decided on a plan to remedy
>>> this
>>>>>>> situation.
>>>>>>> 
>>>>>>>> With regards,
>>>>>>>> Daniel on behalf of ASF Infrastructure.
>>>>>>> 
>>>>>>> I think more and more projects are still migrating to the ASF Travis,
>>>>>>> so I think it is natural that there is more load. However, this still
>>>>>>> leaves the question of whether we have to run the full matrix.
>>>>>>> 
>>>>>>> Cheers, Fokko
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Thu, Jun 27, 2019 at 10:56, Jarek Potiuk <Jarek.Potiuk@polidea.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> I think we should really involve infra to increase the slot number or
>>>>>>>> maybe even somehow allocate slots per project.
>>>>>>>> The problem is that we cannot control what other Apache projects are
>>>>>>>> doing, so even if we decrease our runtime, it's the other projects that
>>>>>>>> might hold us in the queue :(
>>>>>>>> 
>>>>>>>> J.
>>>>>>>> 
>>>>>>>> On Thu, Jun 27, 2019 at 10:19 AM Driesprong, Fokko <fokko@driesprong.frl>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> I've noticed this at other Apache projects as well, sometimes it takes
>>>>>>>>> up to 7-8 hours. The only thing we can do is reduce the runtime of the
>>>>>>>>> jobs so we take fewer slots :-)
>>>>>>>>> 
>>>>>>>>> Cheers, Fokko
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 26, 2019 at 21:59, Jarek Potiuk <Jarek.Potiuk@polidea.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Yep. That's what I suggested as the reason in the ticket - I guess
>>>>>>>>>> INFRA are the only people who can do anything about it (increase
>>>>>>>>>> concurrency? pay more for Travis :)?).
>>>>>>>>>> 
>>>>>>>>>> On Wed, Jun 26, 2019 at 9:51 PM Ash Berlin-Taylor <ash@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> I asked Travis on Twitter and they said it was due to the other
>>>>>>>>>>> Apache projects' build queues
>>>>>>>>>>> 
>>>>>>>>>>> https://twitter.com/travisci/status/1143893051460526080
>>>>>>>>>>> 
>>>>>>>>>>> -ash
>>>>>>>>>>> 
>>>>>>>>>>> On 26 June 2019 20:48:33 BST, Jarek Potiuk <Jarek.Potiuk@polidea.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>> 
>>>>>>>>>>>> For the last few days the Travis builds for the apache/airflow
>>>>>>>>>>>> project have been waiting in a queue for hours. This is not a
>>>>>>>>>>>> normal situation. I've opened an INFRA ticket for that:
>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-18657
>>>>>>>>>>>> 
>>>>>>>>>>>> J.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> 
>>>>>>>>>> Jarek Potiuk
>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>>>> 
>>>>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> 
> 