Posted to dev@arrow.apache.org by Krisztián Szűcs <sz...@gmail.com> on 2019/11/08 18:07:25 UTC

Re: [CI] Docker-compose refactor and GitHub Actions

I've trimmed down the number of builds triggered on pull requests by
converting some of them to run only on master or as cron builds. I also
added the action filters, including the changed-path patterns. I've also
collected ~30 follow-up JIRAs aggregating the problems I came across
during the refactor, along with possible further improvements and
optimizations like caching.
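
To illustrate the changed-path filtering idea, here is a minimal Python
sketch. It is not code from the PR: the group names, the patterns in
PATH_FILTERS and the function builds_to_trigger are all hypothetical, and
the real filters are expressed in the workflow files rather than in Python.

```python
from fnmatch import fnmatch

# Hypothetical mapping from build group to path patterns; the actual
# patterns used in the PR may differ.
PATH_FILTERS = {
    "cpp": ["cpp/*", "ci/*"],
    "python": ["python/*", "cpp/*", "ci/*"],
    "rust": ["rust/*", "ci/*"],
}


def builds_to_trigger(changed_files, filters=PATH_FILTERS):
    """Return the sorted build groups whose patterns match any changed file."""
    return sorted(
        group
        for group, patterns in filters.items()
        # fnmatch's "*" also matches "/", so "cpp/*" covers nested paths.
        if any(fnmatch(path, pat) for path in changed_files for pat in patterns)
    )


print(builds_to_trigger(["rust/arrow/src/lib.rs"]))
print(builds_to_trigger(["cpp/src/arrow/array.cc"]))
```

In this sketch a C++ change also selects the Python group because the
hypothetical "python" entry lists "cpp/*" among its patterns, which is one
way dependent builds could be pulled in.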

The PR is ready to be merged [1]. I expect multiple issues after merging
it, even though I tried to port everything thoroughly.
The changeset is simply too big, but we need to start somewhere, and
sooner is better.

[1]: https://github.com/apache/arrow/pull/5589

On Thu, Oct 31, 2019 at 10:21 PM Wes McKinney <we...@gmail.com> wrote:
>
> hi Krisz -- I just left comments on the PR. This definitely looks like
> a good step forward. My main comment is that I think there are too
> many C++/Python tasks to run on a per-commit basis. Ideally many of
> these would be run nightly. There is also a certain amount of
> redundancy in rebuilding the C++ library multiple times before running
> each dependent set of tests, whereas in Travis we build the C++
> library once then test both C++ and Python. If there is a sufficient
> number of builders, then perhaps it doesn't matter so much.
>
> It seems there are a few things, like action filtering (similar to
> "detect-changes.py") based on what was changed that would need to get
> done before this can be merged.
>
> - Wes
>
> On Fri, Oct 25, 2019 at 7:25 PM Krisztián Szűcs
> <sz...@gmail.com> wrote:
> >
> > Hi,
> >
> > During the release of 0.15.1-RC0 I literally had to wait days
> > to ensure that the Travis, Appveyor and Crossbow builds
> > were all passing for the release branch. Additionally each
> > newly added patch was delaying the process by 8 hrs or so
> > (actually felt like 16).
> >
> > Recently I've been working to incorporate the advantages
> > of the Buildbot setup into our current docker-compose
> > configuration, including support for multiple architectures
> > and platforms, reusing docker images and caching dependency
> > installation steps. It tries to follow the semantics of ursabot,
> > but using only docker-compose and tiny shell scripts.
> >
> > This refactoring also includes GitHub Actions workflows for
> > Windows and macOS, reusing the same (bash) build
> > scripts. The docker configuration and the scripts are CI agnostic.
> > Last but not least, I've managed to clean up a lot of things,
> > including every Travis build and three Appveyor builds.
> > As an example the ci [3] and dev [4] folders got much cleaner.
> >
> > The majority of the builds are passing [2], but due to the size
> > of the pull request [1] reviews for relevant workflows like the
> > JavaScript, C#, Rust, JNI, etc. would be much appreciated.
> > I'll be on vacation until Wednesday, but will try to respond on
> > both GH and the ML.
> >
> > Thanks, Krisztian
> >
> > [1]: https://github.com/apache/arrow/pull/5589
> > [2]: https://github.com/apache/arrow/runs/275685241
> > [3]: https://github.com/apache/arrow/tree/9c7e7289b9c9486c13a02e7cb5682a0f9f274ec6/ci
> > [4]: https://github.com/apache/arrow/tree/9c7e7289b9c9486c13a02e7cb5682a0f9f274ec6/dev

Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Krisztián Szűcs <sz...@gmail.com>.
Note that 20 builds are running on master only, including Windows, macOS,
and documentation builds.

On Mon, Nov 11, 2019, 12:33 PM Antoine Pitrou <an...@python.org> wrote:

>
> Ah, no, sorry, I read wrongly.  It's 20 builds total.  That's still a
> lot, IMO.
>
> Regards
>
> Antoine.
>
>
> On 11/11/2019 at 12:32, Antoine Pitrou wrote:
> >
> > A C++ change will end up triggering 38 builds (C++, Python, Ruby, R)...
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On 11/11/2019 at 12:25, Wes McKinney wrote:
> >> That's too many for C++ and Python changes in my opinion. Let's focus
> >> on getting the PR merged and then we can address this problem
> >>
> >> On Mon, Nov 11, 2019 at 3:21 AM Krisztián Szűcs
> >> <sz...@gmail.com> wrote:
> >>>
> >>> It depends on the change:
> >>>
> >>> - [C++]: 20
> >>>   https://github.com/kszucs/arrow/commit/5654da07c21a7cf6a0e08894f4bf4e7aaa70ed26/checks?check_suite_id=304713473
> >>> - [Python]: 8
> >>>   https://github.com/kszucs/arrow/commit/5084c23f4d8b0fe05f1056808e7066aadbf0033e/checks?check_suite_id=304716138
> >>> - [Ruby]: 5
> >>>   https://github.com/kszucs/arrow/commit/e90e92514f69b0734e8126a949858476bfb70c1a/checks?check_suite_id=304717369
> >>> - [JS]: 6
> >>>   https://github.com/kszucs/arrow/commit/f27d6500f628c152c732f2372beae15a34f088e9/checks?check_suite_id=304718837
> >>> - [R]: 5
> >>>   https://github.com/kszucs/arrow/commit/eaec66cbde923687beebe31114dea6d0410043c6/checks?check_suite_id=304719993
> >>> - [Rust]: 6
> >>>   https://github.com/kszucs/arrow/commit/0c52ad15a54e78fdd1fe797668ebf0cdfbf6f4ab/checks?check_suite_id=304720884
> >>> - [Go]: 7
> >>>   https://github.com/kszucs/arrow/commit/30ab7f5e8cf6373774ce6ea75b0473da5ec808a3/checks?check_suite_id=304722182
> >>>
> >>>
> >>> On Sat, Nov 9, 2019 at 2:19 AM Wes McKinney <we...@gmail.com> wrote:
> >>>>
> >>>> Just to be sure, if this PR is merged, how many GHA tasks will be run
> >>>> on each commit to master?

Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Antoine Pitrou <an...@python.org>.
Ah, no, sorry, I read wrongly.  It's 20 builds total.  That's still a
lot, IMO.

Regards

Antoine.



Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Antoine Pitrou <an...@python.org>.
A C++ change will end up triggering 38 builds (C++, Python, Ruby, R)...
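
The 38 presumably comes from adding up the per-component counts that
Krisztián listed in this thread for C++, Python, Ruby and R. A quick
Python check of that sum, using only those figures (the assumption that
exactly these four components are counted is taken from the parenthesis
above):

```python
# Per-component build counts as listed by Krisztián in this thread.
builds = {"C++": 20, "Python": 8, "Ruby": 5, "JS": 6, "R": 5, "Rust": 6, "Go": 7}

# Components assumed to be affected by a C++ change (the four named above).
affected = ["C++", "Python", "Ruby", "R"]

total = sum(builds[name] for name in affected)
print(total)  # 38
```

Whether the counts are actually additive is a separate question; the sum
would only explain how the figure was reached.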

Regards

Antoine.



Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Krisztián Szűcs <sz...@gmail.com>.
We can reduce the number of builds further by running them as cron tasks.
In the beginning I'd rather keep them, to catch regressions earlier. The PR
has an approval, so I'll merge it tomorrow morning (CET) if there are no
objections.

Thanks, Krisztian


Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Wes McKinney <we...@gmail.com>.
That's too many for C++ and Python changes in my opinion. Let's focus
on getting the PR merged and then we can address this problem


Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Krisztián Szűcs <sz...@gmail.com>.
It depends on the change:

- [C++]: 20
  https://github.com/kszucs/arrow/commit/5654da07c21a7cf6a0e08894f4bf4e7aaa70ed26/checks?check_suite_id=304713473
- [Python]: 8
  https://github.com/kszucs/arrow/commit/5084c23f4d8b0fe05f1056808e7066aadbf0033e/checks?check_suite_id=304716138
- [Ruby]: 5
  https://github.com/kszucs/arrow/commit/e90e92514f69b0734e8126a949858476bfb70c1a/checks?check_suite_id=304717369
- [JS]: 6
  https://github.com/kszucs/arrow/commit/f27d6500f628c152c732f2372beae15a34f088e9/checks?check_suite_id=304718837
- [R]: 5
  https://github.com/kszucs/arrow/commit/eaec66cbde923687beebe31114dea6d0410043c6/checks?check_suite_id=304719993
- [Rust]: 6
  https://github.com/kszucs/arrow/commit/0c52ad15a54e78fdd1fe797668ebf0cdfbf6f4ab/checks?check_suite_id=304720884
- [Go]: 7
  https://github.com/kszucs/arrow/commit/30ab7f5e8cf6373774ce6ea75b0473da5ec808a3/checks?check_suite_id=304722182


On Sat, Nov 9, 2019 at 2:19 AM Wes McKinney <we...@gmail.com> wrote:
>
> Just to be sure, if this PR is merged, how many GHA tasks will be run
> on each commit to master?
>
> On Fri, Nov 8, 2019 at 12:07 PM Krisztián Szűcs
> <sz...@gmail.com> wrote:
> >
> > I've trimmed down the number of triggered builds on pull requests by
> > converting them to run on master only or cron builds. Alsoe added
> > the action filters including the changed path patterns. I've also
> > collected ~30 follow up JIRAs aggregating the problems I came across
> > during the refactor and possible further improvements and optimizations
> > like caching.
> >
> > The PR is ready to be merged [1]. I expect multiple issues after merging
> > the PR, even though I tried to port everything thoroughly.
> > The changeset is simply too big, but we need to start somewhere, and
> > sooner is better.
> >
> > [1]: https://github.com/apache/arrow/pull/5589
> >
> > On Thu, Oct 31, 2019 at 10:21 PM Wes McKinney <we...@gmail.com> wrote:
> > >
> > > hi Krisz -- I just left comments on the PR. This definitely looks like
> > > a good step forward. My main comment is that I think there are too
> > > many C++/Python tasks to run on a per-commit basis. Ideally many of
> > > these would be run nightly. There is also a certain amount of
> > > redundancy in rebuilding the C++ library multiple times before running
> > > each dependent set of tests, whereas in Travis we build the C++
> > > library once then test both C++ and Python. If there is a sufficient
> > > number of builders then perhaps it doesn't matter so much.
> > >
> > > It seems there are a few things, like action filtering (similar to
> > > "detect-changes.py") based on what was changed that would need to get
> > > done before this can be merged.
> > >
> > > - Wes
> > >
> > > On Fri, Oct 25, 2019 at 7:25 PM Krisztián Szűcs
> > > <sz...@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > During the release of 0.15.1-RC0 I literally had to wait days
> > > > to ensure that the Travis, Appveyor and Crossbow builds
> > > > were all passing for the release branch. Additionally each
> > > > newly added patch was delaying the process by 8 hrs or so
> > > > (actually felt like 16).
> > > >
> > > > Recently I've been working on incorporating the advantages
> > > > of the Buildbot setup into our current docker-compose
> > > > configuration, including support for multiple architectures
> > > > and platforms, reusing docker images and caching dependency
> > > > installation steps. It tries to follow the semantics of ursabot,
> > > > but using only docker-compose and tiny shell scripts.
> > > >
> > > > This refactoring also includes GitHub Actions workflows for
> > > > Windows and macOS as well, reusing the same (bash) build
> > > > scripts. The docker configuration and the scripts are CI agnostic.
> > > > Last but not least, I've managed to clean up a lot of things
> > > > including every Travis build and three Appveyor builds.
> > > > As an example the ci [3] and dev [4] folders got much cleaner.
> > > >
> > > > The majority of the builds are passing [2], but due to the size
> > > > of the pull request [1] reviews for relevant workflows like the
> > > > JavaScript, C#, Rust, JNI, etc. would be much appreciated.
> > > > I'll be on vacation until Wednesday, but will try to respond on
> > > > both GH and the ML.
> > > >
> > > > Thanks, Krisztian
> > > >
> > > > [1]: https://github.com/apache/arrow/pull/5589
> > > > [2]: https://github.com/apache/arrow/runs/275685241
> > > > [3]: https://github.com/apache/arrow/tree/9c7e7289b9c9486c13a02e7cb5682a0f9f274ec6/ci
> > > > [4]: https://github.com/apache/arrow/tree/9c7e7289b9c9486c13a02e7cb5682a0f9f274ec6/dev

Re: [CI] Docker-compose refactor and GitHub Actions

Posted by Wes McKinney <we...@gmail.com>.
Just to be sure, if this PR is merged, how many GHA tasks will be run
on each commit to master?

On Fri, Nov 8, 2019 at 12:07 PM Krisztián Szűcs
<sz...@gmail.com> wrote:
>
> I've trimmed down the number of triggered builds on pull requests by
> converting them to run on master only or as cron builds. I also added
> the action filters including the changed path patterns. I've also
> collected ~30 follow up JIRAs aggregating the problems I came across
> during the refactor and possible further improvements and optimizations
> like caching.
>
> The PR is ready to be merged [1]. I expect multiple issues after merging
> the PR, even though I tried to port everything thoroughly.
> The changeset is simply too big, but we need to start somewhere, and
> sooner is better.
>
> [1]: https://github.com/apache/arrow/pull/5589
>
> On Thu, Oct 31, 2019 at 10:21 PM Wes McKinney <we...@gmail.com> wrote:
> >
> > hi Krisz -- I just left comments on the PR. This definitely looks like
> > a good step forward. My main comment is that I think there are too
> > many C++/Python tasks to run on a per-commit basis. Ideally many of
> > these would be run nightly. There is also a certain amount of
> > redundancy in rebuilding the C++ library multiple times before running
> > each dependent set of tests, whereas in Travis we build the C++
> > library once then test both C++ and Python. If there is a sufficient
> > number of builders then perhaps it doesn't matter so much.
> >
> > It seems there are a few things, like action filtering (similar to
> > "detect-changes.py") based on what was changed that would need to get
> > done before this can be merged.
> >
> > - Wes
> >
> > On Fri, Oct 25, 2019 at 7:25 PM Krisztián Szűcs
> > <sz...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > During the release of 0.15.1-RC0 I literally had to wait days
> > > to ensure that the Travis, Appveyor and Crossbow builds
> > > were all passing for the release branch. Additionally each
> > > newly added patch was delaying the process by 8 hrs or so
> > > (actually felt like 16).
> > >
> > > Recently I've been working on incorporating the advantages
> > > of the Buildbot setup into our current docker-compose
> > > configuration, including support for multiple architectures
> > > and platforms, reusing docker images and caching dependency
> > > installation steps. It tries to follow the semantics of ursabot,
> > > but using only docker-compose and tiny shell scripts.
> > >
> > > This refactoring also includes GitHub Actions workflows for
> > > Windows and macOS as well, reusing the same (bash) build
> > > scripts. The docker configuration and the scripts are CI agnostic.
> > > Last but not least, I've managed to clean up a lot of things
> > > including every Travis build and three Appveyor builds.
> > > As an example the ci [3] and dev [4] folders got much cleaner.
> > >
> > > The majority of the builds are passing [2], but due to the size
> > > of the pull request [1] reviews for relevant workflows like the
> > > JavaScript, C#, Rust, JNI, etc. would be much appreciated.
> > > I'll be on vacation until Wednesday, but will try to respond on
> > > both GH and the ML.
> > >
> > > Thanks, Krisztian
> > >
> > > [1]: https://github.com/apache/arrow/pull/5589
> > > [2]: https://github.com/apache/arrow/runs/275685241
> > > [3]: https://github.com/apache/arrow/tree/9c7e7289b9c9486c13a02e7cb5682a0f9f274ec6/ci
> > > [4]: https://github.com/apache/arrow/tree/9c7e7289b9c9486c13a02e7cb5682a0f9f274ec6/dev