Posted to dev@arrow.apache.org by Neal Richardson <ne...@gmail.com> on 2019/10/01 18:37:49 UTC

[DISCUSS] Understanding Arrow's CI problems and needs

Hi all,
Over the last few months, I've seen a lot of frustration and
discussion around the shortcomings of our current CI. I'm also seeing
debate over a few possible solutions; unfortunately, the debates tend
not to resolve in a clear, decisive way, and we end up having the same
debates repeatedly.

In my experience, this pattern often happens when there's not a shared
understanding of the problems we're trying to solve--it's hard to
agree on a solution if we don't agree on the problem. To help us reach
consensus on the problems, I've started a document:
https://docs.google.com/document/d/1fToW48TO-B9T8VRi0_Z30fDJkjOrBisc-Fr8Epl50s4/edit#

Please have a look and add/edit freely. I've tried to capture the
arguments I've seen go by the mailing list, as well as some from my
own experience, but if I've mischaracterized anything, please rectify.

I know several people have been exploring some potential solutions,
and I hope this document can help us begin to discuss their relative
merits more objectively and practically.

Neal

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Wes McKinney <we...@gmail.com>.
The problems we're facing with our continuous integration are
difficult to capture in a single JIRA ticket given the scope and
complexity of the work involved. We could create some "umbrella"
tickets with topics like "Eliminate Travis CI-specific logic from
testing scripts" and attach child JIRA issues to them.

On Sun, Oct 13, 2019 at 6:53 AM Renjie Liu <li...@gmail.com> wrote:
>
> Do we have a ticket to track this?

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Renjie Liu <li...@gmail.com>.
Do we have a ticket to track this?

________________________________
From: Andy Grove <an...@gmail.com>
Sent: Saturday, October 12, 2019 11:46:18 PM
To: dev <de...@arrow.apache.org>
Subject: Re: [DISCUSS] Understanding Arrow's CI problems and needs

I've started a new section to discuss proposals and current initiatives. I
know some of us have been working on some things but without much
coordination so far. It would be good to track these efforts so everyone
can comment on them.

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Andy Grove <an...@gmail.com>.
I've started a new section to discuss proposals and current initiatives. I
know some of us have been working on some things but without much
coordination so far. It would be good to track these efforts so everyone
can comment on them.

On Fri, Oct 11, 2019 at 11:11 AM Wes McKinney <we...@gmail.com> wrote:

> It seems some time has passed here. Would some others like to read the
> document and comment? This is important stuff.

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Wes McKinney <we...@gmail.com>.
It seems some time has passed here. Would some others like to read the
document and comment? This is important stuff.

On Wed, Oct 2, 2019 at 2:20 PM Krisztián Szűcs
<sz...@gmail.com> wrote:
>
> The current document summarizes the current situation well, but in
> order to properly compare and eventually select a solution we need a
> detailed list of explicit features with some sort of classification, like
> should/must have. For example, our future CI system must support
> "PRs from forks". After filling in this table for the alternatives, we can
> have a much clearer picture.

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Krisztián Szűcs <sz...@gmail.com>.
The current document summarizes the current situation well, but in
order to properly compare and eventually select a solution we need a
detailed list of explicit features with some sort of classification, like
should/must have. For example, our future CI system must support
"PRs from forks". After filling in this table for the alternatives, we can
have a much clearer picture.

On Wed, Oct 2, 2019 at 4:06 PM Wes McKinney <we...@gmail.com> wrote:

> I reviewed the document, thanks for putting it together! I think it
> captures most of the requirements and the challenges that we are
> currently facing. I think that anyone who is actively contributing to
> the project or merging pull requests should read this document since
> this affects all of us.

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Wes McKinney <we...@gmail.com>.
I reviewed the document, thanks for putting it together! I think it
captures most of the requirements and the challenges that we are
currently facing. I think that anyone who is actively contributing to
the project or merging pull requests should read this document since
this affects all of us.

On Tue, Oct 1, 2019 at 1:55 PM Wes McKinney <we...@gmail.com> wrote:
>
> Thanks Neal for starting this discussion. I will review and comment.
>
> I will say that as a maintainer the current situation is very nearly
> intolerable. As by far and away the most prolific merger-of-PRs [1],
> I've been negatively affected by the long queueing times and delayed
> feedback cycles. The project would not be able to accommodate 2x or 5x
> the volume of PRs that we have now, and so it is urgent that we
> develop a scalable cross-platform CI solution that is under this
> community's control and does not require a high maintenance burden, so
> if we need to increase the amount of resources dedicated to CI we can
> unilaterally do so.
>
> [1]: https://gist.github.com/wesm/78bfda4cef3b23a5193cf4fb8a6540fb

Re: [DISCUSS] Understanding Arrow's CI problems and needs

Posted by Wes McKinney <we...@gmail.com>.
Thanks Neal for starting this discussion. I will review and comment.

I will say that as a maintainer the current situation is very nearly
intolerable. As by far and away the most prolific merger-of-PRs [1],
I've been negatively affected by the long queueing times and delayed
feedback cycles. The project would not be able to accommodate 2x or 5x
the volume of PRs that we have now, and so it is urgent that we
develop a scalable cross-platform CI solution that is under this
community's control and does not require a high maintenance burden, so
if we need to increase the amount of resources dedicated to CI we can
unilaterally do so.

[1]: https://gist.github.com/wesm/78bfda4cef3b23a5193cf4fb8a6540fb

On Tue, Oct 1, 2019 at 1:38 PM Neal Richardson
<ne...@gmail.com> wrote:
>
> Hi all,
> Over the last few months, I've seen a lot of frustration and
> discussion around the shortcomings of our current CI. I'm also seeing
> debate over a few possible solutions; unfortunately, the debates tend
> not to resolve in a clear, decisive way, and we end up having the same
> debates repeatedly.
>
> In my experience, this pattern often happens when there's not a shared
> understanding of the problems we're trying to solve--it's hard to
> agree on a solution if we don't agree on the problem. To help us reach
> consensus on the problems, I've started a document:
> https://docs.google.com/document/d/1fToW48TO-B9T8VRi0_Z30fDJkjOrBisc-Fr8Epl50s4/edit#
>
> Please have a look and add/edit freely. I've tried to capture the
> arguments I've seen go by the mailing list, as well as some from my
> own experience, but if I've mischaracterized anything, please rectify.
>
> I know several people have been exploring some potential solutions,
> and I hope this document can help us begin to discuss their relative
> merits more objectively and practically.
>
> Neal