You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mxnet.apache.org by kellen sunderland <ke...@gmail.com> on 2017/09/28 06:13:49 UTC

RE: Apache MXNet build failures are mostly valid - verify beforemerge

Looking at the email thread it means marking the master branch as protected in GitHub (a functionality they offer): https://help.github.com/articles/about-protected-branches/

Popular open source projects should at a minimum have force push disabled on their master branch to prevent broken downstream git references.  The other protections offered here are also most likely beneficial.  

-Kellen

From: Chris Olivier
Sent: Thursday, September 28, 2017 8:10 AM
To: dev@mxnet.incubator.apache.org
Subject: Re: Apache MXNet build failures are mostly valid - verify beforemerge

What does that mean? "Protected"? Protected from what?

On Wed, Sep 27, 2017 at 11:08 PM Gautam <ga...@gmail.com> wrote:

> Hi Chris,
>
>    I mean make "master branch protected" of  MXNet.
>
> -Gautam
>
> On Wed, Sep 27, 2017 at 11:04 PM, Chris Olivier <cj...@gmail.com>
> wrote:
>
> > What does this mean? "Mx-net branch protected"?
> >
> > On Wed, Sep 27, 2017 at 9:59 PM Tsuyoshi OZAWA <ozawa.tsuyoshi@gmail.com
> >
> > wrote:
> >
> > > +1,
> > >
> > > While I'm checking the recent build failures, and I think the decision
> > > of making the mx-net branch protected is necessary for stable
> > > building.
> > > Thanks Kumar for resuming important discussion.
> > >
> > > Best regards
> > > - Tsuyoshi
> > >
> > > On Thu, Sep 28, 2017 at 12:56 PM, Kumar, Gautam <ga...@amazon.com>
> > wrote:
> > > > Reviving the discussion.
> > > >
> > > > At this point of time we have couple of stable builds
> > > >
> > > https://builds.apache.org/view/Incubator%20Projects/job/
> > incubator-mxnet/job/master/448/
> > > >
> > > https://builds.apache.org/view/Incubator%20Projects/job/
> > incubator-mxnet/job/master/449/
> > > >
> > > > Should we have a quick discussion or polling on making the mx-net
> > branch
> > > protected? If you still think we shouldn’t make it protected please
> > provide
> > > a reason to support your claim.
> > > >
> > > > Few of us have concern over Jenkin’s stability. If I look two weeks
> > > back, after upgrading Linux slave to g2.8x and new windows AMI, we have
> > not
> > > seen any case where instance died due to high memory usage or any
> process
> > > got killed due to high cpu usage or any other issue with windows
> slaves.
> > > >
> > > > Going forward we are also planning that if we add any new slave we
> will
> > > not enable the main load immediately, but rather will do ‘test build’
> to
> > > make sure that new slaves are not causing any infrastructure issue and
> > > capable to perform as good as existing slaves.
> > > >
> > > > -Gautam
> > > >
> > > > On 8/31/17, 5:27 PM, "Lupesko, Hagay" <lu...@gmail.com> wrote:
> > > >
> > > >     @madan looking into some failures – you’re right… there’s
> multiple
> > > issues going on, some of them intermittent, and we want to be able to
> > merge
> > > fixes in.
> > > >     Agreed that we can wait with setting up protected mode until
> build
> > > stabilizes.
> > > >
> > > >     On 8/31/17, 11:41, "Madan Jampani" <ma...@gmail.com>
> > wrote:
> > > >
> > > >         @hagay: we agree on the end state. I'm not too particular
> about
> > > how we get
> > > >         there. If you think enabling it now and fixes regression
> later
> > > is doable,
> > > >         I'm fine with. I see a bit of a chicken and egg problem. We
> > need
> > > to get
> > > >         some fixes in even when the status checks are failing.
> > > >
> > > >         On Thu, Aug 31, 2017 at 11:25 AM, Lupesko, Hagay <
> > > lupesko@gmail.com> wrote:
> > > >
> > > >         > @madan – re: getting to a stable CI first:
> > > >         > I’m concerned that by not enabling protected branch mode
> > ASAP,
> > > we’re just
> > > >         > taking in more regressions, which makes a stable build a
> > > moving target for
> > > >         > us…
> > > >         >
> > > >         > On 8/31/17, 10:49, "Zha, Sheng" <zh...@amazon.com>
> wrote:
> > > >         >
> > > >         >     Just one thing: please don’t disable more tests or just
> > > raise the
> > > >         > tolerance thresholds.
> > > >         >
> > > >         >     Best regards,
> > > >         >     -sz
> > > >         >
> > > >         >     On 8/31/17, 10:45 AM, "Madan Jampani" <
> > > madan.jampani@gmail.com> wrote:
> > > >         >
> > > >         >         +1
> > > >         >         Before we can turn protected mode I feel we should
> > > first get to a
> > > >         > stable CI
> > > >         >         pipeline.
> > > >         >         Sandeep is chasing down known breaking issues.
> > > >         >
> > > >         >
> > > >         >         On Thu, Aug 31, 2017 at 10:27 AM, Hagay Lupesko <
> > > lupesko@gmail.com>
> > > >         > wrote:
> > > >         >
> > > >         >         > Build stability is a major issue, builds have
> been
> > > failing left
> > > >         > and right
> > > >         >         > over the last week. Some of it is due to Jenkins
> > > slave issues,
> > > >         > but some are
> > > >         >         > real regressions.
> > > >         >         > We need to be more strict in the code we're
> > > committing.
> > > >         >         >
> > > >         >         > I propose we configure our master to be a
> protected
> > > branch (
> > > >         >         >
> > > https://help.github.com/articles/about-protected-branches/).
> > > >         >         >
> > > >         >         > Thoughts?
> > > >         >         >
> > > >         >         > On 2017-08-28 22:41, sandeep krishnamurthy <
> > > s...@gmail.com>
> > > >         > wrote:
> > > >         >         > > Hello Committers and Contributors,>
> > > >         >         > >
> > > >         >         > > Due to unstable build pipelines, from past 1
> > week,
> > > PRs are
> > > >         > being merged>
> > > >         >         > > after CR ignoring PR build status. Build
> pipeline
> > > is much more
> > > >         > stable
> > > >         >         > than>
> > > >         >         > > last week and most of the build failures you
> see
> > > from now on,
> > > >         > are likely
> > > >         >         > to>
> > > >         >         > > be a valid failure and hence, it is recommended
> > to
> > > wait for PR
> > > >         > builds,
> > > >         >         > see>
> > > >         >         > > the root cause of any build failures before
> > > proceeding with
> > > >         > merges.>
> > > >         >         > >
> > > >         >         > > At this point of time, there are 2 intermittent
> > > issue yet to
> > > >         > be fixed ->
> > > >         >         > > * Network error leading to GitHub requests
> > > throwing 404>
> > > >         >         > > * A conflict in artifacts generated between
> > > branches/PR -
> > > >         > Cause unknown
> > > >         >         > yet.>
> > > >         >         > > These issues will be fixed soon.>
> > > >         >         > >
> > > >         >         > >
> > > >         >         > > -- >
> > > >         >         > > Sandeep Krishnamurthy>
> > > >         >         > >
> > > >         >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > - Tsuyoshi
> > >
> >
>
>
>
> --
> Best Regards,
> Gautam Kumar
>


Re: Apache MXNet build failures are mostly valid - verify beforemerge

Posted by Chris Olivier <cj...@gmail.com>.
Thank you for the clarification.

Agree force pushed are bad. I don't think anyone is doing this, though.

I am curious though, and this may be an unpopular question, but is there
precedent to tempoaraily disabling write access to people who merge in code
that hasn't passed CI tests and thus break CI (is there a way to tell)?
Certainly for 10% failures, one could argue that it "slipped by", but im
speaking more about blind merges that break everything 100% of the time.
How do other Apache projects deal with this?

On Wed, Sep 27, 2017 at 11:13 PM kellen sunderland <
kellen.sunderland@gmail.com> wrote:

> Looking at the email thread it means marking the master branch as
> protected in GitHub (a functionality they offer):
> https://help.github.com/articles/about-protected-branches/
>
> Popular open source projects should at a minimum have force push disabled
> on their master branch to prevent broken downstream git references.  The
> other protections offered here are also most likely beneficial.
>
> -Kellen
>
> From: Chris Olivier
> Sent: Thursday, September 28, 2017 8:10 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: Apache MXNet build failures are mostly valid - verify
> beforemerge
>
> What does that mean? "Protected"? Protected from what?
>
> On Wed, Sep 27, 2017 at 11:08 PM Gautam <ga...@gmail.com> wrote:
>
> > Hi Chris,
> >
> >    I mean make "master branch protected" of  MXNet.
> >
> > -Gautam
> >
> > On Wed, Sep 27, 2017 at 11:04 PM, Chris Olivier <cj...@gmail.com>
> > wrote:
> >
> > > What does this mean? "Mx-net branch protected"?
> > >
> > > On Wed, Sep 27, 2017 at 9:59 PM Tsuyoshi OZAWA <
> ozawa.tsuyoshi@gmail.com
> > >
> > > wrote:
> > >
> > > > +1,
> > > >
> > > > While I'm checking the recent build failures, and I think the
> decision
> > > > of making the mx-net branch protected is necessary for stable
> > > > building.
> > > > Thanks Kumar for resuming important discussion.
> > > >
> > > > Best regards
> > > > - Tsuyoshi
> > > >
> > > > On Thu, Sep 28, 2017 at 12:56 PM, Kumar, Gautam <ga...@amazon.com>
> > > wrote:
> > > > > Reviving the discussion.
> > > > >
> > > > > At this point of time we have couple of stable builds
> > > > >
> > > > https://builds.apache.org/view/Incubator%20Projects/job/
> > > incubator-mxnet/job/master/448/
> > > > >
> > > > https://builds.apache.org/view/Incubator%20Projects/job/
> > > incubator-mxnet/job/master/449/
> > > > >
> > > > > Should we have a quick discussion or polling on making the mx-net
> > > branch
> > > > protected? If you still think we shouldn’t make it protected please
> > > provide
> > > > a reason to support your claim.
> > > > >
> > > > > Few of us have concern over Jenkin’s stability. If I look two weeks
> > > > back, after upgrading Linux slave to g2.8x and new windows AMI, we
> have
> > > not
> > > > seen any case where instance died due to high memory usage or any
> > process
> > > > got killed due to high cpu usage or any other issue with windows
> > slaves.
> > > > >
> > > > > Going forward we are also planning that if we add any new slave we
> > will
> > > > not enable the main load immediately, but rather will do ‘test build’
> > to
> > > > make sure that new slaves are not causing any infrastructure issue
> and
> > > > capable to perform as good as existing slaves.
> > > > >
> > > > > -Gautam
> > > > >
> > > > > On 8/31/17, 5:27 PM, "Lupesko, Hagay" <lu...@gmail.com> wrote:
> > > > >
> > > > >     @madan looking into some failures – you’re right… there’s
> > multiple
> > > > issues going on, some of them intermittent, and we want to be able to
> > > merge
> > > > fixes in.
> > > > >     Agreed that we can wait with setting up protected mode until
> > build
> > > > stabilizes.
> > > > >
> > > > >     On 8/31/17, 11:41, "Madan Jampani" <ma...@gmail.com>
> > > wrote:
> > > > >
> > > > >         @hagay: we agree on the end state. I'm not too particular
> > about
> > > > how we get
> > > > >         there. If you think enabling it now and fixes regression
> > later
> > > > is doable,
> > > > >         I'm fine with. I see a bit of a chicken and egg problem. We
> > > need
> > > > to get
> > > > >         some fixes in even when the status checks are failing.
> > > > >
> > > > >         On Thu, Aug 31, 2017 at 11:25 AM, Lupesko, Hagay <
> > > > lupesko@gmail.com> wrote:
> > > > >
> > > > >         > @madan – re: getting to a stable CI first:
> > > > >         > I’m concerned that by not enabling protected branch mode
> > > ASAP,
> > > > we’re just
> > > > >         > taking in more regressions, which makes a stable build a
> > > > moving target for
> > > > >         > us…
> > > > >         >
> > > > >         > On 8/31/17, 10:49, "Zha, Sheng" <zh...@amazon.com>
> > wrote:
> > > > >         >
> > > > >         >     Just one thing: please don’t disable more tests or
> just
> > > > raise the
> > > > >         > tolerance thresholds.
> > > > >         >
> > > > >         >     Best regards,
> > > > >         >     -sz
> > > > >         >
> > > > >         >     On 8/31/17, 10:45 AM, "Madan Jampani" <
> > > > madan.jampani@gmail.com> wrote:
> > > > >         >
> > > > >         >         +1
> > > > >         >         Before we can turn protected mode I feel we
> should
> > > > first get to a
> > > > >         > stable CI
> > > > >         >         pipeline.
> > > > >         >         Sandeep is chasing down known breaking issues.
> > > > >         >
> > > > >         >
> > > > >         >         On Thu, Aug 31, 2017 at 10:27 AM, Hagay Lupesko <
> > > > lupesko@gmail.com>
> > > > >         > wrote:
> > > > >         >
> > > > >         >         > Build stability is a major issue, builds have
> > been
> > > > failing left
> > > > >         > and right
> > > > >         >         > over the last week. Some of it is due to
> Jenkins
> > > > slave issues,
> > > > >         > but some are
> > > > >         >         > real regressions.
> > > > >         >         > We need to be more strict in the code we're
> > > > committing.
> > > > >         >         >
> > > > >         >         > I propose we configure our master to be a
> > protected
> > > > branch (
> > > > >         >         >
> > > > https://help.github.com/articles/about-protected-branches/).
> > > > >         >         >
> > > > >         >         > Thoughts?
> > > > >         >         >
> > > > >         >         > On 2017-08-28 22:41, sandeep krishnamurthy <
> > > > s...@gmail.com>
> > > > >         > wrote:
> > > > >         >         > > Hello Committers and Contributors,>
> > > > >         >         > >
> > > > >         >         > > Due to unstable build pipelines, from past 1
> > > week,
> > > > PRs are
> > > > >         > being merged>
> > > > >         >         > > after CR ignoring PR build status. Build
> > pipeline
> > > > is much more
> > > > >         > stable
> > > > >         >         > than>
> > > > >         >         > > last week and most of the build failures you
> > see
> > > > from now on,
> > > > >         > are likely
> > > > >         >         > to>
> > > > >         >         > > be a valid failure and hence, it is
> recommended
> > > to
> > > > wait for PR
> > > > >         > builds,
> > > > >         >         > see>
> > > > >         >         > > the root cause of any build failures before
> > > > proceeding with
> > > > >         > merges.>
> > > > >         >         > >
> > > > >         >         > > At this point of time, there are 2
> intermittent
> > > > issue yet to
> > > > >         > be fixed ->
> > > > >         >         > > * Network error leading to GitHub requests
> > > > throwing 404>
> > > > >         >         > > * A conflict in artifacts generated between
> > > > branches/PR -
> > > > >         > Cause unknown
> > > > >         >         > yet.>
> > > > >         >         > > These issues will be fixed soon.>
> > > > >         >         > >
> > > > >         >         > >
> > > > >         >         > > -- >
> > > > >         >         > > Sandeep Krishnamurthy>
> > > > >         >         > >
> > > > >         >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > - Tsuyoshi
> > > >
> > >
> >
> >
> >
> > --
> > Best Regards,
> > Gautam Kumar
> >
>
>