Posted to dev@mxnet.apache.org by Marco de Abreu <ma...@googlemail.com> on 2018/03/01 22:11:01 UTC

Refactoring docker handling on CI and locally

Hello,

We have identified issues with the way CI handles Docker containers and
thus started a rewrite of this component.

Besides problems that limit the maintainability of our CI system, the
current solution is not extensible in terms of other operating systems
running inside the container. But more importantly, developers are not able
to reproduce CI issues locally on their computer. We would like to give
every developer the chance to reproduce problems locally without going
through an iterative trial-and-error process assisted by our CI system. I
will follow up with more documentation at a later stage.

This project will change the structure of our Dockerfiles, remove a lot of
code from the Jenkinsfile, centralize the configuration and behaviour as
well as clean up some files. Please expect some merge conflicts after this
PR has been shipped. Right now we're in the process of migrating all jobs
to the new layout; you can track the progress at
https://github.com/apache/incubator-mxnet/pull/9946.

Please don't hesitate to reach out to me in case of any questions.

Best regards,
Marco

Re: Refactoring docker handling on CI and locally

Posted by Marco de Abreu <ma...@googlemail.com>.
Hello,

Documentation for this refactor is available at
https://cwiki.apache.org/confluence/display/MXNET/Reproducing+test+results.
Please direct people who have trouble reproducing test failures to this
guide.

Best regards,
Marco

On Sat, Mar 10, 2018 at 1:23 AM, Marco de Abreu <
marco.g.abreu@googlemail.com> wrote:

> Hello,
>
> FYI, this change has just been merged to master, and PRs containing
> modifications to Dockerfiles will have to be adapted to the new layout.
>
> I will write the documentation on Monday; the issue can be tracked at
> https://issues.apache.org/jira/browse/MXNET-71. Please don't hesitate to
> reach out to me if any issues arise or in case you have questions.
>
> Best regards,
> Marco

Re: Refactoring docker handling on CI and locally

Posted by Marco de Abreu <ma...@googlemail.com>.
Hello,

FYI, this change has just been merged to master, and PRs containing
modifications to Dockerfiles will have to be adapted to the new layout.

I will write the documentation on Monday; the issue can be tracked at
https://issues.apache.org/jira/browse/MXNET-71. Please don't hesitate to
reach out to me if any issues arise or in case you have questions.

Best regards,
Marco

On Fri, Mar 2, 2018 at 11:27 AM, Pedro Larroy <pe...@gmail.com>
wrote:

> This is also part of clearing the path to integrate IoT devices into the
> testing environment.
>
> In addition, we have too much logic in the Jenkinsfile, which is an
> anti-pattern. Most of what CI does will be reproducible locally with a
> simple command, which I think is very useful for testing PRs or local
> changes.
>
> @Tianqi thanks for your comments in the PR.

Re: Refactoring docker handling on CI and locally

Posted by Pedro Larroy <pe...@gmail.com>.
This is also part of clearing the path to integrate IoT devices into the
testing environment.

In addition, we have too much logic in the Jenkinsfile, which is an
anti-pattern. Most of what CI does will be reproducible locally with a
simple command, which I think is very useful for testing PRs or local
changes.
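
For illustration only, here is a minimal sketch of how a single local
command could assemble the same docker invocations that a CI job would
run. It is not the code from the PR: the directory name, the Dockerfile
naming scheme, the image tags and all function names are hypothetical.
Only the docker flags used (-f, -t, --rm, -v, -w) are standard options.

```python
import shlex
from typing import List

# Hypothetical layout: one Dockerfile per platform in a single directory.
# The directory and the naming scheme are assumptions for illustration.
DOCKERFILE_DIR = "docker"


def dockerfile_path(platform: str) -> str:
    """Return the (assumed) Dockerfile location for a platform name."""
    return f"{DOCKERFILE_DIR}/Dockerfile.build.{platform}"


def image_tag(platform: str) -> str:
    """Return a local image tag derived from the platform name."""
    return f"ci-local/build.{platform}"


def build_command(platform: str) -> List[str]:
    """Assemble the `docker build` call for one platform."""
    return [
        "docker", "build",
        "-f", dockerfile_path(platform),
        "-t", image_tag(platform),
        DOCKERFILE_DIR,
    ]


def run_command(platform: str, source_dir: str, command: List[str]) -> List[str]:
    """Assemble the `docker run` call that executes `command` in the image.

    The source tree is mounted at /work, which is also the working directory.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{source_dir}:/work",
        "-w", "/work",
        image_tag(platform),
    ] + list(command)


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch can be
    # inspected without docker being installed.
    for cmd in (build_command("centos7"),
                run_command("centos7", "/path/to/source", ["make"])):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the platform-to-Dockerfile mapping in one place is what lets the
Jenkinsfile and a developer's shell call the same entry point with the
same arguments.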

@Tianqi thanks for your comments in the PR.


On Thu, Mar 1, 2018 at 11:43 PM, Marco de Abreu <
marco.g.abreu@googlemail.com> wrote:

> Exactly. The current implementation using ci_build.sh is tightly integrated
> into the system, especially due to /tests/ci_build/with_the_same_user, which
> can only be executed with Ubuntu as the container OS. Since we'll add
> CentOS 7 to our supported OSes, this change was necessary. We also used
> that moment to restructure the Dockerfiles to allow better maintainability.
>
> In terms of the OS the Docker host is using, we will still be targeting
> Ubuntu 16.04, but we're also going to establish compatibility with Mac.
> Docker on Windows might work, but I don't think it's worth investing time
> in exploring that direction.
>
> -Marco

Re: Refactoring docker handling on CI and locally

Posted by Marco de Abreu <ma...@googlemail.com>.
Exactly. The current implementation using ci_build.sh is tightly integrated
into the system, especially due to /tests/ci_build/with_the_same_user, which
can only be executed with Ubuntu as the container OS. Since we'll add
CentOS 7 to our supported OSes, this change was necessary. We also used
that moment to restructure the Dockerfiles to allow better maintainability.
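
As a hedged illustration of the portability point: one way to have the
container process run with the invoking user's identity, without a
helper script that depends on the container's distribution, is Docker's
standard --user option. The thread does not say whether the rewrite
uses this approach; the function name, the /work mount point and the
image name below are hypothetical.

```python
import os
from typing import List, Optional


def docker_run_as_user(image: str,
                       source_dir: str,
                       command: List[str],
                       uid: Optional[int] = None,
                       gid: Optional[int] = None) -> List[str]:
    """Assemble a `docker run` call that runs `command` as a given UID:GID.

    If uid/gid are omitted, the IDs of the calling process are used
    (os.getuid/os.getgid are only available on Unix-like hosts).
    """
    if uid is None:
        uid = os.getuid()
    if gid is None:
        gid = os.getgid()
    return [
        "docker", "run", "--rm",
        # Numeric IDs work even if no matching user exists in the image.
        "--user", f"{uid}:{gid}",
        "-v", f"{source_dir}:/work",
        "-w", "/work",
        image,
    ] + list(command)


if __name__ == "__main__":
    # Hypothetical image name; prints the command rather than running it.
    print(docker_run_as_user("ci-local/build.centos7", os.getcwd(), ["make"]))
```

Because the IDs are passed numerically, the same call works regardless
of whether the image is based on Ubuntu or CentOS; files written to the
mounted directory end up owned by the invoking user.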

In terms of the OS the Docker host is using, we will still be targeting
Ubuntu 16.04, but we're also going to establish compatibility with Mac.
Docker on Windows might work, but I don't think it's worth investing time
in exploring that direction.

-Marco

On Thu, Mar 1, 2018 at 11:29 PM, Tianqi Chen <tq...@cs.washington.edu>
wrote:

> Just to be clear, is this mainly about rewriting ci_build.sh in Python so
> that it is more portable across platforms?
>
> Tianqi

Re: Refactoring docker handling on CI and locally

Posted by Tianqi Chen <tq...@cs.washington.edu>.
Just to be clear, is this mainly about rewriting ci_build.sh in Python so
that it is more portable across platforms?

Tianqi

On Thu, Mar 1, 2018 at 2:11 PM, Marco de Abreu <marco.g.abreu@googlemail.com>
wrote:

> Hello,
>
> we have identified issues with the way CI handles docker containers and
> thus started a rewrite of this component.
>
> Besides problems that limit the maintainability of our CI system, the
> current solution is not extensible in terms of other operating systems
> running inside the container. But more importantly, developers are not able
> to reproduce CI issues locally on their computer. We would like to give
> every developer the chance to reproduce problems locally without going
> through an iterative trial-and-error process assisted by our CI system. I
> will follow up with more documentation at a later stage.
>
> This project will change the structure of our Dockerfiles, remove a lot of
> code from the Jenkinsfile, centralize the configuration and behaviour as
> well as clean up some files. Please expect some merge conflicts after this
> PR has been shipped. Right now we're in the process of migrating all jobs
> to the new layout; you can track the progress at
> https://github.com/apache/incubator-mxnet/pull/9946.
>
> Please don't hesitate to reach out to me in case of any questions.
>
> Best regards,
> Marco
>