Posted to dev@flink.apache.org by Aljoscha Krettek <al...@apache.org> on 2018/03/09 09:47:14 UTC

[DISCUSS] Add end-to-end tests using Docker Compose

Hi All,

Stephan pointed this out to me the other day, so here goes: as some of you might know, there are end-to-end tests in flink-end-to-end-tests that run a proper Flink cluster (on the local machine) and execute some tests. This catches bugs that you only catch when using Flink as a user, because these tests exercise the whole system. We should add tests there that verify integration with other systems. For example, there are Docker Compose configurations for starting complete Hadoop [1] or Mesos [2] clusters, and there are other files for starting ZooKeeper, Kafka, and so on. We can use these to spin up a testing cluster, run Flink on YARN and Mesos, and have a reproducible environment.

As a next step, we could perform the sort of tests we do for a release, as described here: [3]. For example, the test where we run a job, kill processes, and verify that Flink recovers correctly and that the HA setup works as intended.
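The recovery check described above boils down to: submit a job, kill a process, then poll until the job reports RUNNING again. The polling core might look like the following sketch. The probe is deliberately pluggable, since the real check would be a REST query against the JobManager (e.g. curl plus a JSON parser), which is environment-specific; `fake_probe` below is a hypothetical stand-in.

```shell
#!/bin/sh
# Sketch: poll a probe command until it reports the expected job state,
# or give up after a number of attempts. In a real recovery test the
# probe would query Flink's JobManager REST endpoint; here it is a
# pluggable command so the retry logic stands on its own.
set -eu

wait_for_state() {
  expected="$1"
  probe="$2"
  retries="${3:-10}"
  i=0
  while [ "$i" -lt "$retries" ]; do
    if [ "$($probe)" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for state $expected" >&2
  return 1
}

# Stand-in probe; a real test would query Flink's REST API instead.
fake_probe() { echo "RUNNING"; }

wait_for_state RUNNING fake_probe 5 && echo "job recovered"
```

The kill step itself would just be `docker-compose kill taskmanager` (or similar) before the poll, with cleanup in a shell trap.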

What do you think?

By the way, I'm also mostly writing to see if anyone has some experience with Docker/Docker Compose and would be interested in getting started on this. I would do it myself, because having more automated tests would help me sleep better at night, but I'm currently too busy with other things. 😉

[1] https://github.com/big-data-europe/docker-hadoop/blob/master/docker-compose.yml
[2] https://github.com/bobrik/mesos-compose
[3]
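To make the idea concrete, a Compose file for the ZooKeeper/Kafka case can be quite small. This is only a sketch: the image names, tags, and environment variables are illustrative and would need to be pinned and verified for reproducible CI runs, not a vetted setup.

```yaml
# Sketch of a docker-compose.yml for a throwaway Kafka test cluster.
# Images, tags, and environment variables are illustrative only.
version: "2"
services:
  zookeeper:
    image: zookeeper:3.4
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka:1.0.0
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_HOST_NAME: localhost
    ports:
      - "9092:9092"
```

A test script would then do `docker-compose up -d`, run the Flink job against `localhost:9092`, and call `docker-compose down -v` from a trap so the containers are cleaned up even on failure.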

Re: [DISCUSS] Add end-to-end tests using Docker Compose

Posted by Bowen Li <bo...@gmail.com>.
That is great!


Re: [DISCUSS] Add end-to-end tests using Docker Compose

Posted by Aljoscha Krettek <al...@apache.org>.
Sorry, the missing link for [3] is this: https://docs.google.com/document/d/1cOkycJwEKVjG_onnpl3bQNTq7uebh48zDtIJxceyU2E/edit

Regarding build time, I recently merged this change: https://issues.apache.org/jira/browse/FLINK-8911. It introduces a separation between "pre-commit tests" and "nightly tests". The idea being that the former are executed for each pull request, as they are now, while the latter are executed nightly, or manually if you want to verify a release. This way, the nightly tests can get quite involved without blowing up build time. What do you think?
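The split can be sketched as a small dispatcher script. The test names below are hypothetical placeholders, not the actual scripts that change adds:

```shell
#!/bin/sh
# Sketch of a split end-to-end test runner. Script names are
# hypothetical placeholders, not the ones introduced by FLINK-8911.
set -eu

MODE="${1:-pre-commit}"

# Cheap smoke tests that every pull request can afford.
PRE_COMMIT_TESTS="test_batch_wordcount.sh test_streaming_classloader.sh"

# Heavier tests (HA failover, YARN/Mesos via Docker Compose) reserved
# for the nightly run or release verification.
NIGHTLY_TESTS="test_ha_recovery.sh test_yarn_docker_compose.sh"

case "$MODE" in
  pre-commit) TESTS="$PRE_COMMIT_TESTS" ;;
  nightly)    TESTS="$PRE_COMMIT_TESTS $NIGHTLY_TESTS" ;;
  *) echo "usage: $0 [pre-commit|nightly]" >&2; exit 1 ;;
esac

for t in $TESTS; do
  echo "would run: $t"
done
```

CI would invoke the `pre-commit` mode per pull request and the `nightly` mode from a scheduled build.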

Best,
Aljoscha



Re: [DISCUSS] Add end-to-end tests using Docker Compose

Posted by Bowen Li <bo...@gmail.com>.
This would be nice. BTW, the [3] link is missing.

I have a few questions, given two concerns: 1) the full build will take
longer and longer, and 2) the build queue on Travis will become more
congested.


   - How much time is this estimated to add to the full build?
   - Do we have plans to further parallelize the full build?
   - Are the new integration tests going to run for every full build, both
   on user machines and on Travis, by default?
   - Shall we add ways to skip them?
      - If manually: add a flag or Maven parameter to disable them?
      - If automatically: when there are new code changes, recognize that
      a change is trivial (like editing docs) and thus not run those
      heavy, time-consuming integration tests?
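The automatic variant could be as simple as classifying the changed files. A sketch (the path conventions and docs-only heuristic here are assumptions, not Flink's actual layout rules); for the manual variant, a flag along the lines of Maven's standard `-DskipTests`, or a dedicated profile, could serve as the opt-out:

```shell
#!/bin/sh
# Sketch of a docs-only change detector: given the files touched by a
# change, decide whether the heavy end-to-end tests can be skipped.
# The path patterns are assumptions for illustration only.
set -eu

is_trivial_change() {
  for f in "$@"; do
    case "$f" in
      docs/*|*.md) ;;    # documentation only: keep checking
      *) return 1 ;;     # anything else: run the full suite
    esac
  done
  return 0
}

# In CI the file list would come from e.g. `git diff --name-only`.
if is_trivial_change docs/setup.md README.md; then
  echo "docs-only change: skipping end-to-end tests"
else
  echo "running end-to-end tests"
fi
```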

Thanks,
Bowen



Re: [DISCUSS] Add end-to-end tests using Docker Compose

Posted by Renjie Liu <li...@gmail.com>.
+1

-- 
Liu, Renjie
Software Engineer, MVAD