Posted to dev@beam.apache.org by Eugene Kirpichov <ki...@google.com.INVALID> on 2017/06/22 22:27:24 UTC

Bundling multiple TestPipeline tests into one pipeline

Hi folks and especially runner developers,

https://issues.apache.org/jira/browse/BEAM-2506 - quoting from there:

Currently ValidatesRunner test suites run one pipeline per unit test. That's
a lot of small pipelines, and it consumes a lot of resources, especially with
a fairly heavyweight runner like Dataflow, so tests take a long time and
can't be run in parallel due to quota issues, etc.

Jason Kuster says he and Davor Bonaci discussed that we could execute
multiple unit tests in a single TestPipeline.

To develop this further: in the case of Java, we could create a custom JUnit
Runner (http://junit.org/junit4/javadoc/4.12/org/junit/runner/Runner.html)
that would apply all the transforms and PAsserts in unit tests to a single
instance of TestPipeline (per class, rather than per method), and run the
whole thing at the end. PAssert captures the source location of its
application, so we could still report which particular test failed.
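
To make the bundling idea concrete, here is a minimal sketch in plain Java,
with no Beam or JUnit dependency. All names in it (SharedPipeline, check,
ExampleTests, runBundled) are hypothetical and exist only for illustration;
they are not Beam or JUnit APIs, and a real implementation would sit behind a
JUnit Runner and use TestPipeline and PAssert instead. It only shows the
shape: every test method applies its checks to one shared object, the caller
is captured at application time, and a single run at the end attributes
failures back to test methods.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.IntSupplier;

// Hypothetical sketch: none of these names are Beam or JUnit APIs.
class BundledPipelineSketch {

  /** One deferred assertion plus the name of the test method that applied it. */
  static final class DeferredCheck {
    final String testName;
    final int expected;
    final IntSupplier actual;

    DeferredCheck(String testName, int expected, IntSupplier actual) {
      this.testName = testName;
      this.expected = expected;
      this.actual = actual;
    }
  }

  /** Stand-in for a per-class TestPipeline: collects checks now, evaluates them later. */
  static final class SharedPipeline {
    private final List<DeferredCheck> checks = new ArrayList<>();

    /** Records a check and captures the calling test method from the stack. */
    void check(int expected, IntSupplier actual) {
      // Frame 0 is check() itself; frame 1 is the test method that called it.
      String caller = new Throwable().getStackTrace()[1].getMethodName();
      checks.add(new DeferredCheck(caller, expected, actual));
    }

    /** Evaluates every collected check once and returns the failing test names, sorted. */
    List<String> run() {
      TreeSet<String> failed = new TreeSet<>();
      for (DeferredCheck c : checks) {
        if (c.actual.getAsInt() != c.expected) {
          failed.add(c.testName);
        }
      }
      return new ArrayList<>(failed);
    }
  }

  /** Example test class: each method only applies checks, it does not run anything. */
  static final class ExampleTests {
    public void testSum(SharedPipeline p) { p.check(3, () -> 1 + 2); }
    public void testProduct(SharedPipeline p) { p.check(7, () -> 2 * 3); } // wrong on purpose
    public void testMax(SharedPipeline p) { p.check(5, () -> Math.max(5, 4)); }
  }

  /** Applies every test method to one shared pipeline, then runs it a single time. */
  static List<String> runBundled(Class<?> testClass) {
    SharedPipeline pipeline = new SharedPipeline();
    try {
      Object instance = testClass.getDeclaredConstructor().newInstance();
      for (Method m : testClass.getDeclaredMethods()) {
        boolean isTest = m.getName().startsWith("test")
            && m.getParameterCount() == 1
            && m.getParameterTypes()[0] == SharedPipeline.class;
        if (isTest) {
          m.invoke(instance, pipeline);
        }
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("could not apply tests to the shared pipeline", e);
    }
    return pipeline.run();
  }

  public static void main(String[] args) {
    // Prints: Failed tests: [testProduct]
    System.out.println("Failed tests: " + runBundled(ExampleTests.class));
  }
}
```

Note that the three test methods share one object and nothing is evaluated
until run(), so in this sketch there is no point at which per-method setup or
teardown could be hooked in.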

This obviously gives less isolation between unit test methods, because they
effectively run in parallel instead of in sequence, so things like
per-method setup and teardown will no longer be applicable. There'll
probably be other issues.

Anyway, this seems doable and high-impact.

Just bringing this to the attention of the community - it seems worth
discussing and perhaps someone will be interested in developing this idea
further or implementing it.

Re: Bundling multiple TestPipeline tests into one pipeline

Posted by Eugene Kirpichov <ki...@google.com.INVALID>.
Another advantage of the "custom runner" approach is that we can convert
existing ValidatesRunner test classes one by one, switching them from
@RunWith(JUnit4.class) to @RunWith(BundledTestPipelines.class) or whatever
(and making other necessary changes).

On Thu, Jun 22, 2017 at 3:48 PM Kenneth Knowles <ke...@apache.org> wrote:

> This is a great idea! Your suggestion to do it via a JUnit test runner
> makes it very concrete.
>
> Kenn

Re: Bundling multiple TestPipeline tests into one pipeline

Posted by Kenneth Knowles <ke...@apache.org>.
This is a great idea! Your suggestion to do it via a JUnit test runner
makes it very concrete.

Kenn

Re: Bundling multiple TestPipeline tests into one pipeline

Posted by Robert Bradshaw <ro...@google.com.INVALID>.
+1

http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201610.mbox/%3CCAFFRZHX4yq%3D%3DxuvkPjwDFezVhWH82oj%2BgpS-OhUMc%3D3QUVaS1g%40mail.gmail.com%3E

On Fri, Jun 23, 2017 at 9:23 AM, Davor Bonaci <da...@apache.org> wrote:

> This would be a great contribution if anyone wants to give it a try.
>
> On Thu, Jun 22, 2017 at 9:23 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
> > Hi Eugene
> >
> > I like the idea!
> >
> > Regards
> > JB

Re: Bundling multiple TestPipeline tests into one pipeline

Posted by Davor Bonaci <da...@apache.org>.
This would be a great contribution if anyone wants to give it a try.

On Thu, Jun 22, 2017 at 9:23 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> Hi Eugene
>
> I like the idea!
>
> Regards
> JB

Re: Bundling multiple TestPipeline tests into one pipeline

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Eugene

I like the idea!

Regards
JB

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com