Posted to dev@beam.apache.org by Etienne Chauchot <ec...@apache.org> on 2018/03/09 11:08:41 UTC

[PROPOSITION] schedule some sanity tests on a daily basis

Hi guys,

I was looking at the various jenkins jobs and I wanted to submit a proposition:

- Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
regressions. So keep it that way

- Integration tests: AFAIK we only run the ones in the examples module, and only on demand. What about running all the ITs
(in particular the IO ITs) as a cron job on a daily basis with the direct runner? Please note that it will require some
always-up backend infrastructure.

- Performance tests: what about running the Nexmark SMOKE test suite in batch and streaming modes with all the runners on a
daily basis and storing the running times in an RRD database (to see performance regressions)? Please note that not all the
queries run in all the runners in all the modes right now. Also, we have some streaming pipeline termination issues
(see https://issues.apache.org/jira/browse/BEAM-2847)
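For reference, a single SMOKE run can be driven programmatically through the nexmark Main entry point, roughly like this
(a sketch only; the flag names follow NexmarkOptions and should be double-checked):

import org.apache.beam.sdk.nexmark.Main;

public class RunNexmarkSmoke {
  public static void main(String[] args) throws Exception {
    // One batch-mode SMOKE run on the direct runner; all flag values are illustrative.
    Main.main(new String[] {
        "--runner=DirectRunner",
        "--suite=SMOKE",
        "--streaming=false",
        "--manageResources=false",
        "--monitorJobs=true"
    });
  }
}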

I know that Stephen Sisk used to work on these topics. I also talked to the guys from Polidea, but as I understood, they
mainly launch integration tests on the Dataflow runner.

WDYT?

Etienne



Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
Hi all,
Please know that this subject has moved forward. The first PR (https://github.com/apache/beam/pull/5464) writes the
performance results to BigQuery. The second one (https://github.com/apache/beam/pull/4976) runs the PostCommits and
configures the exports to BigQuery. The first PR needs to be merged before the second one, and once the two are merged we
should have tables per query/runner/mode showing the evolution of response time, events/sec and output size in the PerfKit
dashboards for each commit (post-commit jobs and also Jenkins phrases in GitHub).
If that is too frequent, we could then schedule it to something like once a day.
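To make the export concrete, here is a minimal sketch of appending one perf row to BigQuery with Beam's own BigQueryIO
(the table and field names are made up for illustration):

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class NexmarkPerfToBigQuery {
  public static void main(String[] args) {
    TableSchema schema = new TableSchema().setFields(Arrays.asList(
        new TableFieldSchema().setName("query").setType("STRING"),
        new TableFieldSchema().setName("runtimeSec").setType("FLOAT"),
        new TableFieldSchema().setName("eventsPerSec").setType("FLOAT"),
        new TableFieldSchema().setName("outputSize").setType("INTEGER")));

    // One row per query/runner/mode run; the values here are placeholders.
    TableRow row = new TableRow()
        .set("query", "QUERY_5").set("runtimeSec", 12.3)
        .set("eventsPerSec", 81300.0).set("outputSize", 9216L);

    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    p.apply(Create.of(row).withCoder(TableRowJsonCoder.of()))
        .apply(BigQueryIO.writeTableRows()
            .to("my-project:nexmark.perf_results") // hypothetical table
            .withSchema(schema)
            .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(WriteDisposition.WRITE_APPEND));
    p.run().waitUntilFinish();
  }
}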

Etienne
On Thursday, March 29, 2018 at 17:40 +0200, Etienne Chauchot wrote:
> Hi all,
> As discussed here I proposed a PR (https://github.com/apache/beam/pull/4976) to schedule nexmark runs as post
> commit tests. The post commits run:
>  * NM on direct runner in batch mode
>  * NM on direct runner in streaming mode
>  * NM on flink runner in batch mode
>  * NM on flink runner in streaming mode
>  * NM on spark runner in batch mode
> These are the runners/modes for which all the nexmark queries run fine.
> A persistent output (such as a database) still needs to be added to nexmark (it just writes to the console for now).
> If it is too costly, we could schedule less.
> Etienne
> > So what next? Shall we schedule nexmark runs and add a Bigquery sink to nexmark output?
> > On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
> > > Thanks everyone for your comments and support.
> > > On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> > > > Great ideas. I want to see a daily signal for anything that could prevent a release from happening, and
> > > > precommits that are fast and reliable for areas that are commonly broken by code changes.
> > > > We are now running the java quickstarts daily on a cron schedule, using direct, dataflow, and local spark and
> > > > flink in the beam_PostRelease_NightlySnapshot job, see https://github.com/apache/beam/blob/master/release/build.gradle
> > > > This should provide a good signal for the examples integration tests against these runners.
> > > > 
> > > > As Kenn noted, the java_maveninstall also runs lots of tests. It would be good to be more clear and intentional
> > > > about which tests run when, and to consider implementing additional "always up" environments for use by the
> > > > tests.
> > > > Having the nexmark smoke tests run regularly and stored in a database would really enhance our efforts, perhaps
> > > > starting with directrunner for the performance tests.
> > > 
> > > Yes
> > > > What area would have the most immediate impact? Nexmark smoke tests?
> > > 
> > > Yes IMHO I think that Nexmark smoke tests would have a great return on investment. By just scheduling some of them
> > > (at first),  we enable deep confidence in the runners on real user pipelines. In the past Nexmark has allowed to
> > > discover regressions in performance before a release and also to discover some bugs in some runners. But, please
> > > note that, for this last ability, Nexmark is limited currently: it only detects failures if an exception is
> > > thrown, there is no check of the correctness of the output PCollection because the aim was performance tests and
> > > there is no point adding a slow test for correctness. Nevertheless, if we store the output size (as I suggested in
> > > this thread), we can get a hint on a failure if the output size is different from the last stored output sizes.
> > > Etienne
> > > > 
> > > > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <kl...@google.com> wrote:
> > > > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org> wrote:
> > > > > > Hi guys,
> > > > > > 
> > > > > > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > > > > > 
> > > > > > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> > > > > > regressions. So keep it that way
> > > > > 
> > > > > We've also toyed with precommit for runners where it is fast.
> > > > > 
> > > > > > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all
> > > > > > the IT (in particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will
> > > > > > require some always up backend infrastructure.
> > > > > 
> > > > > I like this idea. We actually run more, but in postcommit. You can see the goal here:
> > > > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > > > > There's no infrastructure set up that I see. It is only DirectRunner and DataflowRunner currently, as they are
> > > > > "always up". But so could be local Flink and Spark. Do the ITs spin up local versions of what they are
> > > > > connecting to?
> > > > > If we have adequate resources, I also think ValidatesRunner on a real cluster would add value, once we have
> > > > > the cluster set up / tear down or "always up".
> > > > > 
> > > > > > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the
> > > > > > runners on a daily basis and store the running times in a RRD database (to see performance regressions)?
> > > > > 
> > > > > I like this idea, too. I think we could do DirectRunner (and probably local Flink) as postcommit without being
> > > > > too expensive.
> > > > > Kenn
> > > > > 
> > > > > > Please note that not all the queries run in all the runners in all the modes right now. Also, we have some
> > > > > > streaming pipelines termination issues (see https://issues.apache.org/jira/browse/BEAM-2847)
> > > > > > 
> > > > > > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I
> > > > > > understood, they launch mainly integration tests on Dataflow runner.
> > > > > > 
> > > > > > WDYT?
> > > > > > 
> > > > > > Etienne

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
Hi all,
As discussed here, I proposed a PR (https://github.com/apache/beam/pull/4976) to schedule nexmark runs as post commit
tests. The post commits run:
 * NM on direct runner in batch mode
 * NM on direct runner in streaming mode
 * NM on flink runner in batch mode
 * NM on flink runner in streaming mode
 * NM on spark runner in batch mode
These are the runners/modes for which all the nexmark queries run fine.
A persistent output (such as a database) still needs to be added to nexmark (it just writes to the console for now).
If it is too costly, we could schedule less.
Etienne
> 
> So what next? Shall we schedule nexmark runs and add a Bigquery sink to nexmark output?
> 
> On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
> > Thanks everyone for your comments and support.
> > 
> > On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> > > Great ideas. I want to see a daily signal for anything that could prevent a release from happening, and precommits
> > > that are fast and reliable for areas that are commonly broken by code changes.
> > > 
> > > We are now running the java quickstarts daily on a cron schedule, using direct, dataflow, and local spark and
> > > flink in the beam_PostRelease_NightlySnapshot job, see https://github.com/apache/beam/blob/master/release/build.gradle
> > > This should provide a good signal for the examples integration tests against these runners.
> > > 
> > > As Kenn noted, the java_maveninstall also runs lots of tests. It would be good to be more clear and intentional
> > > about which tests run when, and to consider implementing additional "always up" environments for use by the tests.
> > > 
> > > Having the nexmark smoke tests run regularly and stored in a database would really enhance our efforts, perhaps
> > > starting with directrunner for the performance tests.
> > Yes
> > 
> > > 
> > > What area would have the most immediate impact? Nexmark smoke tests?
> > Yes IMHO I think that Nexmark smoke tests would have a great return on investment. By just scheduling some of them
> > (at first),  we enable deep confidence in the runners on real user pipelines. In the past Nexmark has allowed to
> > discover regressions in performance before a release and also to discover some bugs in some runners. But, please
> > note that, for this last ability, Nexmark is limited currently: it only detects failures if an exception is thrown,
> > there is no check of the correctness of the output PCollection because the aim was performance tests and there is no
> > point adding a slow test for correctness. Nevertheless, if we store the output size (as I suggested in this thread),
> > we can get a hint on a failure if the output size is different from the last stored output sizes.
> > 
> > Etienne
> > 
> > > 
> > > 
> > > 
> > > 
> > > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <kl...@google.com> wrote:
> > > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org> wrote:
> > > > > Hi guys,
> > > > > 
> > > > > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > > > > 
> > > > > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to
> > > > > see
> > > > > regressions. So keep it that way
> > > > We've also toyed with precommit for runners where it is fast.
> > > >  
> > > > > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all
> > > > > the IT (in
> > > > > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some
> > > > > always up
> > > > > backend infrastructure.
> > > > I like this idea. We actually run more, but in postcommit. You can see the goal here:
> > > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > > > 
> > > > There's no infrastructure set up that I see. It is only DirectRunner and DataflowRunner currently, as they are
> > > > "always up". But so could be local Flink and Spark. Do the ITs spin up local versions of what they are
> > > > connecting to?
> > > > 
> > > > If we have adequate resources, I also think ValidatesRunner on a real cluster would add value, once we have the
> > > > cluster set up / tear down or "always up".
> > > >  
> > > > > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the
> > > > > runners on a
> > > > > daily basis and store the running times in a RRD database (to see performance regressions)?
> > > > I like this idea, too. I think we could do DirectRunner (and probably local Flink) as postcommit without being
> > > > too expensive.
> > > > 
> > > > Kenn
> > > > 
> > > >  
> > > > > Please note that not all the
> > > > > queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination
> > > > > issues
> > > > > (see https://issues.apache.org/jira/browse/BEAM-2847)
> > > > > 
> > > > > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood,
> > > > > they
> > > > > launch mainly integration tests on Dataflow runner.
> > > > > 
> > > > > WDYT?
> > > > > 
> > > > > Etienne
> > > > > 
> > > > > 
> > > > > 

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi,

I would suggest preparing a Maven profile to perform the nexmark runs.
Then, I can set up a job (seed/manual) in Jenkins to run it.

Regards
JB

On 15/03/2018 22:13, Etienne Chauchot wrote:
> 
> So what next? Shall we schedule nexmark runs and add a Bigquery sink to 
> nexmark output?
> 
> On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
>> Thanks everyone for your comments and support.
>>
>> On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
>>> Great ideas. I want to see a daily signal for anything that could 
>>> prevent a release from happening, and precommits that are fast and 
>>> reliable for areas that are commonly broken by code changes.
>>>
>>> We are now running the java quickstarts daily on a cron schedule, 
>>> using direct, dataflow, and local spark and flink in 
>>> the beam_PostRelease_NightlySnapshot job, see 
>>> https://github.com/apache/beam/blob/master/release/build.gradle This 
>>> should provide a good signal for the examples integration tests 
>>> against these runners.
>>>
>>> As Kenn noted, the java_maveninstall also runs lots of tests. It 
>>> would be good to be more clear and intentional about which tests run 
>>> when, and to consider implementing additional "always up" 
>>> environments for use by the tests.
>>>
>>> Having the nexmark smoke tests run regularly and stored in a database 
>>> would really enhance our efforts, perhaps starting with directrunner 
>>> for the performance tests.
>>
>> Yes
>>
>>>
>>> What area would have the most immediate impact? Nexmark smoke tests?
>>
>> Yes IMHO I think that Nexmark smoke tests would have a great return on 
>> investment. By just scheduling some of them (at first),  we enable 
>> deep confidence in the runners on real user pipelines. In the past 
>> Nexmark has allowed to discover regressions in performance before a 
>> release and also to discover some bugs in some runners. But, please 
>> note that, for this last ability, Nexmark is limited currently: it 
>> only detects failures if an exception is thrown, there is no check of 
>> the correctness of the output PCollection because the aim was 
>> performance tests and there is no point adding a slow test for 
>> correctness. Nevertheless, if we store the output size (as I suggested 
>> in this thread), we can get a hint on a failure if the output size is 
>> different from the last stored output sizes.
>>
>> Etienne
>>
>>>
>>>
>>>
>>>
>>> On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <klk@google.com> wrote:
>>>> On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <echauchot@apache.org> wrote:
>>>>> Hi guys,
>>>>>
>>>>> I was looking at the various jenkins jobs and I wanted to submit a 
>>>>> proposition:
>>>>>
>>>>> - Validates runner tests: currently run at PostCommit for all the 
>>>>> runners. I think it is the quickest way to see
>>>>> regressions. So keep it that way
>>>>
>>>> We've also toyed with precommit for runners where it is fast.
>>>>> - Integration tests: AFAIK we only run the ones in examples module 
>>>>> and only on demand. What about running all the IT (in
>>>>> particular IO IT) as a cron job on a daily basis with direct 
>>>>> runner? Please note that it will require some always up
>>>>> backend infrastructure.
>>>>
>>>> I like this idea. We actually run more, but in postcommit. You can 
>>>> see the goal here: 
>>>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
>>>>
>>>> There's no infrastructure set up that I see. It is only DirectRunner 
>>>> and DataflowRunner currently, as they are "always up". But so could 
>>>> be local Flink and Spark. Do the ITs spin up local versions of what 
>>>> they are connecting to?
>>>>
>>>> If we have adequate resources, I also think ValidatesRunner on a 
>>>> real cluster would add value, once we have the cluster set up / tear 
>>>> down or "always up".
>>>>
>>>>> - Performance tests: what about running Nexmark SMOKE test suite in 
>>>>> batch and streaming modes with all the runners on a
>>>>> daily basis and store the running times in a RRD database (to see 
>>>>> performance regressions)?
>>>>
>>>> I like this idea, too. I think we could do DirectRunner (and 
>>>> probably local Flink) as postcommit without being too expensive.
>>>>
>>>> Kenn
>>>>
>>>>> Please note that not all the
>>>>> queries run in all the runners in all the modes right now. Also, we 
>>>>> have some streaming pipelines termination issues
>>>>> (see https://issues.apache.org/jira/browse/BEAM-2847)
>>>>>
>>>>> I know that Stephen Sisk use to work on these topics. I also talked 
>>>>> to guys from Polidea. But As I understood, they
>>>>> launch mainly integration tests on Dataflow runner.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> Etienne
>>>>>
>>>>>
>>>>>
>>>>

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
So what next? Shall we schedule nexmark runs and add a BigQuery sink to the nexmark output?
On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
> Thanks everyone for your comments and support.
> 
> On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> > Great ideas. I want to see a daily signal for anything that could prevent a release from happening, and precommits
> > that are fast and reliable for areas that are commonly broken by code changes.
> > 
> > We are now running the java quickstarts daily on a cron schedule, using direct, dataflow, and local spark and flink
> > in the beam_PostRelease_NightlySnapshot job, see https://github.com/apache/beam/blob/master/release/build.gradle
> > This should provide a good signal for the examples integration tests against these runners.
> > 
> > As Kenn noted, the java_maveninstall also runs lots of tests. It would be good to be more clear and intentional
> > about which tests run when, and to consider implementing additional "always up" environments for use by the tests.
> > 
> > Having the nexmark smoke tests run regularly and stored in a database would really enhance our efforts, perhaps
> > starting with directrunner for the performance tests.
> Yes
> 
> > 
> > What area would have the most immediate impact? Nexmark smoke tests?
> Yes IMHO I think that Nexmark smoke tests would have a great return on investment. By just scheduling some of them (at
> first),  we enable deep confidence in the runners on real user pipelines. In the past Nexmark has allowed to discover
> regressions in performance before a release and also to discover some bugs in some runners. But, please note that, for
> this last ability, Nexmark is limited currently: it only detects failures if an exception is thrown, there is no check
> of the correctness of the output PCollection because the aim was performance tests and there is no point adding a slow
> test for correctness. Nevertheless, if we store the output size (as I suggested in this thread), we can get a hint on
> a failure if the output size is different from the last stored output sizes.
> 
> Etienne
> 
> > 
> > 
> > 
> > 
> > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <kl...@google.com> wrote:
> > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org> wrote:
> > > > Hi guys,
> > > > 
> > > > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > > > 
> > > > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> > > > regressions. So keep it that way
> > > We've also toyed with precommit for runners where it is fast.
> > >  
> > > > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all
> > > > the IT (in
> > > > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some
> > > > always up
> > > > backend infrastructure.
> > > I like this idea. We actually run more, but in postcommit. You can see the goal here:
> > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > > 
> > > There's no infrastructure set up that I see. It is only DirectRunner and DataflowRunner currently, as they are
> > > "always up". But so could be local Flink and Spark. Do the ITs spin up local versions of what they are connecting
> > > to?
> > > 
> > > If we have adequate resources, I also think ValidatesRunner on a real cluster would add value, once we have the
> > > cluster set up / tear down or "always up".
> > >  
> > > > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the
> > > > runners on a
> > > > daily basis and store the running times in a RRD database (to see performance regressions)?
> > > I like this idea, too. I think we could do DirectRunner (and probably local Flink) as postcommit without being too
> > > expensive.
> > > 
> > > Kenn
> > > 
> > >  
> > > > Please note that not all the
> > > > queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination
> > > > issues
> > > > (see https://issues.apache.org/jira/browse/BEAM-2847)
> > > > 
> > > > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood,
> > > > they
> > > > launch mainly integration tests on Dataflow runner.
> > > > 
> > > > WDYT?
> > > > 
> > > > Etienne
> > > > 
> > > > 
> > > > 

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
Thanks everyone for your comments and support.
On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> Great ideas. I want to see a daily signal for anything that could prevent a release from happening, and precommits
> that are fast and reliable for areas that are commonly broken by code changes.
> 
> We are now running the java quickstarts daily on a cron schedule, using direct, dataflow, and local spark and flink in
> the beam_PostRelease_NightlySnapshot job, see https://github.com/apache/beam/blob/master/release/build.gradle This
> should provide a good signal for the examples integration tests against these runners.
> 
> As Kenn noted, the java_maveninstall also runs lots of tests. It would be good to be more clear and intentional about
> which tests run when, and to consider implementing additional "always up" environments for use by the tests.
> 
> Having the nexmark smoke tests run regularly and stored in a database would really enhance our efforts, perhaps
> starting with directrunner for the performance tests.
Yes
> > What area would have the most immediate impact? Nexmark smoke tests?
Yes, IMHO Nexmark smoke tests would have a great return on investment. By just scheduling some of them (at first), we
enable deep confidence in the runners on real user pipelines. In the past Nexmark has allowed us to discover performance
regressions before a release and also to discover some bugs in some runners. But please note that, for this last ability,
Nexmark is currently limited: it only detects failures if an exception is thrown. There is no check of the correctness of
the output PCollection, because the aim was performance tests and there is no point adding a slow test for correctness.
Nevertheless, if we store the output size (as I suggested in this thread), we can get a hint of a failure if the output
size differs from the last stored output sizes.
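To illustrate the kind of hint I mean, a tiny sketch (a purely hypothetical helper, not something that exists in nexmark
today):

import java.util.Arrays;
import java.util.List;

public class OutputSizeHint {
  /** Flags a run whose output size matches none of the recently stored sizes. */
  static boolean looksSuspicious(long latestSize, List<Long> storedSizes) {
    return !storedSizes.isEmpty()
        && storedSizes.stream().noneMatch(s -> s == latestSize);
  }

  public static void main(String[] args) {
    List<Long> history = Arrays.asList(9216L, 9216L, 9216L); // sizes from previous runs
    System.out.println(looksSuspicious(9216L, history)); // false: consistent with history
    System.out.println(looksSuspicious(9000L, history)); // true: hints at a correctness bug
  }
}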
Etienne
> On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <kl...@google.com> wrote:
> > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org> wrote:
> > > Hi guys,
> > > 
> > > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > > 
> > > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> > > regressions. So keep it that way
> > 
> > We've also toyed with precommit for runners where it is fast.
> > 
> > > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all the IT (in
> > > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some always up
> > > backend infrastructure.
> > 
> > I like this idea. We actually run more, but in postcommit. You can see the goal here:
> > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > 
> > There's no infrastructure set up that I see. It is only DirectRunner and DataflowRunner currently, as they are
> > "always up". But so could be local Flink and Spark. Do the ITs spin up local versions of what they are connecting to?
> > 
> > If we have adequate resources, I also think ValidatesRunner on a real cluster would add value, once we have the
> > cluster set up / tear down or "always up".
> > 
> > > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the runners on a
> > > daily basis and store the running times in a RRD database (to see performance regressions)?
> > 
> > I like this idea, too. I think we could do DirectRunner (and probably local Flink) as postcommit without being too expensive.
> > 
> > Kenn
> > 
> > > Please note that not all the
> > > queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination issues
> > > (see https://issues.apache.org/jira/browse/BEAM-2847)
> > > 
> > > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood, they
> > > launch mainly integration tests on Dataflow runner.
> > > 
> > > WDYT?
> > > 
> > > Etienne

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Alan Myrvold <am...@google.com>.
Great ideas. I want to see a daily signal for anything that could prevent a
release from happening, and precommits that are fast and reliable for areas
that are commonly broken by code changes.

We are now running the java quickstarts daily on a cron schedule, using
direct, dataflow, and local spark and flink in
the beam_PostRelease_NightlySnapshot job, see
https://github.com/apache/beam/blob/master/release/build.gradle. This should
provide a good signal for the examples integration tests against these
runners.

As Kenn noted, the java_maveninstall also runs lots of tests. It would be
good to be more clear and intentional about which tests run when, and to
consider implementing additional "always up" environments for use by the
tests.

Having the nexmark smoke tests run regularly, with the results stored in a
database, would really enhance our efforts, perhaps starting with
DirectRunner for the performance tests.

What area would have the most immediate impact? Nexmark smoke tests?




On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <kl...@google.com> wrote:

> On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org>
> wrote:
>
>> Hi guys,
>>
>> I was looking at the various jenkins jobs and I wanted to submit a
>> proposition:
>>
>> - Validates runner tests: currently run at PostCommit for all the
>> runners. I think it is the quickest way to see
>> regressions. So keep it that way
>>
>
> We've also toyed with precommit for runners where it is fast.
>
>
>> - Integration tests: AFAIK we only run the ones in examples module and
>> only on demand. What about running all the IT (in
>> particular IO IT) as a cron job on a daily basis with direct runner?
>> Please note that it will require some always up
>> backend infrastructure.
>>
>
> I like this idea. We actually run more, but in postcommit. You can see the
> goal here:
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
>
> There's no infrastructure set up that I see. It is only DirectRunner and
> DataflowRunner currently, as they are "always up". But so could be local
> Flink and Spark. Do the ITs spin up local versions of what they are
> connecting to?
>
> If we have adequate resources, I also think ValidatesRunner on a real
> cluster would add value, once we have the cluster set up / tear down or
> "always up".
>
>
>> - Performance tests: what about running Nexmark SMOKE test suite in batch
>> and streaming modes with all the runners on a
>> daily basis and store the running times in a RRD database (to see
>> performance regressions)?
>
>
> I like this idea, too. I think we could do DirectRunner (and probably
> local Flink) as postcommit without being too expensive.
>
> Kenn
>
>
>
>> Please note that not all the
>> queries run in all the runners in all the modes right now. Also, we have
>> some streaming pipelines termination issues
>> (see https://issues.apache.org/jira/browse/BEAM-2847)
>>
>> I know that Stephen Sisk use to work on these topics. I also talked to
>> guys from Polidea. But As I understood, they
>> launch mainly integration tests on Dataflow runner.
>>
>> WDYT?
>>
>> Etienne
>>
>>
>>

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
On Friday, March 9, 2018 at 20:57 +0000, Kenneth Knowles wrote:
> On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org> wrote:
> > Hi guys,
> > 
> > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > 
> > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> > regressions. So keep it that way
> We've also toyed with precommit for runners where it is fast.
>  
> > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all the IT
> > (in
> > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some always up
> > backend infrastructure.
> I like this idea. We actually run more, but in postcommit. You can see the goal here:
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> 
> There's no infrastructure set up that I see. It is only DirectRunner and DataflowRunner currently, as they are "always
> up". But so could be local Flink and Spark. Do the ITs spin up local versions of what they are connecting to?
No, currently the IO ITs expect the backend middleware to be already set up and running.
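For readers following along, the pattern is that an IO IT only receives the address of an already-provisioned backend
through its pipeline options, along these lines (the option name below is hypothetical):

import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.testing.TestPipelineOptions;

/** Sketch of the IO IT contract: the test never starts the backend itself,
 *  it is only told where a running instance lives. */
public interface HypotheticalIoItOptions extends TestPipelineOptions {
  @Description("Hostname of the already-running backend, e.g. a Kubernetes service")
  String getBackendServerName();
  void setBackendServerName(String value);
}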
> If we have adequate resources, I also think ValidatesRunner on a real cluster would add value, once we have the cluster set up / tear down or "always up".
big +1
> > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the runners on a
> > daily basis and store the running times in a RRD database (to see performance regressions)?
> I like this idea, too. I think we could do DirectRunner (and probably local Flink) as postcommit without being too expensive.
You mean run nexmark manually as a postCommit for DirectRunner and Flink in addition to scheduling nexmark runs?
Etienne
> Kenn
> 
> > Please note that not all the queries run in all the runners in all the modes right now. Also, we have some
> > streaming pipelines termination issues (see https://issues.apache.org/jira/browse/BEAM-2847)
> > 
> > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood, they
> > launch mainly integration tests on Dataflow runner.
> > 
> > WDYT?
> > 
> > Etienne

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Kenneth Knowles <kl...@google.com>.
On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org>
wrote:

> Hi guys,
>
> I was looking at the various jenkins jobs and I wanted to submit a
> proposition:
>
> - Validates runner tests: currently run at PostCommit for all the runners.
> I think it is the quickest way to see
> regressions. So keep it that way
>

We've also toyed with precommit for runners where it is fast.


> - Integration tests: AFAIK we only run the ones in examples module and
> only on demand. What about running all the IT (in
> particular IO IT) as a cron job on a daily basis with direct runner?
> Please note that it will require some always up
> backend infrastructure.
>

I like this idea. We actually run more, but in postcommit. You can see the
goal here:
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47

There's no infrastructure set up that I see. It is only DirectRunner and
DataflowRunner currently, as they are "always up". But so could be local
Flink and Spark. Do the ITs spin up local versions of what they are
connecting to?

If we have adequate resources, I also think ValidatesRunner on a real
cluster would add value, once we have the cluster set up / tear down or
"always up".


> - Performance tests: what about running Nexmark SMOKE test suite in batch
> and streaming modes with all the runners on a
> daily basis and store the running times in a RRD database (to see
> performance regressions)?


I like this idea, too. I think we could do DirectRunner (and probably local
Flink) as postcommit without being too expensive.

Kenn



> Please note that not all the
> queries run in all the runners in all the modes right now. Also, we have
> some streaming pipelines termination issues
> (see https://issues.apache.org/jira/browse/BEAM-2847)
>
> I know that Stephen Sisk use to work on these topics. I also talked to
> guys from Polidea. But As I understood, they
> launch mainly integration tests on Dataflow runner.
>
> WDYT?
>
> Etienne
>
>
>

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
On Saturday, March 10, 2018 at 12:57 +0100, Łukasz Gajowy wrote:
> > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all the IT
> > (in
> > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some always up
> > backend infrastructure.
> Running IOITs on Direct runner is fairly easy now - all the testing infrastructure is there and the only thing needed
> is switching the runner to Direct so this is nice and low effort. +1
> 
> @Kenneth: currently we spin up required databases using Kubernetes (postgres, mongo on it's way on my branch). We also
> added a hdfs cluster setup but no Jenkins tests are fired on it on regular basis (yet). We also had some problems
> running IOITs on Flink and Spark, see BEAM-3370 and BEAM-3371 so this area may need some more development.
> > 
> > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the runners
> > on a
> > daily basis and store the running times in a RRD database (to see performance regressions)? Please note that not all
> > the
> > queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination issues
> > (see https://issues.apache.org/jira/browse/BEAM-2847)
> +1 too. Currently Performance Tests store results in BigQuery. Do you guys think it's a good idea to store all the
> tests results (Nexmark and IOIT) in one database (not separately)? Or maybe think otherwise?
I think that storing them separately (at least in separate tables) makes sense. Indeed, the nexmark output to store is a
4-column table: queryNum, executionTime, throughput (number of events/s), and size of the output collection, whereas the
IO IT output is more of a status table (testName, status).
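As a sketch, the two row shapes would look something like this (plain Java, field names as above):

/** Row shape for nexmark performance results (one row per query run). */
class NexmarkResultRow {
  String queryNum;        // e.g. "query5"
  double executionTime;   // seconds
  double throughput;      // number of events/s
  long outputSize;        // size of the output collection
}

/** Row shape for IO IT results: essentially a status table. */
class IoItResultRow {
  String testName;        // e.g. "JdbcIOIT"
  String status;          // e.g. "PASSED" or "FAILED"
}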
Etienne
> 2018-03-10 6:59 GMT+01:00 Jean-Baptiste Onofré <jb...@nanthrax.net>:
> > Good ideas !
> > 
> > Validates runner tests and Integration tests should be nightly executed.
> > 
> > For the Performance tests, it's a great idea, but not sure daily basis is required. Maybe two times per week ? As
> > these tests could be long, we should avoid to block executors that could impact our PR build and master build.
> > Maybe we can add Jenkins executors dedicated to PerfTest.
> > 
> > Regards
> > JB
> > 
> > On 09/03/2018 12:08, Etienne Chauchot wrote:
> > > Hi guys,
> > > 
> > > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > > 
> > > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> > > regressions. So keep it that way
> > > 
> > > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all the IT (in
> > > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some always up
> > > backend infrastructure.
> > > 
> > > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the runners on a
> > > daily basis and store the running times in a RRD database (to see performance regressions)? Please note that not all the
> > > queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination issues
> > > (see https://issues.apache.org/jira/browse/BEAM-2847)
> > > 
> > > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood, they
> > > launch mainly integration tests on Dataflow runner.
> > > 
> > > WDYT?
> > > 
> > > Etienne

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Łukasz Gajowy <lu...@gmail.com>.
>
> - Integration tests: AFAIK we only run the ones in examples module and
> only on demand. What about running all the IT (in
> particular IO IT) as a cron job on a daily basis with direct runner?
> Please note that it will require some always up
> backend infrastructure.
>

Running IOITs on the Direct runner is fairly easy now: all the testing
infrastructure is there, and the only thing needed is switching the runner
to Direct, so this is nice and low-effort. +1

@Kenneth: currently we spin up the required databases using Kubernetes
(postgres; mongo is on its way on my branch). We also added an HDFS cluster
setup, but no Jenkins tests are fired on it on a regular basis (yet). We also
had some problems running IOITs on Flink and Spark, see BEAM-3370
and BEAM-3371, so this area may need some more development.

>
> - Performance tests: what about running Nexmark SMOKE test suite in batch
> and streaming modes with all the runners on a
> daily basis and store the running times in a RRD database (to see
> performance regressions)? Please note that not all the
> queries run in all the runners in all the modes right now. Also, we have
> some streaming pipelines termination issues
> (see https://issues.apache.org/jira/browse/BEAM-2847)
>

+1 too. Currently the Performance Tests store their results in BigQuery. Do you guys
think it's a good idea to store all the test results (Nexmark and IOIT) in
one database (not separately)? Or do you think otherwise?


2018-03-10 6:59 GMT+01:00 Jean-Baptiste Onofré <jb...@nanthrax.net>:

> Good ideas !
>
> Validates runner tests and Integration tests should be nightly executed.
>
> For the Performance tests, it's a great idea, but not sure daily basis is
> required. Maybe two times per week ? As these tests could be long, we
> should avoid to block executors that could impact our PR build and master
> build. Maybe we can add Jenkins executors dedicated to PerfTest.
>
> Regards
> JB
>
>
> On 09/03/2018 12:08, Etienne Chauchot wrote:
>
>> Hi guys,
>>
>> I was looking at the various jenkins jobs and I wanted to submit a
>> proposition:
>>
>> - Validates runner tests: currently run at PostCommit for all the
>> runners. I think it is the quickest way to see
>> regressions. So keep it that way
>>
>> - Integration tests: AFAIK we only run the ones in examples module and
>> only on demand. What about running all the IT (in
>> particular IO IT) as a cron job on a daily basis with direct runner?
>> Please note that it will require some always up
>> backend infrastructure.
>>
>> - Performance tests: what about running Nexmark SMOKE test suite in batch
>> and streaming modes with all the runners on a
>> daily basis and store the running times in a RRD database (to see
>> performance regressions)? Please note that not all the
>> queries run in all the runners in all the modes right now. Also, we have
>> some streaming pipelines termination issues
>> (see https://issues.apache.org/jira/browse/BEAM-2847)
>>
>> I know that Stephen Sisk use to work on these topics. I also talked to
>> guys from Polidea. But As I understood, they
>> launch mainly integration tests on Dataflow runner.
>>
>> WDYT?
>>
>> Etienne
>>
>>
>>

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Etienne Chauchot <ec...@apache.org>.
Hi JB,
On Saturday, March 10, 2018 at 06:59 +0100, Jean-Baptiste Onofré wrote:
> Good ideas !
> 
> Validates runner tests and Integration tests should be nightly executed.
> 
> For the Performance tests, it's a great idea, but not sure daily basis 
> is required. Maybe two times per week ? As these tests could be long, we 
> should avoid to block executors that could impact our PR build and 
> master build. 
Yes, I think once or twice a week is ok.
> Maybe we can add Jenkins executors dedicated to PerfTest.
> 

It would be awesome! If you have some contacts in the Apache infra team, maybe you can get us some executors? :)
> Regards
> JB
> 
> On 09/03/2018 12:08, Etienne Chauchot wrote:
> 
> > 
> > Hi guys,
> > 
> > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > 
> > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> > regressions. So keep it that way
> > 
> > - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all the IT (in
> > particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some always up
> > backend infrastructure.
> > 
> > - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the runners on a
> > daily basis and store the running times in a RRD database (to see performance regressions)? Please note that not all the
> > queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination issues
> > (see https://issues.apache.org/jira/browse/BEAM-2847)
> > 
> > I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood, they
> > launch mainly integration tests on Dataflow runner.
> > 
> > WDYT?
> > 
> > Etienne
> > 
> > 
> > 

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Good ideas!

Validates runner tests and Integration tests should be executed nightly.

For the Performance tests, it's a great idea, but I am not sure a daily basis
is required. Maybe two times per week? As these tests could be long, we
should avoid blocking executors, which could impact our PR build and
master build. Maybe we can add Jenkins executors dedicated to PerfTest.

Regards
JB

On 09/03/2018 12:08, Etienne Chauchot wrote:
> Hi guys,
> 
> I was looking at the various jenkins jobs and I wanted to submit a proposition:
> 
> - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way to see
> regressions. So keep it that way
> 
> - Integration tests: AFAIK we only run the ones in examples module and only on demand. What about running all the IT (in
> particular IO IT) as a cron job on a daily basis with direct runner? Please note that it will require some always up
> backend infrastructure.
> 
> - Performance tests: what about running Nexmark SMOKE test suite in batch and streaming modes with all the runners on a
> daily basis and store the running times in a RRD database (to see performance regressions)? Please note that not all the
> queries run in all the runners in all the modes right now. Also, we have some streaming pipelines termination issues
> (see https://issues.apache.org/jira/browse/BEAM-2847)
> 
> I know that Stephen Sisk use to work on these topics. I also talked to guys from Polidea. But As I understood, they
> launch mainly integration tests on Dataflow runner.
> 
> WDYT?
> 
> Etienne
> 
>