Posted to dev@beam.apache.org by Etienne Chauchot <ec...@apache.org> on 2018/05/31 13:34:11 UTC

Re: [PROPOSITION] schedule some sanity tests on a daily basis

Hi all, 
please know that this subject has come forward:
The first PR (https://github.com/apache/beam/pull/5464) writes perfs to BQ.
The second one (https://github.com/apache/beam/pull/4976) runs the PostCommits and configures the exports to BQ.
The first PR needs to be merged before the second one, and once the two are merged we should have tables per
query/runner/mode showing the evolution of response time, events/sec, and output size in the perfkit dashboards for
each commit (post-commit jobs and also Jenkins phrases in GitHub).
If this proves too frequent, we could reduce the schedule to something like once a day.
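To make the per-commit dashboard rows above concrete, here is a rough, hypothetical sketch of the kind of record such an export could write to BQ; the field names are illustrative assumptions on my part, not the actual PerfKit/BQ schema used by the PRs.

```python
# Hypothetical sketch of a per-commit Nexmark metrics row; the field
# names are assumptions, not the actual BQ schema used by the PRs above.
import time


def nexmark_result_row(commit, query, runner, mode,
                       runtime_sec, events_per_sec, output_size):
    """Build one dashboard row for a single query/runner/mode run."""
    return {
        "timestamp": int(time.time()),  # when the post-commit ran
        "commit": commit,               # commit under test
        "query": query,                 # e.g. "Query5"
        "runner": runner,               # e.g. "DirectRunner"
        "mode": mode,                   # "batch" or "streaming"
        "runtime_sec": runtime_sec,
        "events_per_sec": events_per_sec,
        "output_size": output_size,     # size of the output PCollection
    }


row = nexmark_result_row("abc123", "Query5", "DirectRunner", "batch",
                         12.3, 8130.0, 100000)
```

One row per query/runner/mode per commit is what lets the dashboards chart the evolution of each metric over time.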

Etienne
Le jeudi 29 mars 2018 à 17:40 +0200, Etienne Chauchot a écrit :
> Hi all,
> As discussed here I proposed a PR (https://github.com/apache/beam/pull/4976) to schedule Nexmark runs as post-commit
> tests. The post-commits run:
> - NM on direct runner in batch mode
> - NM on direct runner in streaming mode
> - NM on flink runner in batch mode
> - NM on flink runner in streaming mode
> - NM on spark runner in batch mode
> These are the runners/modes for which all the Nexmark queries run fine. There is still an output sink, like a
> database, to be added to Nexmark (it just outputs to the console for now). If it is too costly, we could schedule
> less.
> Etienne
> > So what next? Shall we schedule Nexmark runs and add a BigQuery sink to the Nexmark output?
> > Le lundi 12 mars 2018 à 10:30 +0100, Etienne Chauchot a écrit :
> > > Thanks everyone for your comments and support.
> > > Le vendredi 09 mars 2018 à 21:28 +0000, Alan Myrvold a écrit :
> > > > Great ideas. I want to see a daily signal for anything that could prevent a release from happening, and
> > > > precommits that are fast and reliable for areas that are commonly broken by code changes.
> > > > We are now running the java quickstarts daily on a cron schedule, using direct, dataflow, and local spark and
> > > > flink in the beam_PostRelease_NightlySnapshot job, see
> > > > https://github.com/apache/beam/blob/master/release/build.gradle
> > > > This should provide a good signal for the examples integration tests against these runners.
> > > > 
> > > > As Kenn noted, the java_maveninstall also runs lots of tests. It would be good to be more clear and intentional
> > > > about which tests run when, and to consider implementing additional "always up" environments for use by the
> > > > tests.
> > > > Having the nexmark smoke tests run regularly and stored in a database would really enhance our efforts, perhaps
> > > > starting with directrunner for the performance tests.
> > > 
> > > Yes
> > > > What area would have the most immediate impact? Nexmark smoke tests?
> > > 
> > > Yes, IMHO Nexmark smoke tests would have a great return on investment. By just scheduling some of them (at
> > > first), we gain deep confidence in the runners on real user pipelines. In the past, Nexmark has allowed us to
> > > discover performance regressions before a release and also to discover bugs in some runners. But please note
> > > that, for this last ability, Nexmark is currently limited: it only detects failures if an exception is thrown;
> > > there is no check of the correctness of the output PCollection, because the aim was performance testing and
> > > there is no point adding a slow correctness test. Nevertheless, if we store the output size (as I suggested in
> > > this thread), we can get a hint of a failure if the output size differs from the last stored output sizes.
> > > Etienne
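The output-size hint described above could be sketched roughly as follows; the helper name and comparison policy are my own illustrative assumptions, not anything Nexmark defines.

```python
# Rough sketch of the output-size hint: flag a run whose output size
# differs from the recently stored sizes. The function name and history
# policy are illustrative assumptions, not actual Nexmark code.
def output_size_suspicious(history, current):
    """Return True if `current` matches none of the recent stored sizes."""
    if not history:
        return False  # nothing to compare against yet
    return all(current != past for past in history)


# A changed output size hints at a correctness bug even though the run
# itself threw no exception.
assert output_size_suspicious([100000, 100000, 100000], 99997)
assert not output_size_suspicious([100000, 100000], 100000)
```

Since Nexmark queries are deterministic over a fixed event count, an exact-match comparison against recent runs is a cheap proxy for correctness without adding a slow dedicated test.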
> > > > 
> > > > 
> > > > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <kl...@google.com> wrote:
> > > > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <ec...@apache.org> wrote:
> > > > > > Hi guys,
> > > > > > 
> > > > > > I was looking at the various jenkins jobs and I wanted to submit a proposition:
> > > > > > 
> > > > > > - Validates runner tests: currently run at PostCommit for all the runners. I think it is the quickest way
> > > > > > to see regressions. So keep it that way.
> > > > > 
> > > > > We've also toyed with precommit for runners where it is fast. 
> > > > > > - Integration tests: AFAIK we only run the ones in the examples module, and only on demand. What about
> > > > > > running all the ITs (in particular the IO ITs) as a cron job on a daily basis with the direct runner?
> > > > > > Please note that it will require some always-up backend infrastructure.
> > > > > 
> > > > > I like this idea. We actually run more, but in postcommit. You can see the goal here:
> > > > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > > > > There's no infrastructure set up that I see. It is only DirectRunner and DataflowRunner currently, as they are
> > > > > "always up". But so could be local Flink and Spark. Do the ITs spin up local versions of what they are
> > > > > connecting to?
> > > > > If we have adequate resources, I also think ValidatesRunner on a real cluster would add value, once we have
> > > > > the cluster set up / tear down or "always up". 
> > > > > > - Performance tests: what about running the Nexmark SMOKE test suite in batch and streaming modes with all
> > > > > > the runners on a daily basis and storing the running times in an RRD database (to see performance
> > > > > > regressions)?
> > > > > 
> > > > > I like this idea, too. I think we could do DirectRunner (and probably local Flink) as postcommit without being
> > > > > too expensive.
> > > > > Kenn
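The regression check that an RRD-style store of running times would enable could look something like this sketch; the function name and tolerance are made-up parameters for illustration, not actual Beam tooling.

```python
# Illustrative sketch (not actual Beam tooling) of flagging a performance
# regression from stored running times, as an RRD-style database allows.
def is_regression(history_secs, current_sec, tolerance=0.20):
    """Flag runs more than `tolerance` slower than the historical mean."""
    if not history_secs:
        return False  # no baseline yet
    baseline = sum(history_secs) / len(history_secs)
    return current_sec > baseline * (1.0 + tolerance)


assert is_regression([10.0, 11.0, 10.5], 14.0)      # ~33% slower: flag it
assert not is_regression([10.0, 11.0, 10.5], 11.0)  # within tolerance
```

In practice the tolerance would need tuning per query/runner/mode, since noisier configurations need a wider band to avoid false alarms.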
> > > > >  
> > > > > > Please note that not all the queries run in all the runners in all the modes right now. Also, we have some
> > > > > > streaming pipeline termination issues (see https://issues.apache.org/jira/browse/BEAM-2847).
> > > > > > 
> > > > > > I know that Stephen Sisk used to work on these topics. I also talked to the people from Polidea, but as I
> > > > > > understood it, they mainly launch integration tests on the Dataflow runner.
> > > > > > 
> > > > > > WDYT?
> > > > > > 
> > > > > > Etienne