Posted to dev@flink.apache.org by Yu Li <ca...@gmail.com> on 2019/10/30 10:54:04 UTC

[DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Hi everyone,

We would like to propose FLIP-83, which adds an end-to-end performance
testing framework for Flink. We discovered some potential problems through
such an internal end-to-end performance testing framework before the
release of 1.9.0 [1], so we'd like to contribute it to the Flink community
as a supplement to the existing daily-run micro performance benchmark [2]
and nightly-run end-to-end stability test [3].

The FLIP document can be found here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework

Please kindly review the FLIP document and let us know if you have any
comments/suggestions, thanks!

[1] https://s.apache.org/m8kcq
[2] https://github.com/dataArtisans/flink-benchmarks
[3] https://github.com/apache/flink/tree/master/flink-end-to-end-tests

Best Regards,
Yu

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Till Rohrmann <tr...@apache.org>.
Thanks for starting this discussion. I agree that performance tests will
help us to prevent introducing regressions.

+1 for this proposal.

Cheers,
Till

On Fri, Nov 1, 2019 at 5:13 PM Yun Tang <my...@live.com> wrote:

> +1, I like the idea of this improvement which acts as a watchdog for
> developers' code change.
>
> By the way, do you think it's worthy to add a checkpoint mode which just
> disable checkpoint to run end-to-end jobs? And when will stage2 and stage3
> be discussed in more details?
>
> Best
> Yun Tang
>
> On 11/1/19, 5:02 PM, "Piotr Nowojski" <pi...@ververica.com> wrote:
>
>     Hi Yu,
>
>     Thanks for the answers, it makes sense to me :)
>
>     Piotrek
>
>     > On 31 Oct 2019, at 11:25, Yu Li <ca...@gmail.com> wrote:
>     >
>     > Hi Piotr,
>     >
>     > Thanks for the comments!
>     >
>     > bq. How are you planning to execute the end-to-end benchmarks and
> integrate
>     > them with our build process?
>     > Great question! We plan to execute the end-to-end benchmark in a
> small
>     > cluster (like 3 vm nodes) to better reflect network cost, triggering
> it
>     > through our Jenkins service for micro benchmark and show the result
> on
>     > code-speed center. Will add these into FLIP document if no
> objections.
>     >
>     > bq. Are you planning to monitor the throughput and latency at the
> same time?
>     > Good question. And you're right, we will stress the cluster to
>     > back-pressure and watch the throughput, latency doesn't mean much in
> the
>     > first test suites. Let me refine the document.
>     >
>     > Thanks.
>     >
>     > Best Regards,
>     > Yu
>     >
>     >
>     > On Wed, 30 Oct 2019 at 19:07, Piotr Nowojski <pi...@ververica.com>
> wrote:
>     >
>     >> Hi Yu,
>     >>
>     >> Thanks for bringing this up.
>     >>
>     >> +1 for the idea and the proposal from my side.
>     >>
>     >> I think that the proposed Test Job List might be a bit
>     >> redundant/excessive, but:
>     >> - we can always adjust this later, once we have the infrastructure
> in place
>     >> - as long as we have the computing resources and ability to quickly
>     >> interpret the results/catch regressions, it doesn’t hurt to have
> more
>     >> benchmarks/tests then strictly necessary.
>     >>
>     >> Which brings me to a question. How are you planning to execute the
>     >> end-to-end benchmarks and integrate them with our build process?
>     >>
>     >> Another smaller question:
>     >>
>     >>> In this initial stage we will only monitor and display job
> throughput
>     >> and latency.
>     >>
>     >> Are you planning to monitor the throughput and latency at the same
> time?
>     >> It might be a bit problematic, as when measuring the throughput you
> want to
>     >> saturate the system and hit some bottleneck, which will cause a
>     >> back-pressure (measuring latency at the same time when system is
> back
>     >> pressured doesn’t make much sense).
>     >>
>     >> Piotrek
>     >>
>     >>> On 30 Oct 2019, at 11:54, Yu Li <ca...@gmail.com> wrote:
>     >>>
>     >>> Hi everyone,
>     >>>
>     >>> We would like to propose FLIP-83 that adds an end-to-end
> performance
>     >>> testing framework for Flink. We discovered some potential problems
>     >> through
>     >>> such an internal end-to-end performance testing framework before
> the
>     >>> release of 1.9.0 [1], so we'd like to contribute it to Flink
> community
>     >> as a
>     >>> supplement to the existing daily run micro performance benchmark
> [2] and
>     >>> nightly run end-to-end stability test [3].
>     >>>
>     >>> The FLIP document could be found here:
>     >>>
>     >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
>     >>>
>     >>> Please kindly review the FLIP document and let us know if you have
> any
>     >>> comments/suggestions, thanks!
>     >>>
>     >>> [1] https://s.apache.org/m8kcq
>     >>> [2] https://github.com/dataArtisans/flink-benchmarks
>     >>> [3]
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests
>     >>>
>     >>> Best Regards,
>     >>> Yu
>     >>
>     >>
>
>
>
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yu Li <ca...@gmail.com>.
Thanks for the comments.

bq. I think the perf e2e test suites will also need to be designed as
supporting running on both standalone env and distributed env. will be
helpful for developing & evaluating the perf.
Agreed and noted down: the benchmark will be executable in standalone mode.
On the other hand, for the daily run we plan to check the results in
distributed mode to better reflect network cost.

Best Regards,
Yu


On Mon, 4 Nov 2019 at 10:00, OpenInx <op...@gmail.com> wrote:

> > The test cases are written in java and scripts in python. We propose a
> separate directory/module in parallel with flink-end-to-end-tests, with the
> > name of flink-end-to-end-perf-tests.
>
> Glad to see that the newly introduced e2e test will be written in Java.
> because  I'm re-working on the existed e2e tests suites from BASH scripts
> to Java test cases so that we can support more external system , such as
> running the testing job on yarn+flink, docker+flink, standalone+flink,
> distributed kafka cluster etc.
> BTW, I think the perf e2e test suites will also need to be designed as
> supporting running on both standalone env and distributed env. will be
> helpful
> for developing & evaluating the perf.
> Thanks.
>
> On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com> wrote:
>
> > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > statebackend.
> > I think there should be some special scenarios to test checkpoint and
> > statebackend, which will be discussed and added in the release-1.11
> >
> > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > >
> > > By the way, do you think it's worthy to add a checkpoint mode which
> just
> > disable checkpoint to run end-to-end jobs? And when will stage2 and
> stage3
> > be discussed in more details?
> >
> >
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Zhu Zhu <re...@gmail.com>.
Thanks Aihua for the explanation.
The proposal looks good to me then.

Thanks,
Zhu Zhu

aihua li <li...@gmail.com> 于2019年11月21日周四 下午3:59写道:

> Thanks for the comments Zhu Zhu!
>
> > 1. How do we measure the job throughput? By measuring the job execution
> > time on a finite input data set, or measuring the QPS when the job has
> > reached a stable state?
> >    I ask this because that, with LazyFromSource schedule mode, tasks are
> > launched gradually on processing progress.
> >    So if we are measuring the throughput in the latter way,
> > the LazyFromSource scheduling would make no difference with Eager
> > scheduling. So we can drop this dimension if taking this way.
> >   By measuring the total execution time, however, it can be kept since
> the
> > scheduling effectiveness can make differences, especially in small input
> > data set cases.
>
> we plan to measure the job throughput by measuring the QPS when the job
> has reached a stable state.
> If, as you said, there is no difference between LazyFromSource and Eager
> under this measuring approach, we can adjust the test scenarios after
> running for a while and remove the duplicated part.
>
> > 2. In our prior experiences, the performance result is usually not that
> > stable, which may make the perf degradation harder to detect.
> >   Shall we define the rounds to run a job and how to aggregate the
> > result,  so that we can get a more reliable final performance result?
>
> Good advice. We plan to run multiple rounds (5 by default) per scenario,
> then take the average value as the result.
>
>
>
>
>
> > 在 2019年11月21日,下午3:01,Zhu Zhu <re...@gmail.com> 写道:
> >
> > Thanks Yu for bringing up this discussion.
> > The e2e perf tests can be really helpful and the overall design looks
> good
> > to me.
> >
> > Sorry it's late but I have 2 questions about the result check.
> > 1. How do we measure the job throughput? By measuring the job execution
> > time on a finite input data set, or measuring the QPS when the job has
> > reached a stable state?
> >    I ask this because that, with LazyFromSource schedule mode, tasks are
> > launched gradually on processing progress.
> >    So if we are measuring the throughput in the latter way,
> > the LazyFromSource scheduling would make no difference with Eager
> > scheduling. So we can drop this dimension if taking this way.
> >   By measuring the total execution time, however, it can be kept since
> the
> > scheduling effectiveness can make differences, especially in small input
> > data set cases.
> > 2. In our prior experiences, the performance result is usually not that
> > stable, which may make the perf degradation harder to detect.
> >   Shall we define the rounds to run a job and how to aggregate the
> > result,  so that we can get a more reliable final performance result?
> >
> > Thanks,
> > Zhu Zhu
> >
> > Yu Li <ca...@gmail.com> 于2019年11月14日周四 上午10:52写道:
> >
> >> Since one week passed and no more comments, I assume the latest FLIP doc
> >> looks good to all and will open a VOTE thread soon for the FLIP. Thanks
> for
> >> all the comments and discussion!
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Thu, 7 Nov 2019 at 18:35, Yu Li <ca...@gmail.com> wrote:
> >>
> >>> Thanks for the comments Biao!
> >>>
> >>> bq. It seems this proposal is separated into several stages. Is there a
> >>> more detailed plan?
> >>> Good point! For stage one we'd like to try introducing the benchmark
> >>> first, so we could guard the release (hopefully starting from 1.10).
> For
> >>> other stages, we don't have detailed plan yet, but will add child FLIPs
> >>> when moving on and open new discussion/voting separately. I have
> updated
> >>> the FLIP document to better reflect this, please check it and let me
> know
> >>> what you think. Thanks.
> >>>
> >>> Best Regards,
> >>> Yu
> >>>
> >>>
> >>> On Tue, 5 Nov 2019 at 10:16, Biao Liu <mm...@gmail.com> wrote:
> >>>
> >>>> Thanks Yu for bringing this topic.
> >>>>
> >>>> +1 for this proposal. Glad to have an e2e performance testing.
> >>>>
> >>>> It seems this proposal is separated into several stages. Is there a
> more
> >>>> detailed plan?
> >>>>
> >>>> Thanks,
> >>>> Biao /'bɪ.aʊ/
> >>>>
> >>>>
> >>>>
> >>>> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com>
> >> wrote:
> >>>>
> >>>>> +1 for this idea.
> >>>>>
> >>>>> Currently, we have the micro benchmark for flink, which can help us
> >> find
> >>>>> the regressions. And I think the e2e jobs performance testing can
> also
> >>>> help
> >>>>> us to cover more scenarios.
> >>>>>
> >>>>> Best,
> >>>>> Congxian
> >>>>>
> >>>>>
> >>>>> Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
> >>>>>
> >>>>>> +1 for the idea. Thanks Yu for driving this.
> >>>>>> Just curious about that can we collect the metrics about Job
> >>>> scheduling
> >>>>> and
> >>>>>> task launch. the speed of this part is also important.
> >>>>>> We can add tests for watch it too.
> >>>>>>
> >>>>>> Look forward to more batch test support.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong Lee
> >>>>>>
> >>>>>> On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
> >>>>>>
> >>>>>>>> The test cases are written in java and scripts in python. We
> >>>> propose
> >>>>> a
> >>>>>>> separate directory/module in parallel with flink-end-to-end-tests,
> >>>> with
> >>>>>> the
> >>>>>>>> name of flink-end-to-end-perf-tests.
> >>>>>>>
> >>>>>>> Glad to see that the newly introduced e2e test will be written in
> >>>> Java.
> >>>>>>> because  I'm re-working on the existed e2e tests suites from BASH
> >>>>> scripts
> >>>>>>> to Java test cases so that we can support more external system ,
> >>>> such
> >>>>> as
> >>>>>>> running the testing job on yarn+flink, docker+flink,
> >>>> standalone+flink,
> >>>>>>> distributed kafka cluster etc.
> >>>>>>> BTW, I think the perf e2e test suites will also need to be
> >> designed
> >>>> as
> >>>>>>> supporting running on both standalone env and distributed env.
> >> will
> >>>> be
> >>>>>>> helpful
> >>>>>>> for developing & evaluating the perf.
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> In stage1, the checkpoint mode isn't disabled,and uses heap as
> >> the
> >>>>>>>> statebackend.
> >>>>>>>> I think there should be some special scenarios to test
> >> checkpoint
> >>>> and
> >>>>>>>> statebackend, which will be discussed and added in the
> >>>> release-1.11
> >>>>>>>>
> >>>>>>>>> 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> >>>>>>>>>
> >>>>>>>>> By the way, do you think it's worthy to add a checkpoint mode
> >>>> which
> >>>>>>> just
> >>>>>>>> disable checkpoint to run end-to-end jobs? And when will stage2
> >>>> and
> >>>>>>> stage3
> >>>>>>>> be discussed in more details?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best, Jingsong Lee
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by aihua li <li...@gmail.com>.
Thanks for the comments Zhu Zhu!

> 1. How do we measure the job throughput? By measuring the job execution
> time on a finite input data set, or measuring the QPS when the job has
> reached a stable state?
>    I ask this because that, with LazyFromSource schedule mode, tasks are
> launched gradually on processing progress.
>    So if we are measuring the throughput in the latter way,
> the LazyFromSource scheduling would make no difference with Eager
> scheduling. So we can drop this dimension if taking this way.
>   By measuring the total execution time, however, it can be kept since the
> scheduling effectiveness can make differences, especially in small input
> data set cases.

We plan to measure the job throughput by measuring the QPS when the job has reached a stable state.
If, as you said, there is no difference between LazyFromSource and Eager under this measuring approach, we can adjust the test scenarios after running for a while and remove the duplicated part.

> 2. In our prior experiences, the performance result is usually not that
> stable, which may make the perf degradation harder to detect.
>   Shall we define the rounds to run a job and how to aggregate the
> result,  so that we can get a more reliable final performance result?

Good advice. We plan to run multiple rounds (5 by default) per scenario, then take the average value as the result.
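
As an illustration, a minimal Java sketch of this round-and-average logic (the runOnceAndMeasureQps supplier is a hypothetical placeholder for submitting the job and sampling the stable-state QPS; only the default round count of 5 comes from the plan above):

    import java.util.function.DoubleSupplier;

    // Sketch: run a scenario several times and average the stable-state QPS.
    // runOnceAndMeasureQps is a hypothetical callback that submits the job,
    // waits until it reaches a stable state, and samples records/second.
    public final class RoundAverager {
        private static final int DEFAULT_ROUNDS = 5; // default from the plan above

        public static double averageQps(DoubleSupplier runOnceAndMeasureQps) {
            double sum = 0.0;
            for (int round = 0; round < DEFAULT_ROUNDS; round++) {
                sum += runOnceAndMeasureQps.getAsDouble();
            }
            return sum / DEFAULT_ROUNDS;
        }
    }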



 

> 在 2019年11月21日,下午3:01,Zhu Zhu <re...@gmail.com> 写道:
> 
> Thanks Yu for bringing up this discussion.
> The e2e perf tests can be really helpful and the overall design looks good
> to me.
> 
> Sorry it's late but I have 2 questions about the result check.
> 1. How do we measure the job throughput? By measuring the job execution
> time on a finite input data set, or measuring the QPS when the job has
> reached a stable state?
>    I ask this because that, with LazyFromSource schedule mode, tasks are
> launched gradually on processing progress.
>    So if we are measuring the throughput in the latter way,
> the LazyFromSource scheduling would make no difference with Eager
> scheduling. So we can drop this dimension if taking this way.
>   By measuring the total execution time, however, it can be kept since the
> scheduling effectiveness can make differences, especially in small input
> data set cases.
> 2. In our prior experiences, the performance result is usually not that
> stable, which may make the perf degradation harder to detect.
>   Shall we define the rounds to run a job and how to aggregate the
> result,  so that we can get a more reliable final performance result?
> 
> Thanks,
> Zhu Zhu
> 
> Yu Li <ca...@gmail.com> 于2019年11月14日周四 上午10:52写道:
> 
>> Since one week passed and no more comments, I assume the latest FLIP doc
>> looks good to all and will open a VOTE thread soon for the FLIP. Thanks for
>> all the comments and discussion!
>> 
>> Best Regards,
>> Yu
>> 
>> 
>> On Thu, 7 Nov 2019 at 18:35, Yu Li <ca...@gmail.com> wrote:
>> 
>>> Thanks for the comments Biao!
>>> 
>>> bq. It seems this proposal is separated into several stages. Is there a
>>> more detailed plan?
>>> Good point! For stage one we'd like to try introducing the benchmark
>>> first, so we could guard the release (hopefully starting from 1.10). For
>>> other stages, we don't have detailed plan yet, but will add child FLIPs
>>> when moving on and open new discussion/voting separately. I have updated
>>> the FLIP document to better reflect this, please check it and let me know
>>> what you think. Thanks.
>>> 
>>> Best Regards,
>>> Yu
>>> 
>>> 
>>> On Tue, 5 Nov 2019 at 10:16, Biao Liu <mm...@gmail.com> wrote:
>>> 
>>>> Thanks Yu for bringing this topic.
>>>> 
>>>> +1 for this proposal. Glad to have an e2e performance testing.
>>>> 
>>>> It seems this proposal is separated into several stages. Is there a more
>>>> detailed plan?
>>>> 
>>>> Thanks,
>>>> Biao /'bɪ.aʊ/
>>>> 
>>>> 
>>>> 
>>>> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com>
>> wrote:
>>>> 
>>>>> +1 for this idea.
>>>>> 
>>>>> Currently, we have the micro benchmark for flink, which can help us
>> find
>>>>> the regressions. And I think the e2e jobs performance testing can also
>>>> help
>>>>> us to cover more scenarios.
>>>>> 
>>>>> Best,
>>>>> Congxian
>>>>> 
>>>>> 
>>>>> Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
>>>>> 
>>>>>> +1 for the idea. Thanks Yu for driving this.
>>>>>> Just curious about that can we collect the metrics about Job
>>>> scheduling
>>>>> and
>>>>>> task launch. the speed of this part is also important.
>>>>>> We can add tests for watch it too.
>>>>>> 
>>>>>> Look forward to more batch test support.
>>>>>> 
>>>>>> Best,
>>>>>> Jingsong Lee
>>>>>> 
>>>>>> On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
>>>>>> 
>>>>>>>> The test cases are written in java and scripts in python. We
>>>> propose
>>>>> a
>>>>>>> separate directory/module in parallel with flink-end-to-end-tests,
>>>> with
>>>>>> the
>>>>>>>> name of flink-end-to-end-perf-tests.
>>>>>>> 
>>>>>>> Glad to see that the newly introduced e2e test will be written in
>>>> Java.
>>>>>>> because  I'm re-working on the existed e2e tests suites from BASH
>>>>> scripts
>>>>>>> to Java test cases so that we can support more external system ,
>>>> such
>>>>> as
>>>>>>> running the testing job on yarn+flink, docker+flink,
>>>> standalone+flink,
>>>>>>> distributed kafka cluster etc.
>>>>>>> BTW, I think the perf e2e test suites will also need to be
>> designed
>>>> as
>>>>>>> supporting running on both standalone env and distributed env.
>> will
>>>> be
>>>>>>> helpful
>>>>>>> for developing & evaluating the perf.
>>>>>>> Thanks.
>>>>>>> 
>>>>>>> On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> In stage1, the checkpoint mode isn't disabled,and uses heap as
>> the
>>>>>>>> statebackend.
>>>>>>>> I think there should be some special scenarios to test
>> checkpoint
>>>> and
>>>>>>>> statebackend, which will be discussed and added in the
>>>> release-1.11
>>>>>>>> 
>>>>>>>>> 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
>>>>>>>>> 
>>>>>>>>> By the way, do you think it's worthy to add a checkpoint mode
>>>> which
>>>>>>> just
>>>>>>>> disable checkpoint to run end-to-end jobs? And when will stage2
>>>> and
>>>>>>> stage3
>>>>>>>> be discussed in more details?
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Best, Jingsong Lee
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Zhu Zhu <re...@gmail.com>.
Thanks Yu for bringing up this discussion.
The e2e perf tests can be really helpful and the overall design looks good
to me.

Sorry it's late, but I have 2 questions about the result check.
1. How do we measure the job throughput? By measuring the job execution
time on a finite input data set, or by measuring the QPS when the job has
reached a stable state?
    I ask because, with the LazyFromSource schedule mode, tasks are
launched gradually as processing progresses.
    So if we measure the throughput in the latter way, LazyFromSource
scheduling would make no difference compared with Eager scheduling, and we
could drop this dimension.
   If we measure the total execution time, however, the dimension can be
kept, since scheduling effectiveness can make a difference, especially in
small-input-data-set cases.
2. In our prior experience, the performance result is usually not that
stable, which may make a perf degradation harder to detect.
   Shall we define the number of rounds to run a job and how to aggregate
the results, so that we get a more reliable final performance result?

Thanks,
Zhu Zhu

Yu Li <ca...@gmail.com> 于2019年11月14日周四 上午10:52写道:

> Since one week passed and no more comments, I assume the latest FLIP doc
> looks good to all and will open a VOTE thread soon for the FLIP. Thanks for
> all the comments and discussion!
>
> Best Regards,
> Yu
>
>
> On Thu, 7 Nov 2019 at 18:35, Yu Li <ca...@gmail.com> wrote:
>
> > Thanks for the comments Biao!
> >
> > bq. It seems this proposal is separated into several stages. Is there a
> > more detailed plan?
> > Good point! For stage one we'd like to try introducing the benchmark
> > first, so we could guard the release (hopefully starting from 1.10). For
> > other stages, we don't have detailed plan yet, but will add child FLIPs
> > when moving on and open new discussion/voting separately. I have updated
> > the FLIP document to better reflect this, please check it and let me know
> > what you think. Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Tue, 5 Nov 2019 at 10:16, Biao Liu <mm...@gmail.com> wrote:
> >
> >> Thanks Yu for bringing this topic.
> >>
> >> +1 for this proposal. Glad to have an e2e performance testing.
> >>
> >> It seems this proposal is separated into several stages. Is there a more
> >> detailed plan?
> >>
> >> Thanks,
> >> Biao /'bɪ.aʊ/
> >>
> >>
> >>
> >> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com>
> wrote:
> >>
> >> > +1 for this idea.
> >> >
> >> > Currently, we have the micro benchmark for flink, which can help us
> find
> >> > the regressions. And I think the e2e jobs performance testing can also
> >> help
> >> > us to cover more scenarios.
> >> >
> >> > Best,
> >> > Congxian
> >> >
> >> >
> >> > Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
> >> >
> >> > > +1 for the idea. Thanks Yu for driving this.
> >> > > Just curious about that can we collect the metrics about Job
> >> scheduling
> >> > and
> >> > > task launch. the speed of this part is also important.
> >> > > We can add tests for watch it too.
> >> > >
> >> > > Look forward to more batch test support.
> >> > >
> >> > > Best,
> >> > > Jingsong Lee
> >> > >
> >> > > On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
> >> > >
> >> > > > > The test cases are written in java and scripts in python. We
> >> propose
> >> > a
> >> > > > separate directory/module in parallel with flink-end-to-end-tests,
> >> with
> >> > > the
> >> > > > > name of flink-end-to-end-perf-tests.
> >> > > >
> >> > > > Glad to see that the newly introduced e2e test will be written in
> >> Java.
> >> > > > because  I'm re-working on the existed e2e tests suites from BASH
> >> > scripts
> >> > > > to Java test cases so that we can support more external system ,
> >> such
> >> > as
> >> > > > running the testing job on yarn+flink, docker+flink,
> >> standalone+flink,
> >> > > > distributed kafka cluster etc.
> >> > > > BTW, I think the perf e2e test suites will also need to be
> designed
> >> as
> >> > > > supporting running on both standalone env and distributed env.
> will
> >> be
> >> > > > helpful
> >> > > > for developing & evaluating the perf.
> >> > > > Thanks.
> >> > > >
> >> > > > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com>
> >> wrote:
> >> > > >
> >> > > > > In stage1, the checkpoint mode isn't disabled,and uses heap as
> the
> >> > > > > statebackend.
> >> > > > > I think there should be some special scenarios to test
> checkpoint
> >> and
> >> > > > > statebackend, which will be discussed and added in the
> >> release-1.11
> >> > > > >
> >> > > > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> >> > > > > >
> >> > > > > > By the way, do you think it's worthy to add a checkpoint mode
> >> which
> >> > > > just
> >> > > > > disable checkpoint to run end-to-end jobs? And when will stage2
> >> and
> >> > > > stage3
> >> > > > > be discussed in more details?
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Best, Jingsong Lee
> >> > >
> >> >
> >>
> >
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yu Li <ca...@gmail.com>.
Since one week has passed with no further comments, I assume the latest
FLIP doc looks good to all, and I will open a VOTE thread for the FLIP
soon. Thanks for all the comments and discussion!

Best Regards,
Yu


On Thu, 7 Nov 2019 at 18:35, Yu Li <ca...@gmail.com> wrote:

> Thanks for the comments Biao!
>
> bq. It seems this proposal is separated into several stages. Is there a
> more detailed plan?
> Good point! For stage one we'd like to try introducing the benchmark
> first, so we could guard the release (hopefully starting from 1.10). For
> other stages, we don't have detailed plan yet, but will add child FLIPs
> when moving on and open new discussion/voting separately. I have updated
> the FLIP document to better reflect this, please check it and let me know
> what you think. Thanks.
>
> Best Regards,
> Yu
>
>
> On Tue, 5 Nov 2019 at 10:16, Biao Liu <mm...@gmail.com> wrote:
>
>> Thanks Yu for bringing this topic.
>>
>> +1 for this proposal. Glad to have an e2e performance testing.
>>
>> It seems this proposal is separated into several stages. Is there a more
>> detailed plan?
>>
>> Thanks,
>> Biao /'bɪ.aʊ/
>>
>>
>>
>> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com> wrote:
>>
>> > +1 for this idea.
>> >
>> > Currently, we have the micro benchmark for flink, which can help us find
>> > the regressions. And I think the e2e jobs performance testing can also
>> help
>> > us to cover more scenarios.
>> >
>> > Best,
>> > Congxian
>> >
>> >
>> > Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
>> >
>> > > +1 for the idea. Thanks Yu for driving this.
>> > > Just curious about that can we collect the metrics about Job
>> scheduling
>> > and
>> > > task launch. the speed of this part is also important.
>> > > We can add tests for watch it too.
>> > >
>> > > Look forward to more batch test support.
>> > >
>> > > Best,
>> > > Jingsong Lee
>> > >
>> > > On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
>> > >
>> > > > > The test cases are written in java and scripts in python. We
>> propose
>> > a
>> > > > separate directory/module in parallel with flink-end-to-end-tests,
>> with
>> > > the
>> > > > > name of flink-end-to-end-perf-tests.
>> > > >
>> > > > Glad to see that the newly introduced e2e test will be written in
>> Java.
>> > > > because  I'm re-working on the existed e2e tests suites from BASH
>> > scripts
>> > > > to Java test cases so that we can support more external system ,
>> such
>> > as
>> > > > running the testing job on yarn+flink, docker+flink,
>> standalone+flink,
>> > > > distributed kafka cluster etc.
>> > > > BTW, I think the perf e2e test suites will also need to be designed
>> as
>> > > > supporting running on both standalone env and distributed env. will
>> be
>> > > > helpful
>> > > > for developing & evaluating the perf.
>> > > > Thanks.
>> > > >
>> > > > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com>
>> wrote:
>> > > >
>> > > > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
>> > > > > statebackend.
>> > > > > I think there should be some special scenarios to test checkpoint
>> and
>> > > > > statebackend, which will be discussed and added in the
>> release-1.11
>> > > > >
>> > > > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
>> > > > > >
>> > > > > > By the way, do you think it's worthy to add a checkpoint mode
>> which
>> > > > just
>> > > > > disable checkpoint to run end-to-end jobs? And when will stage2
>> and
>> > > > stage3
>> > > > > be discussed in more details?
>> > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Best, Jingsong Lee
>> > >
>> >
>>
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yu Li <ca...@gmail.com>.
Thanks for the comments Biao!

bq. It seems this proposal is separated into several stages. Is there a
more detailed plan?
Good point! For stage one we'd like to introduce the benchmark first, so we
can guard the release (hopefully starting from 1.10). For the other stages,
we don't have a detailed plan yet, but will add child FLIPs as we move on
and open new discussions/votes separately. I have updated the FLIP document
to better reflect this; please check it and let me know what you think.
Thanks.

Best Regards,
Yu


On Tue, 5 Nov 2019 at 10:16, Biao Liu <mm...@gmail.com> wrote:

> Thanks Yu for bringing this topic.
>
> +1 for this proposal. Glad to have an e2e performance testing.
>
> It seems this proposal is separated into several stages. Is there a more
> detailed plan?
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com> wrote:
>
> > +1 for this idea.
> >
> > Currently, we have the micro benchmark for flink, which can help us find
> > the regressions. And I think the e2e jobs performance testing can also
> help
> > us to cover more scenarios.
> >
> > Best,
> > Congxian
> >
> >
> > Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
> >
> > > +1 for the idea. Thanks Yu for driving this.
> > > Just curious about that can we collect the metrics about Job scheduling
> > and
> > > task launch. the speed of this part is also important.
> > > We can add tests for watch it too.
> > >
> > > Look forward to more batch test support.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
> > >
> > > > > The test cases are written in java and scripts in python. We
> propose
> > a
> > > > separate directory/module in parallel with flink-end-to-end-tests,
> with
> > > the
> > > > > name of flink-end-to-end-perf-tests.
> > > >
> > > > Glad to see that the newly introduced e2e test will be written in
> Java.
> > > > because  I'm re-working on the existed e2e tests suites from BASH
> > scripts
> > > > to Java test cases so that we can support more external system , such
> > as
> > > > running the testing job on yarn+flink, docker+flink,
> standalone+flink,
> > > > distributed kafka cluster etc.
> > > > BTW, I think the perf e2e test suites will also need to be designed
> as
> > > > supporting running on both standalone env and distributed env. will
> be
> > > > helpful
> > > > for developing & evaluating the perf.
> > > > Thanks.
> > > >
> > > > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com>
> wrote:
> > > >
> > > > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > > > statebackend.
> > > > > I think there should be some special scenarios to test checkpoint
> and
> > > > > statebackend, which will be discussed and added in the release-1.11
> > > > >
> > > > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > > > > >
> > > > > > By the way, do you think it's worthy to add a checkpoint mode
> which
> > > > just
> > > > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > > > stage3
> > > > > be discussed in more details?
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yang Wang <da...@gmail.com>.
Thanks Yu for starting this discussion.

I'm in favor of adding an e2e performance testing framework. Currently the
e2e tests are mainly focused on functionality and written in shell scripts.
We need a better e2e framework for both performance and functionality tests.


Best,
Yang

Biao Liu <mm...@gmail.com> 于2019年11月5日周二 上午10:16写道:

> Thanks Yu for bringing this topic.
>
> +1 for this proposal. Glad to have an e2e performance testing.
>
> It seems this proposal is separated into several stages. Is there a more
> detailed plan?
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com> wrote:
>
> > +1 for this idea.
> >
> > Currently, we have the micro benchmark for flink, which can help us find
> > the regressions. And I think the e2e jobs performance testing can also
> help
> > us to cover more scenarios.
> >
> > Best,
> > Congxian
> >
> >
> > Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
> >
> > > +1 for the idea. Thanks Yu for driving this.
> > > Just curious about that can we collect the metrics about Job scheduling
> > and
> > > task launch. the speed of this part is also important.
> > > We can add tests for watch it too.
> > >
> > > Look forward to more batch test support.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
> > >
> > > > > The test cases are written in java and scripts in python. We
> propose
> > a
> > > > separate directory/module in parallel with flink-end-to-end-tests,
> with
> > > the
> > > > > name of flink-end-to-end-perf-tests.
> > > >
> > > > Glad to see that the newly introduced e2e test will be written in
> Java.
> > > > because  I'm re-working on the existed e2e tests suites from BASH
> > scripts
> > > > to Java test cases so that we can support more external system , such
> > as
> > > > running the testing job on yarn+flink, docker+flink,
> standalone+flink,
> > > > distributed kafka cluster etc.
> > > > BTW, I think the perf e2e test suites will also need to be designed
> as
> > > > supporting running on both standalone env and distributed env. will
> be
> > > > helpful
> > > > for developing & evaluating the perf.
> > > > Thanks.
> > > >
> > > > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com>
> wrote:
> > > >
> > > > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > > > statebackend.
> > > > > I think there should be some special scenarios to test checkpoint
> and
> > > > > statebackend, which will be discussed and added in the release-1.11
> > > > >
> > > > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > > > > >
> > > > > > By the way, do you think it's worthy to add a checkpoint mode
> which
> > > > just
> > > > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > > > stage3
> > > > > be discussed in more details?
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Biao Liu <mm...@gmail.com>.
Thanks Yu for bringing this topic.

+1 for this proposal. Glad to have e2e performance testing.

It seems this proposal is separated into several stages. Is there a more
detailed plan?

Thanks,
Biao /'bɪ.aʊ/



On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qc...@gmail.com> wrote:

> +1 for this idea.
>
> Currently, we have the micro benchmark for flink, which can help us find
> the regressions. And I think the e2e jobs performance testing can also help
> us to cover more scenarios.
>
> Best,
> Congxian
>
>
> Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:
>
> > +1 for the idea. Thanks Yu for driving this.
> > Just curious about that can we collect the metrics about Job scheduling
> and
> > task launch. the speed of this part is also important.
> > We can add tests for watch it too.
> >
> > Look forward to more batch test support.
> >
> > Best,
> > Jingsong Lee
> >
> > On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
> >
> > > > The test cases are written in java and scripts in python. We propose
> a
> > > separate directory/module in parallel with flink-end-to-end-tests, with
> > the
> > > > name of flink-end-to-end-perf-tests.
> > >
> > > Glad to see that the newly introduced e2e test will be written in Java.
> > > because  I'm re-working on the existed e2e tests suites from BASH
> scripts
> > > to Java test cases so that we can support more external system , such
> as
> > > running the testing job on yarn+flink, docker+flink, standalone+flink,
> > > distributed kafka cluster etc.
> > > BTW, I think the perf e2e test suites will also need to be designed as
> > > supporting running on both standalone env and distributed env. will be
> > > helpful
> > > for developing & evaluating the perf.
> > > Thanks.
> > >
> > > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com> wrote:
> > >
> > > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > > statebackend.
> > > > I think there should be some special scenarios to test checkpoint and
> > > > statebackend, which will be discussed and added in the release-1.11
> > > >
> > > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > > > >
> > > > > By the way, do you think it's worthy to add a checkpoint mode which
> > > just
> > > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > > stage3
> > > > be discussed in more details?
> > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Congxian Qiu <qc...@gmail.com>.
+1 for this idea.

Currently, we have the micro benchmarks for Flink, which can help us find
regressions. And I think the e2e job performance testing can also help us
cover more scenarios.

Best,
Congxian


Jingsong Li <ji...@gmail.com> 于2019年11月4日周一 下午5:37写道:

> +1 for the idea. Thanks Yu for driving this.
> Just curious about that can we collect the metrics about Job scheduling and
> task launch. the speed of this part is also important.
> We can add tests for watch it too.
>
> Look forward to more batch test support.
>
> Best,
> Jingsong Lee
>
> On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
>
> > > The test cases are written in java and scripts in python. We propose a
> > separate directory/module in parallel with flink-end-to-end-tests, with
> the
> > > name of flink-end-to-end-perf-tests.
> >
> > Glad to see that the newly introduced e2e test will be written in Java.
> > because  I'm re-working on the existed e2e tests suites from BASH scripts
> > to Java test cases so that we can support more external system , such as
> > running the testing job on yarn+flink, docker+flink, standalone+flink,
> > distributed kafka cluster etc.
> > BTW, I think the perf e2e test suites will also need to be designed as
> > supporting running on both standalone env and distributed env. will be
> > helpful
> > for developing & evaluating the perf.
> > Thanks.
> >
> > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com> wrote:
> >
> > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > statebackend.
> > > I think there should be some special scenarios to test checkpoint and
> > > statebackend, which will be discussed and added in the release-1.11
> > >
> > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > > >
> > > > By the way, do you think it's worthy to add a checkpoint mode which
> > just
> > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > stage3
> > > be discussed in more details?
> > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yu Li <ca...@gmail.com>.
Thanks for the suggestion Jingsong!

I've added a stage for adding more metrics to the FLIP document; please
check it and let me know if you have any further concerns. Thanks.

Best Regards,
Yu


On Mon, 4 Nov 2019 at 17:37, Jingsong Li <ji...@gmail.com> wrote:

> +1 for the idea. Thanks Yu for driving this.
> Just curious about that can we collect the metrics about Job scheduling and
> task launch. the speed of this part is also important.
> We can add tests for watch it too.
>
> Look forward to more batch test support.
>
> Best,
> Jingsong Lee
>
> On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:
>
> > > The test cases are written in java and scripts in python. We propose a
> > separate directory/module in parallel with flink-end-to-end-tests, with
> the
> > > name of flink-end-to-end-perf-tests.
> >
> > Glad to see that the newly introduced e2e test will be written in Java.
> > because  I'm re-working on the existed e2e tests suites from BASH scripts
> > to Java test cases so that we can support more external system , such as
> > running the testing job on yarn+flink, docker+flink, standalone+flink,
> > distributed kafka cluster etc.
> > BTW, I think the perf e2e test suites will also need to be designed as
> > supporting running on both standalone env and distributed env. will be
> > helpful
> > for developing & evaluating the perf.
> > Thanks.
> >
> > On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com> wrote:
> >
> > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > statebackend.
> > > I think there should be some special scenarios to test checkpoint and
> > > statebackend, which will be discussed and added in the release-1.11
> > >
> > > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > > >
> > > > By the way, do you think it's worthy to add a checkpoint mode which
> > just
> > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > stage3
> > > be discussed in more details?
> > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Jingsong Li <ji...@gmail.com>.
+1 for the idea. Thanks Yu for driving this.
Just curious: can we collect metrics about job scheduling and task launch?
The speed of this part is also important. We could add tests to watch it
too; see the sketch below.
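
A minimal, hedged sketch of such a timing test (both callbacks are hypothetical placeholders, since the thread doesn't specify a client API):

    // Sketch: measure scheduling + task-launch speed as the wall-clock time
    // from job submission until all tasks report RUNNING. Both callbacks are
    // hypothetical placeholders for whatever client the framework ends up using.
    public final class LaunchTimer {
        public static long measureLaunchMillis(Runnable submitJob,
                                               Runnable waitUntilAllTasksRunning) {
            long start = System.nanoTime();
            submitJob.run();                // hypothetical: submit the benchmark job
            waitUntilAllTasksRunning.run(); // hypothetical: poll task states
            return (System.nanoTime() - start) / 1_000_000;
        }
    }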

Look forward to more batch test support.

Best,
Jingsong Lee

On Mon, Nov 4, 2019 at 10:00 AM OpenInx <op...@gmail.com> wrote:

> > The test cases are written in java and scripts in python. We propose a
> separate directory/module in parallel with flink-end-to-end-tests, with the
> > name of flink-end-to-end-perf-tests.
>
> Glad to see that the newly introduced e2e test will be written in Java.
> because  I'm re-working on the existed e2e tests suites from BASH scripts
> to Java test cases so that we can support more external system , such as
> running the testing job on yarn+flink, docker+flink, standalone+flink,
> distributed kafka cluster etc.
> BTW, I think the perf e2e test suites will also need to be designed as
> supporting running on both standalone env and distributed env. will be
> helpful
> for developing & evaluating the perf.
> Thanks.
>
> On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com> wrote:
>
> > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > statebackend.
> > I think there should be some special scenarios to test checkpoint and
> > statebackend, which will be discussed and added in the release-1.11
> >
> > > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> > >
> > > By the way, do you think it's worthy to add a checkpoint mode which
> just
> > disable checkpoint to run end-to-end jobs? And when will stage2 and
> stage3
> > be discussed in more details?
> >
> >
>


-- 
Best, Jingsong Lee

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by OpenInx <op...@gmail.com>.
> The test cases are written in java and scripts in python. We propose a
separate directory/module in parallel with flink-end-to-end-tests, with the
> name of flink-end-to-end-perf-tests.

Glad to see that the newly introduced e2e tests will be written in Java,
because I'm reworking the existing e2e test suites from Bash scripts into
Java test cases so that we can support more external systems, such as
running the test jobs on YARN+Flink, Docker+Flink, standalone+Flink, a
distributed Kafka cluster, etc.
BTW, I think the perf e2e test suites will also need to be designed to run
on both standalone and distributed environments; that will be helpful for
developing and evaluating the perf.
Thanks.
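
To illustrate the kind of abstraction this implies, a hedged sketch (all names are hypothetical, not the actual API of that rework):

    // Sketch: an environment abstraction so the same Java test case can run
    // against standalone, YARN, or Docker deployments. Hypothetical names only.
    public interface FlinkTestEnvironment extends AutoCloseable {
        void start() throws Exception; // bring up the cluster (local, YARN, Docker, ...)
        String restEndpoint();         // address for submitting jobs / querying metrics
    }

    // A perf test would then be written once against the interface, e.g.:
    // try (FlinkTestEnvironment env = new StandaloneEnvironment()) { ... }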

On Mon, Nov 4, 2019 at 9:31 AM aihua li <li...@gmail.com> wrote:

> In stage1, the checkpoint mode isn't disabled,and uses heap as the
> statebackend.
> I think there should be some special scenarios to test checkpoint and
> statebackend, which will be discussed and added in the release-1.11
>
> > 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> >
> > By the way, do you think it's worthy to add a checkpoint mode which just
> disable checkpoint to run end-to-end jobs? And when will stage2 and stage3
> be discussed in more details?
>
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by aihua li <li...@gmail.com>.
In stage 1, checkpointing isn't disabled, and heap is used as the state backend.
I think there should be some special scenarios to test checkpointing and state backends, which will be discussed and added in release 1.11.
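
For reference, a hedged sketch of what that stage-1 setup could look like with the Flink 1.9 APIs (the checkpoint interval and path are illustrative values, not taken from the FLIP):

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    // Sketch: stage-1 job setup with checkpointing enabled and heap-based state.
    public final class Stage1Setup {
        public static StreamExecutionEnvironment create() {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(60_000L); // checkpoint every 60s (illustrative)
            // FsStateBackend keeps working state on the TaskManager heap and
            // writes checkpoint data to a filesystem path.
            env.setStateBackend(new FsStateBackend("file:///tmp/flink-checkpoints"));
            return env;
        }
    }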

> 在 2019年11月2日,上午12:13,Yun Tang <my...@live.com> 写道:
> 
> By the way, do you think it's worthy to add a checkpoint mode which just disable checkpoint to run end-to-end jobs? And when will stage2 and stage3 be discussed in more details?


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yun Tang <my...@live.com>.
+1, I like the idea of this improvement, which acts as a watchdog for developers' code changes.

By the way, do you think it's worthwhile to add a mode that just disables checkpointing when running the end-to-end jobs? And when will stage 2 and stage 3 be discussed in more detail?

Best
Yun Tang

On 11/1/19, 5:02 PM, "Piotr Nowojski" <pi...@ververica.com> wrote:

    Hi Yu,
    
    Thanks for the answers, it makes sense to me :)
    
    Piotrek
    
    > On 31 Oct 2019, at 11:25, Yu Li <ca...@gmail.com> wrote:
    > 
    > Hi Piotr,
    > 
    > Thanks for the comments!
    > 
    > bq. How are you planning to execute the end-to-end benchmarks and integrate
    > them with our build process?
    > Great question! We plan to execute the end-to-end benchmark in a small
    > cluster (like 3 vm nodes) to better reflect network cost, triggering it
    > through our Jenkins service for micro benchmark and show the result on
    > code-speed center. Will add these into FLIP document if no objections.
    > 
    > bq. Are you planning to monitor the throughput and latency at the same time?
    > Good question. And you're right, we will stress the cluster to
    > back-pressure and watch the throughput, latency doesn't mean much in the
    > first test suites. Let me refine the document.
    > 
    > Thanks.
    > 
    > Best Regards,
    > Yu
    > 
    > 
    > On Wed, 30 Oct 2019 at 19:07, Piotr Nowojski <pi...@ververica.com> wrote:
    > 
    >> Hi Yu,
    >> 
    >> Thanks for bringing this up.
    >> 
    >> +1 for the idea and the proposal from my side.
    >> 
    >> I think that the proposed Test Job List might be a bit
    >> redundant/excessive, but:
    >> - we can always adjust this later, once we have the infrastructure in place
    >> - as long as we have the computing resources and ability to quickly
    >> interpret the results/catch regressions, it doesn’t hurt to have more
    >> benchmarks/tests then strictly necessary.
    >> 
    >> Which brings me to a question. How are you planning to execute the
    >> end-to-end benchmarks and integrate them with our build process?
    >> 
    >> Another smaller question:
    >> 
    >>> In this initial stage we will only monitor and display job throughput
    >> and latency.
    >> 
    >> Are you planning to monitor the throughput and latency at the same time?
    >> It might be a bit problematic, as when measuring the throughput you want to
    >> saturate the system and hit some bottleneck, which will cause a
    >> back-pressure (measuring latency at the same time when system is back
    >> pressured doesn’t make much sense).
    >> 
    >> Piotrek
    >> 
    >>> On 30 Oct 2019, at 11:54, Yu Li <ca...@gmail.com> wrote:
    >>> 
    >>> Hi everyone,
    >>> 
    >>> We would like to propose FLIP-83 that adds an end-to-end performance
    >>> testing framework for Flink. We discovered some potential problems
    >> through
    >>> such an internal end-to-end performance testing framework before the
    >>> release of 1.9.0 [1], so we'd like to contribute it to Flink community
    >> as a
    >>> supplement to the existing daily run micro performance benchmark [2] and
    >>> nightly run end-to-end stability test [3].
    >>> 
    >>> The FLIP document could be found here:
    >>> 
    >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
    >>> 
    >>> Please kindly review the FLIP document and let us know if you have any
    >>> comments/suggestions, thanks!
    >>> 
    >>> [1] https://s.apache.org/m8kcq
    >>> [2] https://github.com/dataArtisans/flink-benchmarks
    >>> [3] https://github.com/apache/flink/tree/master/flink-end-to-end-tests
    >>> 
    >>> Best Regards,
    >>> Yu
    >> 
    >> 
    
    


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Piotr Nowojski <pi...@ververica.com>.
Hi Yu,

Thanks for the answers, it makes sense to me :)

Piotrek

> On 31 Oct 2019, at 11:25, Yu Li <ca...@gmail.com> wrote:
> 
> Hi Piotr,
> 
> Thanks for the comments!
> 
> bq. How are you planning to execute the end-to-end benchmarks and integrate
> them with our build process?
> Great question! We plan to execute the end-to-end benchmark in a small
> cluster (like 3 vm nodes) to better reflect network cost, triggering it
> through our Jenkins service for micro benchmark and show the result on
> code-speed center. Will add these into FLIP document if no objections.
> 
> bq. Are you planning to monitor the throughput and latency at the same time?
> Good question. And you're right, we will stress the cluster to
> back-pressure and watch the throughput, latency doesn't mean much in the
> first test suites. Let me refine the document.
> 
> Thanks.
> 
> Best Regards,
> Yu
> 
> 
> On Wed, 30 Oct 2019 at 19:07, Piotr Nowojski <pi...@ververica.com> wrote:
> 
>> Hi Yu,
>> 
>> Thanks for bringing this up.
>> 
>> +1 for the idea and the proposal from my side.
>> 
>> I think that the proposed Test Job List might be a bit
>> redundant/excessive, but:
>> - we can always adjust this later, once we have the infrastructure in place
>> - as long as we have the computing resources and ability to quickly
>> interpret the results/catch regressions, it doesn’t hurt to have more
>> benchmarks/tests then strictly necessary.
>> 
>> Which brings me to a question. How are you planning to execute the
>> end-to-end benchmarks and integrate them with our build process?
>> 
>> Another smaller question:
>> 
>>> In this initial stage we will only monitor and display job throughput
>> and latency.
>> 
>> Are you planning to monitor the throughput and latency at the same time?
>> It might be a bit problematic, as when measuring the throughput you want to
>> saturate the system and hit some bottleneck, which will cause a
>> back-pressure (measuring latency at the same time when system is back
>> pressured doesn’t make much sense).
>> 
>> Piotrek
>> 
>>> On 30 Oct 2019, at 11:54, Yu Li <ca...@gmail.com> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> We would like to propose FLIP-83 that adds an end-to-end performance
>>> testing framework for Flink. We discovered some potential problems
>> through
>>> such an internal end-to-end performance testing framework before the
>>> release of 1.9.0 [1], so we'd like to contribute it to Flink community
>> as a
>>> supplement to the existing daily run micro performance benchmark [2] and
>>> nightly run end-to-end stability test [3].
>>> 
>>> The FLIP document could be found here:
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
>>> 
>>> Please kindly review the FLIP document and let us know if you have any
>>> comments/suggestions, thanks!
>>> 
>>> [1] https://s.apache.org/m8kcq
>>> [2] https://github.com/dataArtisans/flink-benchmarks
>>> [3] https://github.com/apache/flink/tree/master/flink-end-to-end-tests
>>> 
>>> Best Regards,
>>> Yu
>> 
>> 


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Yu Li <ca...@gmail.com>.
Hi Piotr,

Thanks for the comments!

bq. How are you planning to execute the end-to-end benchmarks and integrate
them with our build process?
Great question! We plan to execute the end-to-end benchmark on a small
cluster (e.g. 3 VM nodes) to better reflect network cost, trigger it
through the same Jenkins service used for the micro benchmarks, and show
the results on the codespeed center. We will add these details to the FLIP
document if there are no objections.

bq. Are you planning to monitor the throughput and latency at the same time?
Good question. And you're right: we will stress the cluster into
back-pressure and watch the throughput; latency doesn't mean much in the
first test suites. Let me refine the document.
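
As a hedged illustration of the measurement, a pass-through operator that exposes a throughput Meter via Flink's metric system (the class and metric names are illustrative, not from the FLIP):

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Meter;
    import org.apache.flink.metrics.MeterView;

    // Sketch: forwards records unchanged while reporting records/second,
    // averaged over the last 60 seconds, so throughput can be read while the
    // cluster is stressed into back-pressure.
    public class ThroughputMeter<T> extends RichMapFunction<T, T> {
        private transient Meter throughput;

        @Override
        public void open(Configuration parameters) {
            this.throughput = getRuntimeContext().getMetricGroup()
                    .meter("recordsPerSecond", new MeterView(60));
        }

        @Override
        public T map(T value) {
            throughput.markEvent();
            return value;
        }
    }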

Thanks.

Best Regards,
Yu


On Wed, 30 Oct 2019 at 19:07, Piotr Nowojski <pi...@ververica.com> wrote:

> Hi Yu,
>
> Thanks for bringing this up.
>
> +1 for the idea and the proposal from my side.
>
> I think that the proposed Test Job List might be a bit
> redundant/excessive, but:
> - we can always adjust this later, once we have the infrastructure in place
> - as long as we have the computing resources and ability to quickly
> interpret the results/catch regressions, it doesn’t hurt to have more
> benchmarks/tests then strictly necessary.
>
> Which brings me to a question. How are you planning to execute the
> end-to-end benchmarks and integrate them with our build process?
>
> Another smaller question:
>
> > In this initial stage we will only monitor and display job throughput
> and latency.
>
> Are you planning to monitor the throughput and latency at the same time?
> It might be a bit problematic, as when measuring the throughput you want to
> saturate the system and hit some bottleneck, which will cause a
> back-pressure (measuring latency at the same time when system is back
> pressured doesn’t make much sense).
>
> Piotrek
>
> > On 30 Oct 2019, at 11:54, Yu Li <ca...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > We would like to propose FLIP-83 that adds an end-to-end performance
> > testing framework for Flink. We discovered some potential problems
> through
> > such an internal end-to-end performance testing framework before the
> > release of 1.9.0 [1], so we'd like to contribute it to Flink community
> as a
> > supplement to the existing daily run micro performance benchmark [2] and
> > nightly run end-to-end stability test [3].
> >
> > The FLIP document could be found here:
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> >
> > Please kindly review the FLIP document and let us know if you have any
> > comments/suggestions, thanks!
> >
> > [1] https://s.apache.org/m8kcq
> > [2] https://github.com/dataArtisans/flink-benchmarks
> > [3] https://github.com/apache/flink/tree/master/flink-end-to-end-tests
> >
> > Best Regards,
> > Yu
>
>

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

Posted by Piotr Nowojski <pi...@ververica.com>.
Hi Yu,

Thanks for bringing this up.

+1 for the idea and the proposal from my side.

I think that the proposed Test Job List might be a bit redundant/excessive, but:
- we can always adjust this later, once we have the infrastructure in place
- as long as we have the computing resources and the ability to quickly interpret the results/catch regressions, it doesn’t hurt to have more benchmarks/tests than strictly necessary.

Which brings me to a question. How are you planning to execute the end-to-end benchmarks and integrate them with our build process?

Another smaller question:

> In this initial stage we will only monitor and display job throughput and latency.

Are you planning to monitor the throughput and latency at the same time? It might be a bit problematic: when measuring the throughput you want to saturate the system and hit some bottleneck, which will cause back-pressure (and measuring latency while the system is back-pressured doesn’t make much sense).

Piotrek

> On 30 Oct 2019, at 11:54, Yu Li <ca...@gmail.com> wrote:
> 
> Hi everyone,
> 
> We would like to propose FLIP-83 that adds an end-to-end performance
> testing framework for Flink. We discovered some potential problems through
> such an internal end-to-end performance testing framework before the
> release of 1.9.0 [1], so we'd like to contribute it to Flink community as a
> supplement to the existing daily run micro performance benchmark [2] and
> nightly run end-to-end stability test [3].
> 
> The FLIP document could be found here:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> 
> Please kindly review the FLIP document and let us know if you have any
> comments/suggestions, thanks!
> 
> [1] https://s.apache.org/m8kcq
> [2] https://github.com/dataArtisans/flink-benchmarks
> [3] https://github.com/apache/flink/tree/master/flink-end-to-end-tests
> 
> Best Regards,
> Yu