Posted to dev@beam.apache.org by Moritz Mack <mm...@talend.com> on 2022/09/13 09:54:31 UTC

[Infrastructure] Periodically run Java microbenchmarks on Jenkins

Hi team,

I’m looking for some help setting up infrastructure to periodically run Java microbenchmarks (JMH).
Results of these runs will be added to our community metrics (InfluxDB) to help us track performance, see [1].

To prevent noisy runs, this would require a dedicated Jenkins machine that runs at most one job (benchmark) at a time. Benchmark runs take quite some time, but on the other hand they don’t have to run very frequently (once a week should be fine initially).

Thanks so much,
Moritz

[1] https://github.com/apache/beam/pull/23041
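
For illustration only, here is a minimal sketch of how a single benchmark score could be rendered as an InfluxDB line-protocol record before being sent to the metrics database. The class name, the measurement name `java_jmh`, the tag `benchmark`, the field `score`, and the sample benchmark name are all assumptions made for this example; they are not taken from the PR in [1].

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Beam or of the PR in [1].
final class JmhInfluxLine {

  // Escapes the characters that line protocol treats as separators in tag values.
  static String escapeTagValue(String value) {
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ");
  }

  // Builds "<measurement>,benchmark=<name> score=<value> <timestamp in ns>".
  // The measurement name is assumed to contain no commas or spaces.
  static String toLine(String measurement, String benchmark, double score, long timestampNanos) {
    return String.format(
        Locale.ROOT,
        "%s,benchmark=%s score=%s %d",
        measurement,
        escapeTagValue(benchmark),
        Double.toString(score),
        timestampNanos);
  }

  public static void main(String[] args) {
    // Benchmark name and score are made-up sample values.
    long nowNanos = System.currentTimeMillis() * 1_000_000L;
    System.out.println(toLine("java_jmh", "SampleBenchmark.run", 1234.5, nowNanos));
  }
}
```

Keeping the benchmark name as a tag and the score as a field would allow one time series per benchmark; whether that matches the existing community metrics schema is not something this thread states.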

As a recipient of an email from Talend, your contact personal data will be on our systems. Please see our privacy notice. <https://www.talend.com/privacy/>

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Kenneth Knowles <ke...@apache.org>.

We've got an "our infrastructure" section on the wiki. I expect it is
probably not super up to date.

On Thu, Sep 15, 2022 at 9:56 AM Brian Hulette via dev <de...@beam.apache.org>
wrote:

> Is there somewhere we could document this?

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Brian Hulette via dev <de...@beam.apache.org>.

Is there somewhere we could document this?

On Thu, Sep 15, 2022 at 6:45 AM Moritz Mack <mm...@talend.com> wrote:

> Thank you, Andrew!
>
> Exactly what I was looking for, that’s awesome!

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Moritz Mack <mm...@talend.com>.

Thank you, Andrew!
Exactly what I was looking for, that’s awesome!

On 15.09.22, 06:37, "Alexey Romanenko" <ar...@gmail.com> wrote:

Ahh, great! I didn’t know that the 'beam-perf' label is used for that.
Thanks!

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Alexey Romanenko <ar...@gmail.com>.

Ahh, great! I didn’t know that the 'beam-perf' label is used for that.
Thanks!

> On 14 Sep 2022, at 17:47, Andrew Pilloud <ap...@apache.org> wrote:
> 
> We do have a dedicated machine for benchmarks. This is a single
> machine limited to running one test at a time. Set the
> jenkinsExecutorLabel for the job to 'beam-perf' to use it. For
> example:
> https://github.com/apache/beam/blob/66bbee84ed477d86008905646e68b100591b6f78/.test-infra/jenkins/job_PostCommit_Java_Nexmark_Direct.groovy#L36
> 
> Andrew

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Andrew Pilloud <ap...@apache.org>.

We do have a dedicated machine for benchmarks. This is a single
machine limited to running one test at a time. Set the
jenkinsExecutorLabel for the job to 'beam-perf' to use it. For
example:
https://github.com/apache/beam/blob/66bbee84ed477d86008905646e68b100591b6f78/.test-infra/jenkins/job_PostCommit_Java_Nexmark_Direct.groovy#L36

Andrew

On Wed, Sep 14, 2022 at 8:28 AM Alexey Romanenko
<ar...@gmail.com> wrote:
>
> I think it depends on why we want to run these benchmarks. Ideally, we would run them on the same dedicated machine(s) with the same configuration every time, but I’m not sure that can be achieved with our current infrastructure.
>
> On the other hand, IIRC, the initial goal of benchmarks like Nexmark was to quickly detect major regressions, especially between releases, whose detection is not so sensitive to ideal conditions. And here we have room for improvement.
>
> —
> Alexey

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Alexey Romanenko <ar...@gmail.com>.

I think it depends on why we want to run these benchmarks. Ideally, we would run them on the same dedicated machine(s) with the same configuration every time, but I’m not sure that can be achieved with our current infrastructure.

On the other hand, IIRC, the initial goal of benchmarks like Nexmark was to quickly detect major regressions, especially between releases, whose detection is not so sensitive to ideal conditions. And here we have room for improvement.

—
Alexey
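
As a rough illustration of the regression-detection goal described above: a coarse threshold check flags only large drops, so it tolerates a fair amount of run-to-run noise. The class and method names, the "higher is better" score convention, and the 20% tolerance used in `main` are assumptions for this sketch, not a description of what Beam's Nexmark tooling does.

```java
// Hypothetical helper for illustration; not taken from Beam's benchmark tooling.
final class RegressionCheck {

  // Returns true when `current` falls below `baseline` by more than `tolerance`
  // (a fraction, e.g. 0.2 for 20%). Assumes a "higher is better" score such as throughput.
  static boolean isMajorRegression(double baseline, double current, double tolerance) {
    if (baseline <= 0 || tolerance < 0 || tolerance >= 1) {
      throw new IllegalArgumentException("baseline must be > 0 and tolerance in [0, 1)");
    }
    return current < baseline * (1.0 - tolerance);
  }

  public static void main(String[] args) {
    // Sample values only: a drop from 100 to 75 exceeds a 20% tolerance, a drop to 90 does not.
    System.out.println(isMajorRegression(100.0, 75.0, 0.2));
    System.out.println(isMajorRegression(100.0, 90.0, 0.2));
  }
}
```

A wide tolerance like this trades sensitivity for robustness: it would miss small slowdowns, which is where a dedicated, quiet machine becomes more important.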

> On 13 Sep 2022, at 22:57, Kenneth Knowles <ke...@apache.org> wrote:
> 
> Good idea. I'm curious about our current benchmarks. Some of them run on clusters, but I think some of them are running locally and just being noisy. Perhaps this could improve that. (or if they are running on local Spark/Flink then maybe the results are not really meaningful anyhow)

Re: [Infrastructure] Periodically run Java microbenchmarks on Jenkins

Posted by Kenneth Knowles <ke...@apache.org>.

Good idea. I'm curious about our current benchmarks. Some of them run on
clusters, but I think some of them are running locally and just being
noisy. Perhaps this could improve that. (or if they are running on local
Spark/Flink then maybe the results are not really meaningful anyhow)
