Posted to dev@beam.apache.org by Kamil Wasilewski <ka...@polidea.com> on 2020/01/02 13:54:55 UTC
Re: Performance drops in Python PortableRunner tests

Robert, you can find the pipeline for this particular test here:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/load_tests/pardo_test.py

The documentation for running these kinds of tests, including how to set up a
Flink cluster, is on CWIKI:
https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide#ContributionTestingGuide-TestsofCoreApacheBeamOperations
Hope this helps.


On Fri, Dec 20, 2019 at 7:10 PM Pablo Estrada <pa...@google.com> wrote:

> The Jenkins job for the Flink load tests:
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy
>
> The documentation for the test explains how to run it on each runner:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/load_tests/pardo_test.py#L17
>
> I assume that standing up the Flink cluster should be done separately.
>
> LMK if that helps, Robert.
> -P.
>
> On Fri, Dec 20, 2019 at 9:59 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> Yes, it is possible that this had an influence--Reads are now all
>> implemented as SDFs and Creates involve a reshuffle to better
>> redistribute data. This much of a change is quite surprising. Where is
>> the pipeline for, say, "Python | ParDo | 2GB, 100 byte records, 10
>> iterations | Batch" and how does one run it?
>>
>> On Fri, Dec 20, 2019 at 6:50 AM Kamil Wasilewski
>> <ka...@polidea.com> wrote:
>> >
>> > Hi all,
>> >
>> > We have a couple of Python load tests running on Flink in which we are
>> > testing the performance of ParDo, GroupByKey, CoGroupByKey and Combine
>> > operations.
>> >
>> > Recently, I've discovered that the runtime of all those tests rose
>> > significantly. It happened between the 6th and 7th of December (the tests
>> > run daily). Here are the dashboards where you can see the results:
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5649695233802240
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5698549949923328
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5678187241537536
>> >
>> > I've seen that in that period we submitted some changes to the core,
>> > including the Read transform. Do you think this might have influenced the
>> > results?
>> >
>> > Thanks,
>> > Kamil
>>
>