Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 17:13:45 UTC
[GitHub] [beam] damccorm opened a new issue, #20402: Combine Python streaming load test is too slow on Flink
damccorm opened a new issue, #20402:
URL: https://github.com/apache/beam/issues/20402
One of the Combine load test cases, which involves a global combiner and a data stream of 200M elements, takes too long on Flink. Flink processes only half of that data stream within 1 hour, which is too long for a Jenkins job.
Job's definition: https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_Combine_Flink_Python.groovy#L36
Test pipeline: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/load_tests/combine_test.py
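For a rough sense of scale, the figures above can be turned into a throughput estimate. This is only a back-of-the-envelope sketch: it assumes that "half" means exactly 100M elements and that the processing rate stays constant, neither of which is stated in the report. The function names are made up for illustration.

```python
# Back-of-the-envelope estimate from the numbers in the report.
# Assumptions (not from the report): exactly half of the stream was
# processed, and the processing rate is constant over the whole run.

TOTAL_ELEMENTS = 200_000_000   # size of the data stream
PROCESSED_FRACTION = 0.5       # "only half" within the time limit
ELAPSED_SECONDS = 60 * 60      # 1 hour


def throughput(total, fraction, seconds):
    """Elements processed per second over the observed window."""
    return total * fraction / seconds


def projected_runtime_seconds(total, rate):
    """Time needed to process the full stream at a constant rate."""
    return total / rate


rate = throughput(TOTAL_ELEMENTS, PROCESSED_FRACTION, ELAPSED_SECONDS)
full_run = projected_runtime_seconds(TOTAL_ELEMENTS, rate)

print(f"observed rate: {rate:,.0f} elements/s")
print(f"projected full run: {full_run / 3600:.1f} hours")
```

Under these assumptions the observed rate is on the order of tens of thousands of elements per second, and a full run would need about twice the one-hour budget.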
Imported from Jira [BEAM-10852](https://issues.apache.org/jira/browse/BEAM-10852). Original Jira may contain additional context.
Reported by: kamilwu.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] Abacn commented on issue #20402: Combine Python streaming load test is too slow on Flink
Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #20402:
URL: https://github.com/apache/beam/issues/20402#issuecomment-1563657143
The cause is that the Python Flink test does not take advantage of parallelism (it runs on a single core) even though the Flink cluster has multiple workers. This was found in https://github.com/apache/beam/pull/26893#issuecomment-1563341002
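To illustrate why parallelism matters even for a global combine, here is a minimal pure-Python sketch of the usual pattern: each shard is folded into a partial accumulator, and the partial accumulators are merged at the end. This is not the Flink runner's implementation and does not use Beam. The mean combiner, the sharding scheme, and all names here are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_mean(shard):
    """Fold one shard into a (sum, count) accumulator."""
    total, count = 0, 0
    for value in shard:
        total += value
        count += 1
    return total, count


def merge(accumulators):
    """Merge partial (sum, count) accumulators into one."""
    total = sum(acc[0] for acc in accumulators)
    count = sum(acc[1] for acc in accumulators)
    return total, count


def global_mean(values, num_shards):
    """Compute a global mean by combining per-shard partial results."""
    # Round-robin sharding; a real runner would distribute by worker.
    shards = [values[i::num_shards] for i in range(num_shards)]
    # Threads only show the structure here; in CPython they do not give
    # CPU parallelism, whereas separate workers in a cluster would.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        accumulators = list(pool.map(partial_mean, shards))
    total, count = merge(accumulators)
    return total / count if count else float("nan")


if __name__ == "__main__":
    data = list(range(10))
    # The result does not depend on the number of shards.
    print(global_mean(data, 1), global_mean(data, 4))
```

With `num_shards=1` all of the per-element work lands on a single worker, which corresponds to the single-core behaviour described above. With more shards only the small merge step is serial.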
[GitHub] [beam] Abacn commented on issue #20402: Combine Python load test (batch/stream) no parallelism
Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #20402:
URL: https://github.com/apache/beam/issues/20402#issuecomment-1565150835
This indicates a real issue in the Flink runner: Combine runs on a single core.