Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 17:13:45 UTC

[GitHub] [beam] damccorm opened a new issue, #20402: Combine Python streaming load test is too slow on Flink

damccorm opened a new issue, #20402:
URL: https://github.com/apache/beam/issues/20402

   One of the Combine load test cases, which applies a global combiner to a data stream of 200M elements, runs too slowly on Flink. Flink processes only half of that stream within 1 hour, which is too long for a Jenkins job.
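
As a rough back-of-the-envelope check of the figures above, the following sketch estimates the observed throughput and the time a full run would need. It assumes "half" means exactly 50% and that throughput stays constant over the run; neither is stated in the report.

```python
# Rough estimate derived only from the numbers quoted in this issue.
TOTAL_ELEMENTS = 200_000_000   # size of the input stream (from the report)
OBSERVED_FRACTION = 0.5        # "half" of the stream, assumed to be exactly 50%
OBSERVED_SECONDS = 60 * 60     # the 1 hour window (from the report)


def throughput(elements, seconds):
    """Average number of elements processed per second."""
    return elements / seconds


def estimated_total_seconds(total_elements, rate):
    """Time to process everything, assuming the rate stays constant."""
    return total_elements / rate


rate = throughput(TOTAL_ELEMENTS * OBSERVED_FRACTION, OBSERVED_SECONDS)
full_run = estimated_total_seconds(TOTAL_ELEMENTS, rate)
print(f"~{rate:,.0f} elements/s, ~{full_run / 3600:.1f} h for the full stream")
```

Under these assumptions the job would need roughly two hours to finish.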
   
   Job's definition: https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_Combine_Flink_Python.groovy#L36
   
   Test pipeline: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/load_tests/combine_test.py
   
   
   Imported from Jira [BEAM-10852](https://issues.apache.org/jira/browse/BEAM-10852). Original Jira may contain additional context.
   Reported by: kamilwu.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on issue #20402: Combine Python streaming load test is too slow on Flink

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #20402:
URL: https://github.com/apache/beam/issues/20402#issuecomment-1563657143

   The cause is that the Python Flink test does not take advantage of parallelism (it runs on a single core) even though the Flink cluster has multiple workers. This was found in https://github.com/apache/beam/pull/26893#issuecomment-1563341002
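
To illustrate why a global combine does not have to be a single-core operation in principle, here is a minimal plain-Python sketch. It is not the Flink runner's implementation and does not use Beam at all; `MeanCombiner`, `combine_shard` and `combine_globally` are hypothetical names, and only the four method names are borrowed from Beam's `CombineFn` interface. Each shard is reduced to a partial accumulator independently, and only the small accumulators are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor


class MeanCombiner:
    """Hypothetical stand-in that mirrors the method names of Beam's CombineFn."""

    def create_accumulator(self):
        return (0, 0)  # (running sum, element count)

    def add_input(self, acc, value):
        total, count = acc
        return (total + value, count + 1)

    def merge_accumulators(self, accumulators):
        accumulators = list(accumulators)
        if not accumulators:
            return self.create_accumulator()
        totals, counts = zip(*accumulators)
        return (sum(totals), sum(counts))

    def extract_output(self, acc):
        total, count = acc
        return total / count if count else float("nan")


def combine_shard(combiner, shard):
    """Reduce one shard to a partial accumulator, independently of the others."""
    acc = combiner.create_accumulator()
    for value in shard:
        acc = combiner.add_input(acc, value)
    return acc


def combine_globally(combiner, shards, workers=4):
    """Pre-combine every shard, then merge the partial accumulators."""
    # Threads only show the structure here; CPython threads do not give
    # real CPU parallelism for this kind of work.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda s: combine_shard(combiner, s), shards))
    return combiner.extract_output(combiner.merge_accumulators(partials))


data = list(range(1, 101))
shards = [data[i::4] for i in range(4)]  # four interleaved shards
result = combine_globally(MeanCombiner(), shards)
print(result)
```

If all elements end up in a single shard, as the observation above suggests for this test, the pre-combine step degenerates to one worker doing all the work.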
   




[GitHub] [beam] Abacn commented on issue #20402: Combine Python load test (batch/stream) no parallelism

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #20402:
URL: https://github.com/apache/beam/issues/20402#issuecomment-1565150835

   This indicates a real issue in the Flink runner: Combine runs on a single core.

