Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/10/01 23:09:54 UTC

[GitHub] [beam] lukecwik commented on a change in pull request #15642: Allow multiple Python worker processes to share the same VM.

lukecwik commented on a change in pull request #15642:
URL: https://github.com/apache/beam/pull/15642#discussion_r720581805



##########
File path: model/pipeline/src/main/proto/beam_runner_api.proto
##########
@@ -1559,6 +1559,11 @@ message StandardProtocols {
     // cores.
     MULTI_CORE_BUNDLE_PROCESSING = 3 [(beam_urn) = "beam:protocol:multi_core_bundle_processing:v1"];
 
+    // Indicates this SDK can cheaply spawn sibling workers (e.g. within the
+    // same container) to work around the fact that it cannot take advantage
+    // of multiple cores (i.e. MULTI_CORE_BUNDLE_PROCESSING is not set).
+    SIBLING_WORKERS = 5 [(beam_urn) = "beam:protocol:sibling_workers:v1"];

Review comment:
       Do you think we could get Python to finally support multi-core bundle processing by using sub-interpreters (PEP 554) instead of sibling workers? Or is the overhead impractical, because we would still pay a high cost to transfer objects between the main interpreter, which would hold the connections to the runner, and the sub-interpreters that would do the work?
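
To make the transfer-cost concern concrete, here is a rough sketch of how one might estimate the per-element overhead. It assumes that an element handed from the interpreter holding the runner connections to a sub-interpreter (or a sibling process) has to be serialized and deserialized, and it uses `pickle` as a stand-in for that step. The function name `transfer_cost` and the sample element are hypothetical; this is not Beam's actual data path.

```python
import pickle
import time


def transfer_cost(element, repeats=1000):
    """Return (payload_bytes, seconds_per_round_trip) for one element.

    Assumption: pickle approximates the serialization needed to move an
    object between interpreters or processes. Real coders may be cheaper
    or more expensive.
    """
    payload = pickle.dumps(element, protocol=pickle.HIGHEST_PROTOCOL)
    copied = pickle.loads(payload)
    # Sanity check: the receiving side sees an equal object.
    assert copied == element

    start = time.perf_counter()
    for _ in range(repeats):
        pickle.loads(pickle.dumps(element, protocol=pickle.HIGHEST_PROTOCOL))
    elapsed = time.perf_counter() - start
    return len(payload), elapsed / max(repeats, 1)


if __name__ == "__main__":
    # Hypothetical element shape, chosen only for illustration.
    element = {"key": "user-123", "values": list(range(1000))}
    size, seconds = transfer_cost(element)
    print(f"{size} bytes, {seconds * 1e6:.1f} us per round trip")
```

If the measured round trip is small relative to the per-element processing time, sub-interpreters might be worth exploring; if it dominates, sibling workers that each hold their own runner connections avoid the extra hop.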
   



