Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/10/01 23:20:33 UTC

[GitHub] [beam] robertwb commented on a change in pull request #15642: Allow multiple Python worker processes to share the same VM.

robertwb commented on a change in pull request #15642:
URL: https://github.com/apache/beam/pull/15642#discussion_r720584648



##########
File path: model/pipeline/src/main/proto/beam_runner_api.proto
##########
@@ -1559,6 +1559,11 @@ message StandardProtocols {
     // cores.
     MULTI_CORE_BUNDLE_PROCESSING = 3 [(beam_urn) = "beam:protocol:multi_core_bundle_processing:v1"];
 
+    // Indicates this SDK can cheaply spawn sibling workers (e.g. within the
+    // same container) to work around the fact that it cannot take advantage
+    // of multiple cores (i.e. MULTI_CORE_BUNDLE_PROCESSING is not set).
+    SIBLING_WORKERS = 5 [(beam_urn) = "beam:protocol:sibling_workers:v1"];

Review comment:
       Yes, sub-interpreters sharing the same connection to the runner would have many of the same downsides as a multi-process worker, in that it's expensive to transfer objects (though maybe slightly cheaper than between processes, and some simple (immutable, leaf) types can be shared). If one can't easily share objects, there's little benefit to using sub-interpreters over independent processes. (Something to possibly keep an eye on, though.)
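
       To make the "expensive to transfer objects" point concrete, here is a minimal sketch (not Beam code). It assumes objects cross the worker boundary by being pickled, which is what Python's multiprocessing does by default; `send_across_boundary` and `record` are hypothetical names used only for illustration.

```python
import pickle


def send_across_boundary(obj):
    """Simulate handing obj to a sibling worker: serialize, then rebuild.

    Hypothetical helper. A real transfer would also move the bytes over a
    pipe or socket, which only adds to the cost shown here.
    """
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(payload), len(payload)


record = {"key": "user-1", "values": list(range(1000))}
received, wire_bytes = send_across_boundary(record)

# The receiver gets an equal but distinct object: nothing is shared,
# so every transfer pays for a full serialize/deserialize round trip.
print(received == record)                       # equal by value
print(received is record)                       # but not the same object
print(received["values"] is record["values"])   # nested data is copied too
print(wire_bytes > 0)                           # bytes had to be produced
```

       If sub-interpreters end up paying a comparable copy cost for anything beyond simple immutable values, they gain little over independent sibling processes, which is the trade-off described above.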



