Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/03/23 15:00:40 UTC

[GitHub] [beam] tvalentyn commented on a change in pull request #14189: [BEAM-11935] Updates Dataflow SDK Harness map to set Environment ID

tvalentyn commented on a change in pull request #14189:
URL: https://github.com/apache/beam/pull/14189#discussion_r599651690



##########
File path: sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
##########
@@ -740,6 +719,25 @@ def _apply_sdk_environment_overrides(
       new_payload.container_image = new_container_image
       environment.payload = new_payload.SerializeToString()
 
+    # De-dup environments by Docker container image since currently Dataflow

Review comment:
       Thanks for tagging me on this change. I am working on a related change to reflect pipeline resource hints in the portable proto. Hints are defined in `Environment.resource_hints`. Transforms are mapped to environments, and different transforms can have different hints. Therefore, we can have multiple environments with the same container image but different hints.
   My change is not yet ready for review, but the replication logic looks like this:
   https://github.com/apache/beam/pull/14082/files#diff-252b68d1b24f6f7cdd8c5e54163d4856afad59fd385f5f6a91bf0fe66f09e67dR243

   I think the deduplication logic proposed in this change will be difficult to reconcile with the resource hints representation.
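
To illustrate the concern, here is a minimal sketch in plain Python. It does not use the actual Beam protos or the code from either pull request: the `Env` class, both helper functions, the image name, and the `min_ram` hint key are all hypothetical stand-ins. It only shows that keying de-duplication on the container image alone merges environments whose hints differ, whereas keying on the image plus the hints keeps them apart.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for an environment: an image plus resource hints."""
    container_image: str
    resource_hints: Tuple[Tuple[str, str], ...] = ()


def dedup_by_image(envs: Dict[str, Env]) -> Dict[str, Env]:
    """Keep only the first environment seen for each container image."""
    seen_images = set()
    result = {}
    for env_id, env in envs.items():
        if env.container_image not in seen_images:
            seen_images.add(env.container_image)
            result[env_id] = env
    return result


def dedup_by_image_and_hints(envs: Dict[str, Env]) -> Dict[str, Env]:
    """Keep the first environment seen for each (image, hints) pair."""
    seen_keys = set()
    result = {}
    for env_id, env in envs.items():
        key = (env.container_image, env.resource_hints)
        if key not in seen_keys:
            seen_keys.add(key)
            result[env_id] = env
    return result


# Two environments share an image but carry different (made-up) hints.
envs = {
    "env_a": Env("example/beam_python:latest", (("min_ram", "1GB"),)),
    "env_b": Env("example/beam_python:latest", (("min_ram", "16GB"),)),
}

by_image = dedup_by_image(envs)
by_image_and_hints = dedup_by_image_and_hints(envs)
```

In this sketch `by_image` retains only `env_a`, so the hints attached to `env_b` are lost, while `by_image_and_hints` retains both environments.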




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org