Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 16:16:12 UTC

[GitHub] [beam] damccorm opened a new issue, #20214: Mistakes Computing Composite Inputs and Outputs

damccorm opened a new issue, #20214:
URL: https://github.com/apache/beam/issues/20214

   The Go SDK uses a Scope object to manage Beam composites.
   
   A bug was discovered when a PCollection is consumed both in the composite that created it and in a separate composite.
   
   Further, the Go SDK should verify that the root hypergraph structure is a DAG and provide a reasonable error if it is not.  In particular, the leaf nodes of the graph can form a DAG while the way the beam.Scope object is used causes the hypergraph of composites not to be a DAG.
   
   E.g., it's possible to write the following in the Go SDK.
   
   Given PTransforms A, B, and C, PCollections colA and colB, and composites a and b:
   A and C are in a, and B is in b.
   A generates colA
   B consumes colA, and generates colB.
   C consumes colA and colB.
   
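   ```
   // s is the root beam.Scope; <doFn> stands for any DoFn.
   a := s.Scope("a")
   b := s.Scope("b")
   colA := beam.Impulse(a)                                    // A, in composite a
   colB := beam.ParDo(b, <doFn>, colA)                        // B, in composite b
   beam.ParDo0(a, <doFn>, colA, beam.SideInput{Input: colB})  // C, in composite a
   ```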
   
   If it doesn't already, the Go SDK must emit a clear error and fail pipeline construction in this case.
   
   If the affected composites are roots in the graph, the cycle makes it impossible to topologically sort the root PTransforms of the pipeline graph, which can adversely affect runners.
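
   As a rough illustration of the check being asked for (not the SDK's actual implementation), the composite-level graph can be tested for cycles with Kahn's algorithm. The sketch below is standalone Go with no Beam dependency. The edge type and the sortComposites function are hypothetical names made up for this example, and it assumes every composite named in an edge is also listed in nodes.

```go
package main

import "fmt"

// edge is a hypothetical record: a transform in composite From produces a
// PCollection that a transform in composite To consumes.
type edge struct{ From, To string }

// sortComposites runs Kahn's algorithm over the composite-level graph.
// It returns a topological order, or an error listing the composites that
// could not be ordered because they sit on, or downstream of, a cycle.
func sortComposites(nodes []string, edges []edge) ([]string, error) {
	indegree := make(map[string]int, len(nodes))
	for _, n := range nodes {
		indegree[n] = 0
	}
	succ := make(map[string][]string)
	seen := make(map[edge]bool)
	for _, e := range edges {
		// Skip duplicates and edges that stay inside a single composite;
		// neither can create a cycle between composites.
		if e.From == e.To || seen[e] {
			continue
		}
		seen[e] = true
		succ[e.From] = append(succ[e.From], e.To)
		indegree[e.To]++
	}

	// Start from composites that consume nothing from other composites.
	var ready []string
	for _, n := range nodes {
		if indegree[n] == 0 {
			ready = append(ready, n)
		}
	}
	var order []string
	for len(ready) > 0 {
		n := ready[0]
		ready = ready[1:]
		order = append(order, n)
		for _, m := range succ[n] {
			indegree[m]--
			if indegree[m] == 0 {
				ready = append(ready, m)
			}
		}
	}

	if len(order) != len(nodes) {
		var stuck []string
		for _, n := range nodes {
			if indegree[n] > 0 {
				stuck = append(stuck, n)
			}
		}
		return nil, fmt.Errorf("composites do not form a DAG; could not order %v", stuck)
	}
	return order, nil
}

func main() {
	// The example from this issue: A and C are in composite a, B is in b.
	edges := []edge{
		{"a", "b"}, // colA: produced by A (in a), consumed by B (in b)
		{"a", "a"}, // colA: produced by A (in a), consumed by C (in a)
		{"b", "a"}, // colB: produced by B (in b), consumed by C (in a)
	}
	_, err := sortComposites([]string{"a", "b"}, edges)
	fmt.Println(err)
}
```

   With the edges from the example, a and b each depend on the other, so neither ever becomes ready and the function returns an error instead of an order. A real check would have to derive these edges from the SDK's own graph representation.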
   
   The recommendation is to always wrap uses of beam.Scope in functions or other scopes, which prevents such incorrect constructions.
   
   
   
   
   Imported from Jira [BEAM-9959](https://issues.apache.org/jira/browse/BEAM-9959). Original Jira may contain additional context.
   Reported by: lostluck.

