Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 21:19:57 UTC

[GitHub] [beam] kennknowles opened a new issue, #18957: Element type inference doesn't work for multi-output DoFns

kennknowles opened a new issue, #18957:
URL: https://github.com/apache/beam/issues/18957

   TLDR: if you have a multi-output DoFn, the non-main output PCollections will incorrectly have their element types set to None. This breaks type checking for any pipeline that consumes these PCollections.
   
   Minimal example:
   ```
   import apache_beam as beam


   class TripleDoFn(beam.DoFn):
     def process(self, elem):
       # Main output: every element passes through unchanged.
       yield elem
       # Tagged outputs: emitted only for multiples of 2 and of 3.
       if elem % 2 == 0:
         yield beam.pvalue.TaggedOutput('ten_times', elem * 10)
       if elem % 3 == 0:
         yield beam.pvalue.TaggedOutput('hundred_times', elem * 100)


   @beam.typehints.with_input_types(int)
   @beam.typehints.with_output_types(int)
   class MultiplyBy(beam.DoFn):
     def __init__(self, multiplier):
       self._multiplier = multiplier

     def process(self, elem):
       # process must produce an iterable, so yield rather than return.
       yield elem * self._multiplier


   def main():
     with beam.Pipeline() as p:
       x, a, b = (
           p
           | 'Create' >> beam.Create([1, 2, 3])
           | 'TripleDo' >> beam.ParDo(TripleDoFn()).with_outputs(
               'ten_times', 'hundred_times', main='main_output'))

       # The type check fails on this line.
       _ = a | 'MultiplyBy2' >> beam.ParDo(MultiplyBy(2))


   if __name__ == '__main__':
     main()
   ```
   
   Running this yields the following error:
   ```
   apache_beam.typehints.decorators.TypeCheckError: Type hint violation for 'MultiplyBy2': requires <type 'int'> but got None for elem
   ```
   
   Replacing `a` with `b` yields the same error. Replacing `a` with `x` instead yields the following error:
   ```
   apache_beam.typehints.decorators.TypeCheckError: Type hint violation for 'MultiplyBy2': requires <type 'int'> but got Union[TaggedOutput, int] for elem
   ```
   
   I would expect Beam to correctly infer that `a` and `b` have element types of `int` rather than `None`, and I would also expect Beam to correctly figure out that the element types of `x` are compatible with `int`.
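
To make the expectation for `x` concrete, here is a toy sketch of the narrowing step: drop the `TaggedOutput` member from the inferred union, since tagged values never reach the main output. It uses only the standard `typing` module. The `TaggedOutput` class below is a hypothetical stand-in and `main_output_type` is an illustrative helper; neither is Beam's API nor a claim about how Beam implements inference.

```python
from typing import Union, get_args, get_origin


class TaggedOutput:
    """Hypothetical stand-in for beam.pvalue.TaggedOutput (not the real class)."""

    def __init__(self, tag, value):
        self.tag = tag
        self.value = value


def main_output_type(hint):
    """Return `hint` with any TaggedOutput member removed from a Union.

    Illustrative helper only: non-Union hints are returned unchanged, and
    None is returned when nothing but TaggedOutput remains.
    """
    if get_origin(hint) is not Union:
        return hint
    members = tuple(m for m in get_args(hint) if m is not TaggedOutput)
    if not members:
        return None
    # A Union with a single remaining member collapses to that member.
    return Union[members]


# The type reported for `x` in the second error message above.
inferred = Union[TaggedOutput, int]
print(main_output_type(inferred))
```

Under this sketch, the union reported for `x` narrows to `int`, which is what `MultiplyBy` declares as its input type.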
   
   Imported from Jira [BEAM-4132](https://issues.apache.org/jira/browse/BEAM-4132). Original Jira may contain additional context.
   Reported by: chuanyu.

