You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/21 21:10:29 UTC

[GitHub] [beam] TheNeuralBit commented on a diff in pull request #22198: Update element_type inference (default_type_hints) for batched DoFns with yields_batches/yields_elements

TheNeuralBit commented on code in PR #22198:
URL: https://github.com/apache/beam/pull/22198#discussion_r926132674


##########
sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:
##########
@@ -357,13 +357,15 @@ def infer_output_type(self, input_type):
         return np.int64
 
     with self.create_pipeline() as p:
-      res = (
+      pc = (
           p
           | beam.Create(np.array([1, 2, 3], dtype=np.int64)).with_output_types(
               np.int64)
-          | beam.ParDo(ArrayProduceDoFn())
-          | beam.ParDo(ArrayMultiplyDoFn())
-          | beam.Map(lambda x: x * 3))
+          | beam.ParDo(ArrayProduceDoFn()))
+
+      self.assertEqual(pc.element_type, np.int64)

Review Comment:
   Fair point, I initially added the test here for expediency, but now I have the concise test in `batch_dofn_test`. I'll drop this change.



##########
sdks/python/apache_beam/transforms/core.py:
##########
@@ -708,6 +708,36 @@ def get_function_arguments(self, func):
 
   def default_type_hints(self):
     fn_type_hints = typehints.decorators.IOTypeHints.from_callable(self.process)
+    batch_fn_type_hints = typehints.decorators.IOTypeHints.from_callable(
+        self.process_batch)
+
+    if fn_type_hints is not None:

Review Comment:
   yeah agreed this logic was not pretty. I think part of the issue is that it's has to deal with both `None`, and the possibility of `IOTypeHints.empty()` in both sets of typehints. I rewrote it to collapse those two possibilities, and also separated concerns a bit, first we deal with process_yields_batches, then deal with process_batch_yields_elements. WDYT?
   
   Can you clarify the "TODO comment why does this use with_input_types_from"?



##########
sdks/python/apache_beam/transforms/batch_dofn_test.py:
##########
@@ -170,6 +173,29 @@ def process_batch(self, batch, *args, **kwargs):
     yield [element * 2 for element in batch]
 
 
+class MismatchedBatchProducingDoFn(beam.DoFn):
+  """A DoFn that produces batches from both process and process_batch, with
+  mismatched types. Should yield a construction time error when applied."""

Review Comment:
   Thanks, done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org