You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "jrmccluskey (via GitHub)" <gi...@apache.org> on 2023/03/07 20:58:15 UTC

[GitHub] [beam] jrmccluskey commented on issue #25709: [Bug]: Data not being forwarded correctly to Stateful DoFn in Python SDK + PortableRunner

jrmccluskey commented on issue #25709:
URL: https://github.com/apache/beam/issues/25709#issuecomment-1458865446

   FWIW I have slightly modified this pipeline to execute on dataflow to check behavior there and got behavior like Flink's:
   
   ```
   INFO 2023-03-07T20:44:08.027822017Z Input to DummyDoFn: (1, 4)
   INFO 2023-03-07T20:44:08.042022228Z Input to DummyDoFn: (1, 5)
   INFO 2023-03-07T20:44:08.042737245Z Input to DummyDoFn: (1, 3)
   INFO 2023-03-07T20:44:08.051712751Z Input to DummyDoFn: (1, 8)
   INFO 2023-03-07T20:44:08.056941747Z Input to DummyDoFn: (1, 7)
   INFO 2023-03-07T20:44:08.057651281Z Input to DummyDoFn: (1, 1)
   INFO 2023-03-07T20:44:08.072991371Z Input to DummyDoFn: (1, 0)
   INFO 2023-03-07T20:44:08.084101915Z Input to DummyDoFn: (1, 2)
   INFO 2023-03-07T20:44:08.086365222Z Input to DummyDoFn: (1, 9)
   INFO 2023-03-07T20:44:08.088879585Z Input to DummyDoFn: (1, 6)
   INFO 2023-03-07T20:44:08.299842119Z Input to DummyReadModifyWriteStateDoFn: (1, 4)
   INFO 2023-03-07T20:44:08.300017595Z Final output(1, 4)
   INFO 2023-03-07T20:44:08.355724573Z Input to DummyReadModifyWriteStateDoFn: (1, 3)
   INFO 2023-03-07T20:44:08.355904102Z Final output(1, 3)
   INFO 2023-03-07T20:44:08.356028318Z Input to DummyReadModifyWriteStateDoFn: (1, 6)
   INFO 2023-03-07T20:44:08.356083869Z Final output(1, 6)
   INFO 2023-03-07T20:44:08.356168270Z Input to DummyReadModifyWriteStateDoFn: (1, 8)
   INFO 2023-03-07T20:44:08.356222152Z Final output(1, 8)
   INFO 2023-03-07T20:44:15.685304403Z Input to DummyReadModifyWriteStateDoFn: (1, 7)
   INFO 2023-03-07T20:44:15.685476541Z Final output(1, 7)
   INFO 2023-03-07T20:44:15.733944654Z Input to DummyReadModifyWriteStateDoFn: (1, 5)
   INFO 2023-03-07T20:44:15.734126329Z Final output(1, 5)
   INFO 2023-03-07T20:44:15.734240531Z Input to DummyReadModifyWriteStateDoFn: (1, 2)
   INFO 2023-03-07T20:44:15.734311342Z Final output(1, 2)
   INFO 2023-03-07T20:44:15.750033617Z Input to DummyReadModifyWriteStateDoFn: (1, 0)
   INFO 2023-03-07T20:44:15.750293016Z Final output(1, 0)
   INFO 2023-03-07T20:44:15.750429391Z Input to DummyReadModifyWriteStateDoFn: (1, 9)
   INFO 2023-03-07T20:44:15.750485658Z Final output(1, 9)
   INFO 2023-03-07T20:44:15.750657320Z Input to DummyReadModifyWriteStateDoFn: (1, 1)
   INFO 2023-03-07T20:44:15.750710273Z Final output(1, 1)
   ```
   
   My guess is that the `Create` function is outputting the elements as a single bundle on Flink and Dataflow. This also shouldn't be blocking in a streaming context since you're doing aggregations based on windows and/or triggers (or jsut processing input as you get it,) the unbounded nature of the input is accounted for. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org