Posted to github@beam.apache.org by "psolomin (via GitHub)" <gi...@apache.org> on 2023/04/17 09:43:38 UTC

[GitHub] [beam] psolomin commented on issue #26041: [Bug]: Unable to create exactly-once Flink pipeline with stream source and file sink

psolomin commented on issue #26041:
URL: https://github.com/apache/beam/issues/26041#issuecomment-1511027078

   > Without making any additional actions
   
   Correct. I tried to reproduce the data loss again, and it does happen if I omit my `runId` from the file names: existing files are replaced with new files when the app starts from a savepoint. Here's my code:
   
   https://github.com/psolomin/beam-playground/tree/file-sink-app-id-trick/kinesis-io-with-enhanced-fan-out#vanilla-flink
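   
   To illustrate the `runId` idea outside of Beam, here is a minimal sketch. It is not the code from the linked repository; `make_file_name` and all of its parameters are hypothetical, and it only assumes that file names are derived from the window bounds and shard index. Without a per-run identifier, a restarted app produces the same name for the same window and shard, so the new file replaces the old one:

```python
import uuid


def make_file_name(prefix, window_start, window_end, shard, num_shards,
                   run_id=None, suffix=".out"):
    """Build an output file name (hypothetical helper, not a Beam API).

    If run_id is given, it is embedded in the name so that files written
    by different runs of the app never share a name.
    """
    parts = [prefix]
    if run_id is not None:
        parts.append(run_id)
    parts.append(f"{window_start}_{window_end}")
    parts.append(f"{shard:05d}-of-{num_shards:05d}")
    return "-".join(parts) + suffix


# One identifier per app start (assumption: generated once at launch).
first_run_id = uuid.uuid4().hex
second_run_id = uuid.uuid4().hex

window = ("2023-04-17T09:00:00Z", "2023-04-17T09:01:00Z")

# Without a run id, both runs produce the same name: the later write wins.
name_a = make_file_name("out", *window, shard=0, num_shards=2)
name_b = make_file_name("out", *window, shard=0, num_shards=2)

# With a run id, the names differ, so nothing is overwritten.
name_c = make_file_name("out", *window, shard=0, num_shards=2, run_id=first_run_id)
name_d = make_file_name("out", *window, shard=0, num_shards=2, run_id=second_run_id)
```

   This only avoids the overwrite; it does not by itself explain why the restarted app emits the same windows again.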
   
   I still suspect that I have misconfigured my windows, but windows are configured this way in all the examples, including the GCP one: https://cloud.google.com/pubsub/docs/stream-messages-dataflow#stream_messages_from_to
   
   > If so, this should be P1
   
   I would say it should be. If this is expected behaviour, I think the windowing documentation should explain how windows behave when an app is stopped and then restarted from serialised state, and it should include window configuration examples that allow a safe restart from serialised state.

