You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by shrikant bang <ma...@gmail.com> on 2020/12/17 09:35:56 UTC

At least once semantics with KafkaIO.Read

Hi Team,

       I am trying with KafkaIO.Read => FileIO.Write in *Batch* mode with
SparkRunner.

       With this use case, offsets should be committed at the end of the
pipeline once the files are written to the target location.

    I have couple of queries around it:

   1. I tried with commitOffsetsInFinalize(), but the offsets are not
   committed even though the pipeline succeeded.
   Is *commitOffsetsInFinalize() *applicable for only when KafkaIO.Read
   uses stream mode?

   2.  Is there any way we can get offsets back to the driver to commit
   once the pipeline finishes?


Thank You,
Shrikant Bang.

Re: At least once semantics with KafkaIO.Read

Posted by Alexey Romanenko <ar...@gmail.com>.
Hello Shrikant,

There is a known issue about using KafkaIO as Bounded source [1]
Please, take a look on the preceding discussion on this. [2] I’m not sure if there is a workaround for this problem.

[1] https://issues.apache.org/jira/browse/BEAM-6466
[2] https://lists.apache.org/thread.html/bcec8a1fb166029a4adf3f3491c407d49843406020b20f203ec3c2d2@%3Cuser.beam.apache.org%3E

> On 17 Dec 2020, at 10:35, shrikant bang <ma...@gmail.com> wrote:
> 
> Hi Team,
>     
>        I am trying with KafkaIO.Read => FileIO.Write in Batch mode with SparkRunner. 
>        
>        With this use case, offsets should be committed at the end of the pipeline once the files are written to the target location.  
>  
>     I have couple of queries around it:
> I tried with commitOffsetsInFinalize(), but the offsets are not committed even though the pipeline succeeded.  
> Is commitOffsetsInFinalize() applicable for only when KafkaIO.Read uses stream mode?
> 
>  Is there any way we can get offsets back to the driver to commit once the pipeline finishes?
> 
> Thank You,
> Shrikant Bang.