Posted to issues@beam.apache.org by "Sam Rohde (Jira)" <ji...@apache.org> on 2019/11/07 21:06:00 UTC

[jira] [Created] (BEAM-8582) Python SDK emits duplicate records for Default and AfterWatermark triggers

Sam Rohde created BEAM-8582:
-------------------------------

             Summary: Python SDK emits duplicate records for Default and AfterWatermark triggers
                 Key: BEAM-8582
                 URL: https://issues.apache.org/jira/browse/BEAM-8582
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
            Reporter: Sam Rohde
            Assignee: Sam Rohde


This was found after fixing https://issues.apache.org/jira/browse/BEAM-8581. The fix for BEAM-8581 was to pass in the input watermark. Previously, MIN_TIMESTAMP was used for all end-of-window (EOW) calculations. Once a proper input watermark was supplied, this bug started to manifest.

The DefaultTrigger and AfterWatermark triggers do not clear their timers after the watermark passes the end of the window, leading to duplicate records being emitted.

Fix: Clear the watermark timer when the watermark reaches the end of the window.
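
The failure mode and the fix above can be illustrated with a small toy model. This is only a sketch: ToyWatermarkTrigger, clear_timer_on_fire, and run are hypothetical names made up for illustration and are not the Python SDK's trigger API. The model assumes that a pending end-of-window timer fires every time the input watermark is at or past it, so a timer that is never cleared fires again on each later watermark advance.

```python
# Toy model only: every name here is hypothetical, not Beam's trigger API.

class ToyWatermarkTrigger:
    def __init__(self, window_end, clear_timer_on_fire):
        self.window_end = window_end
        self.clear_timer_on_fire = clear_timer_on_fire
        self.timers = set()   # pending watermark timers (timestamps)
        self.buffer = []      # elements buffered for the window

    def on_element(self, value):
        self.buffer.append(value)
        # Request a firing at the end of the window.
        self.timers.add(self.window_end)

    def advance_watermark(self, watermark):
        """Return the panes emitted when the watermark moves to `watermark`."""
        panes = []
        # sorted() copies the set, so discarding inside the loop is safe.
        for ts in sorted(self.timers):
            if ts <= watermark:
                panes.append(list(self.buffer))
                if self.clear_timer_on_fire:
                    # The fix: drop the timer once the watermark reaches
                    # the end of the window.
                    self.timers.discard(ts)
        return panes


def run(clear_timer_on_fire):
    trigger = ToyWatermarkTrigger(window_end=10,
                                  clear_timer_on_fire=clear_timer_on_fire)
    trigger.on_element('a')
    trigger.on_element('b')
    emitted = []
    for watermark in (5, 10, 15):
        emitted.extend(trigger.advance_watermark(watermark))
    return emitted


buggy_panes = run(clear_timer_on_fire=False)  # timer left in place
fixed_panes = run(clear_timer_on_fire=True)   # timer cleared on firing
```

In this model, leaving the timer in place emits the same pane at watermark 10 and again at watermark 15, while clearing it emits the pane once.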



--
This message was sent by Atlassian Jira
(v8.3.4#803005)