You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by Vilhelm von Ehrenheim <vo...@gmail.com> on 2018/02/20 18:13:37 UTC
Missing watermark?
Hi all!
I have a somewhat complicated stateful DoFn that i would like to add an
event time timer on. My goal with the timer is to not output anything until
sufficient amount of state has been built up in a Global window.
In doing this I realize that the watermark doesn’t seem to progress at all
(regardless of the timer) and in Google Dataflow the displayed watermark is
just “-“ when clicking on the ParDo(DoFn) node.
The DoFn is reading flattened input from a TextIO.watchForNewFiles and
KafkaIO. The Flatten element had a watermark set.
I have written tests for my DoFn that all pass using TestStream but since I
there explicitly set the watermark progression all is fine.
What can I do to look into why there is no watermark progression for a
specific PTransform?
Regards,
Vilhelm von Ehrenheim
Re: Missing watermark?
Posted by Vilhelm von Ehrenheim <vo...@gmail.com>.
Ok, that might explain it. I have a topic with very few messages so it
might have empty partitions actually. Thanks!
I’m really curious about the TextIO watermark as well though. I cant find
the implementation either so if anyone know where to look I’d love to get a
pointer.
// Vilhelm
On Tue, 20 Feb 2018 at 19:31, Raghu Angadi <ra...@google.com> wrote:
> Both the sources have to provide watermark.
>
> - KafkaIO : It can have its watermark stuck if you don't have any records
> in one of the partitions. Thats is a bug. The management of timestamps and
> watermarks in KafkaIO are being updated in
> https://github.com/apache/beam/pull/4680.
> - TextIO.watchForNewFiles() - I am not sure how the watermark is handled
> by TextIO. Didn't notice any mentions of in implementation.
>
> On Tue, Feb 20, 2018 at 10:13 AM, Vilhelm von Ehrenheim <
> vonehrenheim@gmail.com> wrote:
>
>> Hi all!
>> I have a somewhat complicated stateful DoFn that i would like to add an
>> event time timer on. My goal with the timer is to not output anything until
>> sufficient amount of state has been built up in a Global window.
>>
>> In doing this I realize that the watermark doesn’t seem to progress at
>> all (regardless of the timer) and in Google Dataflow the displayed
>> watermark is just “-“ when clicking on the ParDo(DoFn) node.
>>
>> The DoFn is reading flattened input from a TextIO.watchForNewFiles and
>> KafkaIO. The Flatten element had a watermark set.
>>
>> I have written tests for my DoFn that all pass using TestStream but since
>> I there explicitly set the watermark progression all is fine.
>>
>> What can I do to look into why there is no watermark progression for a
>> specific PTransform?
>>
>> Regards,
>> Vilhelm von Ehrenheim
>>
>
>
Re: Missing watermark?
Posted by Raghu Angadi <ra...@google.com>.
Both the sources have to provide watermark.
- KafkaIO : It can have its watermark stuck if you don't have any records
in one of the partitions. Thats is a bug. The management of timestamps and
watermarks in KafkaIO are being updated in
https://github.com/apache/beam/pull/4680.
- TextIO.watchForNewFiles() - I am not sure how the watermark is handled by
TextIO. Didn't notice any mentions of in implementation.
On Tue, Feb 20, 2018 at 10:13 AM, Vilhelm von Ehrenheim <
vonehrenheim@gmail.com> wrote:
> Hi all!
> I have a somewhat complicated stateful DoFn that i would like to add an
> event time timer on. My goal with the timer is to not output anything until
> sufficient amount of state has been built up in a Global window.
>
> In doing this I realize that the watermark doesn’t seem to progress at all
> (regardless of the timer) and in Google Dataflow the displayed watermark is
> just “-“ when clicking on the ParDo(DoFn) node.
>
> The DoFn is reading flattened input from a TextIO.watchForNewFiles and
> KafkaIO. The Flatten element had a watermark set.
>
> I have written tests for my DoFn that all pass using TestStream but since
> I there explicitly set the watermark progression all is fine.
>
> What can I do to look into why there is no watermark progression for a
> specific PTransform?
>
> Regards,
> Vilhelm von Ehrenheim
>