Posted to user@beam.apache.org by "hsy541@gmail.com" <hs...@gmail.com> on 2023/12/22 01:41:16 UTC

pubsubliteio ack problem

In my application, pubsubliteio never seems to ack messages, and the
data lateness keeps building up forever. My question is: how does Dataflow
know when to ack a message? How does the engine even know when a message
has been processed?

Re: pubsubliteio ack problem

Posted by Nirav Patel <ni...@gmail.com>.
What does your pipeline look like?

1. dataflow -> 2. pubsub lite publish -> 3. pubsub lite subscribe -> 4.
[spark or dataflow or whatever]

If it looks like that, then step 2 is where we think publish requests are
either being throttled or taking too long for the pslite server to process
or ack, as you mentioned. High latency is also what we struggled with.
Throughput should scale with more capacity units and partition counts, but
it doesn't in practice. You can check the throughput utilization metrics,
though. If utilization is above 100, that will also create back pressure.
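
As a rough sketch of what "above 100" means here, assuming the utilization
metric is a percentage of the configured publish capacity: the function
names and the example numbers below are made up for illustration and are not
part of any Pub/Sub Lite or Beam API.

```python
MIB = 1024 * 1024  # bytes in one MiB


def utilization_percent(observed_bytes_per_sec: float,
                        capacity_mib_per_sec: float) -> float:
    """Return observed throughput as a percentage of configured capacity.

    Both arguments are assumptions for this sketch: the observed rate would
    come from your monitoring, the capacity from your topic configuration.
    """
    if capacity_mib_per_sec <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * observed_bytes_per_sec / (capacity_mib_per_sec * MIB)


def is_over_capacity(observed_bytes_per_sec: float,
                     capacity_mib_per_sec: float) -> bool:
    """True when publishers push more than the configured capacity."""
    return utilization_percent(observed_bytes_per_sec,
                               capacity_mib_per_sec) > 100.0


# Hypothetical numbers: 6 MiB/s offered against 4 MiB/s of capacity.
print(utilization_percent(6 * MIB, 4))  # 150.0
print(is_over_capacity(6 * MIB, 4))     # True
```

Anything over 100 in this sketch means publishers are offering more than the
configured capacity, which is the situation where I would expect back
pressure to build.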

With the Java pslite IO you won't see much back pressure in Dataflow,
because that client just gives up after 1 minute and throws errors saying
that thousands of messages were not published.




