Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2023/01/02 11:43:14 UTC

[GitHub] [beam] scwhittle commented on issue #24836: [Bug]: Dataflow streaming runner, improve commit stream throughput after network disruption

scwhittle commented on issue #24836:
URL: https://github.com/apache/beam/issues/24836#issuecomment-1368873432

   The connection backoff logic, while it can reach a large maximum, would take many attempts to reach that maximum. It also doesn't align with the available monitoring, which shows that the stream is connected but not performing well. Stack traces indicate that the streams are often waiting for isReady to become true. From https://github.com/grpc/proposal/pull/135 it appears that isReady is based on 32KB of queued output and that the threshold is not configurable. We have more data to send, and putting more data in the queue (within reason) seems like it could improve throughput by ensuring there is always data available rather than blocking until isReady returns true. Since we have a maximum message size of 2MB, I think the windmill DirectStreamObserver could be changed to evaluate isReady only every 10 messages, increasing the effective outgoing buffer size. A sketch of this approach is below.
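
   To illustrate the idea, here is a minimal sketch, not the actual windmill DirectStreamObserver, of a gRPC StreamObserver wrapper that only blocks on isReady every N messages. It assumes a CallStreamObserver plus a Phaser that is advanced from the stream's onReady handler (e.g. registered via setOnReadyHandler in ClientResponseObserver#beforeStart); names such as messagesBetweenIsReadyChecks are hypothetical.

   ```java
   import io.grpc.stub.CallStreamObserver;
   import io.grpc.stub.StreamObserver;
   import java.util.concurrent.Phaser;

   // Sketch only: periodically checks isReady instead of checking it before every message.
   final class PeriodicIsReadyStreamObserver<T> implements StreamObserver<T> {
     private final CallStreamObserver<T> outboundObserver;
     // Phaser with one registered party; the onReady handler calls isReadyNotifier.arrive().
     private final Phaser isReadyNotifier;
     private final int messagesBetweenIsReadyChecks;
     private int messagesSinceLastCheck = 0;

     PeriodicIsReadyStreamObserver(
         CallStreamObserver<T> outboundObserver,
         Phaser isReadyNotifier,
         int messagesBetweenIsReadyChecks) {
       this.outboundObserver = outboundObserver;
       this.isReadyNotifier = isReadyNotifier;
       this.messagesBetweenIsReadyChecks = messagesBetweenIsReadyChecks;
     }

     @Override
     public synchronized void onNext(T value) {
       // Only consult isReady() every messagesBetweenIsReadyChecks messages. In between,
       // messages accumulate in gRPC's outbound queue beyond the ~32KB isReady threshold,
       // so the sender is less likely to stall waiting for the flow-control signal.
       if (++messagesSinceLastCheck >= messagesBetweenIsReadyChecks) {
         messagesSinceLastCheck = 0;
         int phase = isReadyNotifier.getPhase();
         while (!outboundObserver.isReady()) {
           // awaitAdvance returns immediately if the onReady handler already advanced
           // the phaser, so a wakeup between getPhase() and isReady() is not lost.
           isReadyNotifier.awaitAdvance(phase);
           phase = isReadyNotifier.getPhase();
         }
       }
       outboundObserver.onNext(value);
     }

     @Override
     public synchronized void onError(Throwable t) {
       outboundObserver.onError(t);
     }

     @Override
     public synchronized void onCompleted() {
       outboundObserver.onCompleted();
     }
   }
   ```

   With a check interval of 10 and a 2MB maximum message size, roughly up to 20MB could be queued between isReady checks, which is the "effective outgoing buffer size" increase described above.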


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org