Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/08/19 18:47:53 UTC

[GitHub] [incubator-pinot] icefury71 commented on issue #4484: Pinot query timeout due to the broker waiting for a single non-responsive server

icefury71 commented on issue #4484:
URL: https://github.com/apache/incubator-pinot/issues/4484#issuecomment-676597798


   Thanks for the comments, Subbu and Ting. Based on what @mcvsubbu mentioned, I'm thinking of adding a failure detector capability to the Pinot broker so that it can proactively prune bad server replicas. Although this is not a complete solution, as mentioned, it's still very useful for graceful degradation (instead of query failures).
   
   High-level thoughts on the design: have a failure detector interface (implemented with different algorithms) that keeps track of all servers in the External View and reports which Pinot servers are healthy. The broker can then use this to decide where to route each query.
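
   To make the idea concrete, here is a rough sketch of what such an interface and its use during routing might look like. All names here (`FailureDetector`, `SimpleFailureDetector`, `ReplicaPruner`) are hypothetical and not existing Pinot APIs. Falling back to the full replica set when every server looks unhealthy is also just an assumption for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface for illustration; not an existing Pinot API.
interface FailureDetector {
    void markUnhealthy(String serverId);
    void markHealthy(String serverId);
    boolean isHealthy(String serverId);
}

// Simplest possible implementation: a server is healthy until marked otherwise.
class SimpleFailureDetector implements FailureDetector {
    private final Set<String> unhealthy = ConcurrentHashMap.newKeySet();

    @Override
    public void markUnhealthy(String serverId) {
        unhealthy.add(serverId);
    }

    @Override
    public void markHealthy(String serverId) {
        unhealthy.remove(serverId);
    }

    @Override
    public boolean isHealthy(String serverId) {
        return !unhealthy.contains(serverId);
    }
}

// Hypothetical helper the broker could call before picking servers for a query.
final class ReplicaPruner {
    private ReplicaPruner() {
    }

    static Set<String> prune(Set<String> candidates, FailureDetector detector) {
        Set<String> healthy = new LinkedHashSet<>();
        for (String serverId : candidates) {
            if (detector.isHealthy(serverId)) {
                healthy.add(serverId);
            }
        }
        // Assumption: if every replica looks bad, try them all rather than
        // failing the query without contacting any server.
        return healthy.isEmpty() ? candidates : healthy;
    }
}
```

   With replicas `server1` and `server2` and `server2` marked unhealthy, `ReplicaPruner.prune` would return only `server1`.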
   
   There are many different algorithms for failure detection in distributed systems, ranging from periodic pings to piggybacking on server responses to determine health after the fact.
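
   As one example of the piggybacking approach, a detector could count consecutive failed or timed-out responses per server and treat a server as unhealthy once a threshold is reached. This is only a minimal sketch: the class name `ResponseBasedDetector` and the threshold parameter are made up for illustration, and a real algorithm would likely be more involved.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical detector that infers health from query responses after the fact.
class ResponseBasedDetector {
    private final int maxConsecutiveFailures;
    private final Map<String, Integer> consecutiveFailures = new ConcurrentHashMap<>();

    ResponseBasedDetector(int maxConsecutiveFailures) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    // Called when a server times out or returns an error for a query.
    void recordFailure(String serverId) {
        consecutiveFailures.merge(serverId, 1, Integer::sum);
    }

    // Called when a server responds in time; resets its failure streak.
    void recordSuccess(String serverId) {
        consecutiveFailures.remove(serverId);
    }

    // A server is healthy while its failure streak is below the threshold.
    boolean isHealthy(String serverId) {
        return consecutiveFailures.getOrDefault(serverId, 0) < maxConsecutiveFailures;
    }
}
```

   One caveat with a purely response-based detector: once a server is pruned it receives no more queries, so it can never record a success. Some recovery path would be needed, for example the periodic pings mentioned above.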
   
   I'll add a formal design document with more details. Please let me know if there are any concerns up front.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


