Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/02 10:05:42 UTC

[GitHub] [airflow] potiuk edited a comment on issue #17191: Healthcheck endpoint for workers

potiuk edited a comment on issue #17191:
URL: https://github.com/apache/airflow/issues/17191#issuecomment-890898149


   > @potiuk Only in our case we do not have an ideal situation that is in line with the K8s philosophy. In our case, one component performs two roles: webserver and task-handling service. For the webserver it is more common to use an HTTP-based probe, but for the second type of service (which does not provide an HTTP endpoint) it is more natural to use an exec probe. Kubernetes also does not allow us to define two liveness probes for one container, so we have to decide which service we want to monitor directly and which we will monitor only as a child process of the main process.
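   Just so we are looking at the same thing - the two probe styles you describe look roughly like this (a minimal sketch using the `kubernetes` Python client; the ports, image and celery command are illustrative and may differ between Airflow versions/deployments):

   ```python
   # Rough sketch only - ports, image and the celery module path are illustrative.
   from kubernetes import client

   # HTTP-based probe: natural for the webserver role.
   http_probe = client.V1Probe(
       http_get=client.V1HTTPGetAction(path="/health", port=8080),
       period_seconds=30,
       failure_threshold=3,
   )

   # Exec-based probe: natural for a process that exposes no HTTP endpoint.
   exec_probe = client.V1Probe(
       _exec=client.V1ExecAction(
           command=[
               "sh",
               "-c",
               "celery --app airflow.executors.celery_executor.app inspect ping -d celery@$HOSTNAME",
           ]
       ),
       period_seconds=30,
       failure_threshold=3,
   )

   # A container spec takes a single liveness_probe - that is the constraint above.
   worker_container = client.V1Container(
       name="worker",
       image="apache/airflow",
       liveness_probe=http_probe,  # or exec_probe, but not both
   )
   ```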
   
   But since our workers already have a "log http-webserver" built in, and there is one such webserver per worker, started as part of running the worker itself - what is the problem with each worker providing a health check over HTTP? I do not see any reason why we could not use it. There are many services that provide a health check over HTTP even if their primary task is something else. Such a health-check endpoint could internally perform more complex checks than just responding with success - it could communicate with the Celery master process and query it for status, etc.
   
   I think this would make the worker a nice, self-contained service with its own dedicated health-check endpoint. The basic premise of our Celery architecture is that we can simply start up as many of those self-contained workers as we need and manage their number completely independently of the other Airflow components - delegating deployment, scaling, etc. to the external deployment. This is very similar to what we do with the scheduler - we can manage the number of schedulers independently, and scaling works by adding "just another scheduler". I think it also fits the K8s philosophy very well, where you define Deployments for the different components of the application and put them together like "lego blocks" - in such a way that the different components do not have to know about each other and act independently.
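   To illustrate the kind of "more complex check" I mention above - something along these lines (again just a rough sketch, not an actual implementation; the port, path and import are illustrative):

   ```python
   # Rough sketch of a worker-local health endpoint that does more than return 200:
   # it pings this worker's Celery node through the Celery control API.
   import socket
   from http.server import BaseHTTPRequestHandler, HTTPServer

   from airflow.executors.celery_executor import app as celery_app  # module path may differ per version


   class HealthHandler(BaseHTTPRequestHandler):
       def do_GET(self):
           if self.path != "/health":
               self.send_response(404)
               self.end_headers()
               return
           node = f"celery@{socket.gethostname()}"
           # ping() returns a non-empty reply only if the worker's main process answers.
           replies = celery_app.control.inspect(destination=[node], timeout=5.0).ping()
           self.send_response(200 if replies else 503)
           self.end_headers()


   if __name__ == "__main__":
       # Illustrative port - in practice this would live next to the existing log webserver.
       HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
   ```

   An httpGet liveness probe like the one in the first sketch could then simply point at that endpoint, and each worker stays a self-contained "lego block".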
   
   Or maybe we are talking about a different thing altogether? Maybe there is something I do not understand? I think we do not need to move the webserver to a different container or anything like that, so maybe we have a misunderstanding here.

