Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/30 08:33:20 UTC

[GitHub] [airflow] resdevd edited a comment on pull request #8621: Docker Compose CeleryExecutor example

resdevd edited a comment on pull request #8621:
URL: https://github.com/apache/airflow/pull/8621#issuecomment-621684011


   > I think we should decide if the docker-compose file will serve as a "hello world" example, or should it be something more complex, but closer to a production deployment?
   
   IMO the simplest way to run Airflow is still `pip install` plus the airflow CLI commands. A "hello world" example should not require users to install and learn Docker and docker-compose.
   
   > I think the folder name "templates" is incorrect here, no templating is being done? Something like "examples/docker", or even "docker-compose" in the root of the project makes more sense IMO
   
   Sure, like @turbaszek mentioned we could use a name like "deploy", where more docker-compose configs with LocalExecutor, and potentially k8s/Helm configs, can be added in the future. IMO "examples" would be confusing, as it might be taken to refer to example DAGs and operators.
   
   > There are already plenty of example DAGs, let's not add yet another example
   
   It wasn't about adding yet another example DAG; the goal was to show the usage of Docker volumes for DAGs and plugins.
   
   > I think the goal should be a simple example docker-compose file. People will then extract bits and pieces from the example into their deployment. There's a ton of different ways to do deployment, so I would not aim for creating a docker-compose file with the goal of being close to one single deployment method. Instead, aim for something easy to comprehend.
   
   This PR addresses https://github.com/apache/airflow/issues/8605, "Add Production-ready docker compose for the production image".
   IMO CeleryExecutor is probably the most advanced config in terms of docker-compose, and it's easier to change from CeleryExecutor to LocalExecutor than vice versa.
   
   > Okay, why does this change the argument for using the extension fields?
   
   I was referring to the code context from the original review, where extension fields were discussed.
   
   > If we want the docker-compose file to serve as an example, I think it should be as self-explanatory as possible. Storing env vars in a separate file does not help that cause, I think using extension fields and having everything in the same file will be clearer.
   
   As mentioned above, this PR addresses an issue whose goal is a docker-compose setup close to production. Keeping Fernet keys and other sensitive info directly in the docker-compose file is not a best practice either (in terms of .gitignore configuration and migration to ECS/Fargate, Docker Swarm, etc.).
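   To make the trade-off concrete, here is a minimal sketch of the env_file approach; the file names, image tag, and variable names are illustrative, not this PR's actual config:

```yaml
# docker-compose.yaml (fragment) -- no secrets inline
services:
  webserver:
    image: apache/airflow:1.10.10   # illustrative tag
    env_file:
      - .env   # listed in .gitignore; holds e.g.:
               #   AIRFLOW__CORE__FERNET_KEY=<generated key>
               #   AIRFLOW__CORE__SQL_ALCHEMY_CONN=<db connection string>
```

   The compose file stays safe to commit, and the same .env file can be fed to other orchestrators later.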
   
   
   > > Readability and adoption of extension fields is still lagging behind with the majority of docker/docker-compose users.
   >
   > How so?
   
   Glad you asked. Prior to Airflow's official Docker images, the two most popular docker-compose Airflow setups were https://github.com/puckel/docker-airflow and https://github.com/bitnami/bitnami-docker-airflow . I don't think extension fields are used there.
   Not only that, I don't think you'd find them used extensively even in Docker's official https://github.com/docker/awesome-compose repo. You'll notice the same in some of the most popular docker-compose projects, like https://github.com/deviantony/docker-elk and https://github.com/vegasbrianc/prometheus .
   
   For context, extension fields were introduced in Compose file format 3.4, in 2017.
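   For readers following along, a minimal sketch of what extension fields look like; the image tag and settings here are illustrative, not the PR's actual config:

```yaml
version: "3.4"   # extension fields require Compose file format >= 3.4

# Shared settings defined once as an extension field plus a YAML anchor...
x-airflow-common: &airflow-common
  image: apache/airflow:1.10.10   # illustrative tag
  environment:
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor

services:
  scheduler:
    <<: *airflow-common   # ...and merged into each service
    command: scheduler
  worker:
    <<: *airflow-common
    command: worker
```

   The `x-` prefixed top-level key is ignored by Compose itself; the reuse happens entirely through the YAML anchor/merge syntax.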
   
   > You still have unique container names.
   
   Again, I was referring to the code context from the original review, where extension fields were discussed.
   
   > For the initialization I've used this thing in the past, worked quite okay. We could make it a bit more sophisticated, e.g. by checking some condition to be True once a second, instead of sleep 5:
   
   That is a separate issue, https://github.com/apache/airflow/issues/8606 ; if env variables are used for that purpose, we can get rid of the init script entirely. Besides, containers share CPU, memory, network, and I/O, so I'm not sure adding another container for a one-time action is the best way to go.
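   Agreed that polling a condition once a second beats a fixed `sleep 5`, though. If an init step does stay, it could live in the existing entrypoint rather than a dedicated container. A hedged sketch: `wait_for` is a hypothetical helper, and `pg_isready` / `airflow initdb` are just example commands to poll and run:

```shell
#!/bin/sh
# wait_for: retry a command once a second, up to N times, instead of sleeping
# for a fixed interval and hoping the dependency is up.
wait_for() {
  retries=$1; shift
  until "$@"; do
    retries=$((retries - 1))
    if [ "$retries" -le 0 ]; then
      return 1   # gave up: the condition never became true
    fi
    sleep 1
  done
}

# Example usage (hypothetical): block until the DB answers, then init once.
# wait_for 30 pg_isready -h postgres -U airflow && airflow initdb
```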


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org