Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/29 22:07:07 UTC

[GitHub] [airflow] potiuk commented on issue #13364: Add virtual env creation in Celery workers

potiuk commented on issue #13364:
URL: https://github.com/apache/airflow/issues/13364#issuecomment-752257146


   I think this is a somewhat bigger task to tackle, but an interesting candidate to discuss. I remember a discussion on the devlist where we went over different options for doing it (preparing wheel packages for tasks was one of them, for example).
   
   However, this idea is much simpler and easier to implement, and I think it might serve the purpose you described pretty well. It does not require a lot of changes. It would be rather similar to what cgroup_task_runner already does:
   https://github.com/apache/airflow/blob/master/airflow/task/task_runner/cgroup_task_runner.py
   
   Conceptually it could be a "virtualenv_task_runner" that launches the task process in another virtualenv (creating the venv first if it does not already exist), very similar to what cgroup_task_runner does when creating a new cgroup process.
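   
   A minimal sketch of the "create the venv if missing, then launch the process in it" step, using only the Python standard library. This is not the Airflow task runner interface; the helper names (venv_python, ensure_venv, run_in_venv) are hypothetical and chosen only for illustration.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def ensure_venv(venv_dir: Path) -> Path:
    """Create the venv only if it does not exist yet; return its interpreter."""
    python = venv_python(venv_dir)
    if not python.exists():
        # Assumption: with_pip=False keeps this sketch fast. A real runner
        # would need pip (or another installer) to add extra requirements.
        builder = venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt"))
        builder.create(venv_dir)
    return python


def run_in_venv(venv_dir: Path, args: list) -> subprocess.CompletedProcess:
    """Run `python <args>` using the venv's interpreter instead of the current one."""
    python = ensure_venv(venv_dir)
    return subprocess.run(
        [str(python), *args], capture_output=True, text=True, check=True
    )
```

   A hypothetical runner would call run_in_venv with the arguments of the task command it would otherwise start in the worker's own environment.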
   
   I think there is possibly non-trivial work, though, in the dependency management of Airflow itself when combined with the other requirements you want to add. What you are really looking at is not only creating a virtualenv and adding your packages there; you also have to make sure that Airflow is installed in this virtualenv, including some (possibly all?) of the packages that come with Airflow. This is the only way to make sure that all operators will work within such an environment.
   
   But maybe, if we could assume that we copy "all" standard Airflow dependencies and that only the "extra" ones might differ, that might work rather well and could be implemented easily. I think python venv will install symlinks to the packages by default when they are already installed in the "main" environment, so this might even be pretty efficient in terms of space used.
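   
   One way to get the "reuse what is already installed, add only the extras" behaviour with the standard library is the system_site_packages option of venv. As far as I can tell, this shares the parent environment's packages by putting its site-packages on the import path (recorded in pyvenv.cfg) rather than by symlinking individual packages; the symlinks option concerns the interpreter binary. The sketch below is an assumption about how this could be wired up, not Airflow code; create_overlay_venv and extra_install_command are hypothetical names, and the install command is only built, not executed.

```python
import os
import venv
from pathlib import Path


def create_overlay_venv(venv_dir: Path) -> Path:
    """Create a venv that can also import packages from the parent environment.

    Returns the path of the generated pyvenv.cfg file.
    """
    builder = venv.EnvBuilder(
        system_site_packages=True,  # parent packages stay visible, nothing is copied
        with_pip=False,             # assumption: pip is provided by the parent environment
        symlinks=(os.name != "nt"),
    )
    builder.create(venv_dir)
    return venv_dir / "pyvenv.cfg"


def extra_install_command(venv_dir: Path, extras: list) -> list:
    """Build (but do not run) a pip command that installs only the extra packages."""
    if os.name == "nt":
        python = venv_dir / "Scripts" / "python.exe"
    else:
        python = venv_dir / "bin" / "python"
    return [str(python), "-m", "pip", "install", *extras]
```

   With this layout only the extra requirements take up additional space in the venv; whether pip resolves them consistently against the shared Airflow dependencies is exactly the open question raised above.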
   
   I wonder what others think about it @kaxil @ashb @mik-laj @turbaszek. I know you might know of some potential problems with that approach, but at first look it seems rather reasonable and rather easy to implement?
   

