Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/18 21:35:30 UTC

[GitHub] [airflow] shachibista opened a new issue #10388: Windows support using Daemoniker

shachibista opened a new issue #10388:
URL: https://github.com/apache/airflow/issues/10388


   **Description**
   
   Currently, the airflow project uses PEP 3143-style daemons to launch tasks (as implemented in https://pypi.org/project/python-daemon/); however, this is targeted at Unix daemons. As a result, running airflow on Windows requires multiple levels of abstraction, each with its own problems. Would it be possible to use something like daemoniker (https://daemoniker.readthedocs.io/en/latest/) to launch tasks? What are the challenges and issues?
   
   In machine learning workflows with large datasets, it is a huge time-saver if the pipeline tasks can run on the GPU. WSL 1 does not support GPU passthrough, and Docker through WSL 2 supports GPU passthrough only with the Insiders build. WSL 2 also has networking issues when connected to a VPN (https://github.com/microsoft/WSL/issues/5068).
   
   **Use case / motivation**
   
   Running airflow natively on Windows, without WSL 1/2 or Docker. This is helpful in cases where the company ecosystem is Windows-based.
   
   **Possible implementation**
   
   The daemon module is only used to daemonize the scheduler and webserver. Here is sample code that runs the scheduler (airflow origin/v1-10-stable) using daemoniker; comments are welcome:
   
   ```python
   # airflow/bin/cli.py
   from daemoniker import Daemonizer
   
   ...
   
   if args.daemon:
       with Daemonizer() as (is_setup, daemonizer):
           if is_setup:
               pid, stdout, stderr, log_file = setup_locations(
                   "scheduler", args.pid, args.stdout, args.stderr, args.log_file
               )
           
           _is_parent = daemonizer(
               pid,
               stdout_goto=stdout,
               stderr_goto=stderr
           )
   
       job.run()
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] casra-developers commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
casra-developers commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848605540


   In our company we now have a setup where an Ubuntu server hosts Airflow (Web-Server, Dask-Scheduler) and a Windows Server acts as the Dask-Worker. We need the tasks to run on Windows because some of their dependencies cannot easily be ported to other platforms. Since the Dask-Worker also needs to have Airflow installed, we had to clone the repository and add some extensions to deal with all the POSIX-only Python functions that are not available on Windows. We ended up adding a platform check in certain files and "mimicking" POSIX behavior where necessary.
   This approach works really well in the limited manner we need it to work, but it would be great if such a custom solution could be replaced by something more official. We would be willing to share our insights, if the devs are interested in pursuing this.
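
A minimal sketch of the kind of platform check described in this comment, assuming the POSIX-only behavior being mimicked is "start the child in its own session or process group". It is not the code from the fork mentioned above; `IS_WINDOWS`, `detached_popen_kwargs`, and `run_isolated` are hypothetical names chosen for illustration.

```python
import os
import subprocess
import sys

# Hypothetical module-level flag; os.name is "nt" on native Windows.
IS_WINDOWS = os.name == "nt"


def detached_popen_kwargs():
    """Return Popen keyword arguments that isolate the child process."""
    if IS_WINDOWS:
        # CREATE_NEW_PROCESS_GROUP is only defined in the Windows build
        # of the subprocess module, so it is referenced only on this branch.
        return {"creationflags": subprocess.CREATE_NEW_PROCESS_GROUP}
    # POSIX: ask Popen to call setsid() in the child before exec.
    return {"start_new_session": True}


def run_isolated(cmd):
    """Run cmd in its own session/process group; return (returncode, stdout)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, **detached_popen_kwargs())
    out, _ = proc.communicate()
    return proc.returncode, out.decode().strip()


if __name__ == "__main__":
    print(run_isolated([sys.executable, "-c", "print('ok')"]))
```

Keeping the branch inside one small helper means callers do not need their own platform checks, which is one way to keep such "mimicking" localized.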





[GitHub] [airflow] boring-cyborg[bot] commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-675732354


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] shachibista commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
shachibista commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-678676993


   @mik-laj No, I haven't tried installing the development version from source. Is there a simple way to do it within windows?





[GitHub] [airflow] potiuk commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-678686506


   I am afraid not. We know Airflow works in WSL2, but we also know it does not work on Windows. Unless you can convince someone to make it work on Windows, I am afraid it's not going to happen.





[GitHub] [airflow] casra-developers commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
casra-developers commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-849471227


   I've added the [PR](https://github.com/apache/airflow/pull/16110). After playing around a bit, I now know that while this works fine for a Dask-Worker, it does not work if you want to run the Web-Server or a Scheduler on Windows, because of the process handling. There probably needs to be one more layer of abstraction to handle the execution of processes in a platform-agnostic way.





[GitHub] [airflow] casra-developers commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
casra-developers commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848739785


   We have a go. I will create a fork and CC you @potiuk in the PR. There are probably a lot of things we still need to do, since our only goal was to implement enough functionality for Dask to run properly.





[GitHub] [airflow] mik-laj edited a comment on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-678696786


   You can install the application from local sources by cloning the repository and then running the `pip install -e .` command.





[GitHub] [airflow] potiuk commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848798548


   We can do it in stages as well. Happy to introduce some parts and see if this needs/can be replicated elsewhere. 





[GitHub] [airflow] potiuk commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-676508902


   I think it would be great if someone could invest in Windows support. I believe there are a few things to address: not only the daemon model, but also the Local Executor, which uses fork mechanisms that won't be available on Windows. There might also be problems if you want to use the Celery Executor on Windows: https://www.distributedpython.com/2018/08/21/celery-4-windows/ There are a few POSIX-compliant packages used as well, which might not work on Windows. And automated testing might be a problem since we are using Docker. It looks like quite a big effort to invest.
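
To illustrate the fork point above: Python's `multiprocessing` module reports which start methods a platform offers, and on Windows only `spawn` is available. The following is a minimal sketch of a fallback, not Airflow's Local Executor code; `pick_start_method`, `square`, and `run_in_child` are hypothetical names.

```python
import multiprocessing as mp


def pick_start_method():
    """Prefer fork where the platform offers it, otherwise fall back to spawn."""
    available = mp.get_all_start_methods()
    return "fork" if "fork" in available else "spawn"


def square(x):
    """Work function; defined at module level so spawn can import it."""
    return x * x


def run_in_child(value):
    """Run square(value) in a single worker process and return the result."""
    ctx = mp.get_context(pick_start_method())
    with ctx.Pool(processes=1) as pool:
        return pool.apply(square, (value,))


if __name__ == "__main__":
    # The __main__ guard is required under spawn, because the child
    # re-imports this module instead of inheriting the parent's memory.
    print(pick_start_method(), run_in_child(7))
```

Under `spawn`, the child starts from a fresh interpreter, so anything the worker needs must be importable or picklable. Code written with `fork` in mind often assumes the opposite, which is part of why the port is not mechanical.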





[GitHub] [airflow] potiuk commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848867351


   Just split up the changes needed. Maybe start with a small part of a few lines; I could then add the Windows CI tests on top of it. And we could add other PRs afterwards. Generally, the smaller the PR, the better :)





[GitHub] [airflow] potiuk edited a comment on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848686974


   Looking forward to it. Today we've merged official MSSQL support so seems we are getting friendlier for Microsoft :)





[GitHub] [airflow] mik-laj commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-675758417


   Have you encountered other problems with running Airflow on Windows? Windows support is highly anticipated by our users, but no one has dealt with this topic intensively yet.





[GitHub] [airflow] potiuk commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848686974


   Looking forward to it. Today we've merged official MSSQL support so seems we are getting friendlier for Microsoft :)





[GitHub] [airflow] mik-laj commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-676493703


   @shachibista Have you tried installing the development version from source? I think this change should fix this problem.
   https://github.com/apache/airflow/pull/7808/files#r396126977





[GitHub] [airflow] potiuk commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848609916


   Absolutely! I think that might be a great thing to add to Airflow. Maybe you would like to open a PR about this (cc: me) with your changes, and we can discuss how to approach it.





[GitHub] [airflow] mik-laj edited a comment on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-675758417


   Have you encountered other problems with running Airflow on Windows? Windows support is highly anticipated by our users, but no one has dealt with this topic intensively yet. Personally, I use macOS.





[GitHub] [airflow] mik-laj edited a comment on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-675758417


   Have you encountered other problems with running Airflow on Windows? Windows support is highly anticipated by our users, but no one has dealt with this topic intensively yet. Personally, I use macOS, but I support the idea of adding support for Windows.





[GitHub] [airflow] mik-laj commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-678696786


   You can install the application from local sources by cloning the repository and then calling the `pip install -e .` command.





[GitHub] [airflow] shachibista commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
shachibista commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-679904706


   @mik-laj Yes, the development version fixes the issue with the `airflow` command, at least. But, I cannot start the scheduler due to the aforementioned issues. 
   
   @potiuk Are you sure there are no fork-like mechanisms for windows? I would really like to get this working at least using Local/SequentialExecutor.





[GitHub] [airflow] potiuk commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-679913304


   > @mik-laj Yes, the development version fixes the issue with the `airflow` command, at least. But, I cannot start the scheduler due to the aforementioned issues.
   > 
   > @potiuk Are you sure there are no fork-like mechanisms for windows? I would really like to get this working at least using Local/SequentialExecutor.
   
   There are different mechanisms - here is the whole discussion about it: https://docs.python.org/3/library/subprocess.html#popen-constructor - but they work differently, and Airflow relies on some of the properties of Popen and on passing open file handles (for example, for open log files). I think there are also a number of other dependencies and possibly hard-coded UNIX path separators ("/") across the code. Windows is also not POSIX-compliant, and I think there are many places where we rely on tools or binaries which are part of the POSIX standard.
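
As a rough illustration of two of the points in this paragraph (handing an open log file to a child process, and avoiding a hard-coded "/" separator), here is a minimal sketch that uses only the standard library. It is not how Airflow launches tasks; `run_with_log` and the `logs` directory layout are assumptions made for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_log(cmd, base_dir, name):
    """Run cmd with stdout/stderr redirected to <base_dir>/logs/<name>.log."""
    # pathlib joins components with the platform's own separator,
    # so no "/" is hard-coded here.
    log_path = Path(base_dir) / "logs" / f"{name}.log"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with open(log_path, "ab") as log_file:
        # Passing the open file object as stdout lets the child write
        # directly to the log; this works with Popen on POSIX and Windows.
        completed = subprocess.run(
            cmd, stdout=log_file, stderr=subprocess.STDOUT, check=False
        )
    return completed.returncode, log_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        code, path = run_with_log([sys.executable, "-c", "print('hello')"], tmp, "demo")
        print(code, path.read_text().strip())
```

Redirecting via `stdout=` is the portable case. Sharing arbitrary already-open descriptors with a child is where POSIX and Windows diverge, which is the harder part this comment refers to.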
   
   I am not saying it's impossible, I just think it's quite an effort, and unless you make all the tests pass on Windows we can't even start thinking about it. You can start by forking Airflow and trying to make the tests work on Windows. GitHub Actions supports Windows runners, so this should be easy to enable.
   
   We rely heavily on Bash scripts for executing the tests and building Docker images, and all our tests run in an Ubuntu Docker image. If you want to run them on Windows, it has to be done differently and likely without Docker images: simply create a virtualenv and install everything.
   
   Maybe you can find others who have time and would like to take a look at that together with you? Simply start a discussion on our devlist and ask for help. I am afraid that at this stage, for the community, the fact that it works in WSL2 for Windows users is quite enough.
   
   I know there were some changes implemented by @evgenyshulman from DataBand to make Airflow work in a very limited way on Windows, so maybe rather than running a full set of tests on Windows, just getting very simple support for the Local Executor is possible? Still, starting from a GitHub Actions step that installs Airflow on Windows is a good start. We cannot accept code that is not tested, so being able to test it automatically is a prerequisite.
   
   Happy to review any changes if you come up with tests running on Windows :).
   
   
   
   
   
   





[GitHub] [airflow] shachibista commented on issue #10388: Windows support using Daemoniker

Posted by GitBox <gi...@apache.org>.
shachibista commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-676484403


   Yes. Following the installation manual on the homepage, `pip install apache-airflow` installs the `airflow` command, but it is not a Windows executable and Windows does not recognize the `#! ..../python3.exe` shebang.





[GitHub] [airflow] casra-developers commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
casra-developers commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848683487


   Great to hear! We will need a bit of time since we only cloned the repository and have not forked it yet. I will check with my team mates and create a PR as soon as I have the time. Thanks.





[GitHub] [airflow] potiuk commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848799594


   We probably also want to add some tests to our CI that run on Windows. GitHub supports Windows runners as well, so I am happy to work on incrementally adding more tests and running them in our CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] casra-developers commented on issue #10388: Windows support for Airflow

Posted by GitBox <gi...@apache.org>.
casra-developers commented on issue #10388:
URL: https://github.com/apache/airflow/issues/10388#issuecomment-848850037


   Glad to hear it. We work mainly on Azure-DevOps so we are not very familiar with testing and CI tools on GitHub, but happy to learn. I have created the fork and started with implementing the changes. How would you go about step-wise integration?

