You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/19 14:02:09 UTC

[GitHub] [airflow] ashb opened a new pull request #19709: Speed up webserver boot time by delaying provider initialization

ashb opened a new pull request #19709:
URL: https://github.com/apache/airflow/pull/19709


   This drops the time to first request from 37s to 20s by making the
   following changes:
   
   - Don't pre-load the app when not in daemon mode.
   
     The purpose of the call to `cached_app()` was to ensure that any
     errors are reported on the terminal before it is detached to make
     failures more obvious to the user (which is a good feature).
   
     However the comment about "pre-warm the cache" was incorrect and did
     not happen -- since we run gunicorn by spawning a whole new process
     it doesn't share any state from the current python interpreter.
   
   - Don't load/initialize providers when only importing airflow.www.views
   
     As it was written it would load the providers hook's at import time.
   
     This changes it through a combination of cached properties and the
     existing `init_connection_form` function.
   
     (`extra_fields` is not set as a cached_property because of how FAB
     works -- it iterates over all attributes of the class looking for
     methods/routes and then looks at properties on it, meaning it would
     still access the property too early)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753389416



##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
     """
     old_title = setproctitle.getproctitle()
     setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
+    from airflow.providers_manager import ProvidersManager
+
+    # Load providers before forking workers
+    ProvidersManager().connection_form_widgets

Review comment:
       This all started because I was fed up at the warning log messages appearing multiple times because my venv was missing about 47 deps 🤣 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753362473



##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
     """
     old_title = setproctitle.getproctitle()
     setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):

Review comment:
       This is the server hook - https://docs.gunicorn.org/en/stable/settings.html#on-starting
   
   (Note for future me and readers :D )




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974298598


   Test failures are 
   
   ```
   tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_access_log_format: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
   ```
   
   which has been fixed by making the tests better in #19712 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753279342



##########
File path: airflow/cli/commands/webserver_command.py
##########
@@ -446,6 +441,11 @@ def monitor_gunicorn(gunicorn_master_pid: int):
             ).start()
 
         if args.daemon:
+            # This makes possible errors get reported before demonization

Review comment:
       ```suggestion
               # This makes possible errors get reported before daemonization
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974204827


   D'oh thanks. I simplified it and didn't test it again after making that change.
   
   I suspect it's cos I'm calling things once in the parent, and then once again in the fork.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753389416



##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
     """
     old_title = setproctitle.getproctitle()
     setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
+    from airflow.providers_manager import ProvidersManager
+
+    # Load providers before forking workers
+    ProvidersManager().connection_form_widgets

Review comment:
       This all started because I was ignored at the warning log messages because my venv was missing about 47 deps 🤣 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974265700


   Tested this last version . Starts visibly faster ! Good job @ashb


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974100194


   This might help the failures on Kube tests on main a bit too -- It looks like we had some failures just caused by webserver being slow to start up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974131895


   ah nice - will take a look shortly!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753377207



##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
     """
     old_title = setproctitle.getproctitle()
     setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
+    from airflow.providers_manager import ProvidersManager
+
+    # Load providers before forking workers
+    ProvidersManager().connection_form_widgets

Review comment:
       Fantastic idea to load the connections only once :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb merged pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb merged pull request #19709:
URL: https://github.com/apache/airflow/pull/19709


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974258078


   Fixed in 249eeeb @dstandish -- `add_columns` and `edit_columns` were the same list object 🤦🏻 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19709: Speed up webserver boot time by delaying provider initialization

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974162333


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org