You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/19 14:02:09 UTC
[GitHub] [airflow] ashb opened a new pull request #19709: Speed up webserver boot time by delaying provider initialization
ashb opened a new pull request #19709:
URL: https://github.com/apache/airflow/pull/19709
This drops the time to first request from 37s to 20s by making the
following changes:
- Don't pre-load the app when not in daemon mode.
The purpose of the call to `cached_app()` was to ensure that any
errors are reported on the terminal before it is detached to make
failures more obvious to the user (which is a good feature).
However the comment about "pre-warm the cache" was incorrect and did
not happen -- since we run gunicorn by spawning a whole new process
it doesn't share any state from the current python interpreter.
- Don't load/initialize providers when only importing airflow.www.views
As it was written it would load the providers hook's at import time.
This changes it through a combination of cached properties and the
existing `init_connection_form` function.
(`extra_fields` is not set as a cached_property because of how FAB
works -- it iterates over all attributes of the class looking for
methods/routes and then looks at properties on it, meaning it would
still access the property too early)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753389416
##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
"""
old_title = setproctitle.getproctitle()
setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
+ from airflow.providers_manager import ProvidersManager
+
+ # Load providers before forking workers
+ ProvidersManager().connection_form_widgets
Review comment:
This all started because I was fed up at the warning log messages appearing multiple times because my venv was missing about 47 deps 🤣
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753362473
##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
"""
old_title = setproctitle.getproctitle()
setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
Review comment:
This is the server hook - https://docs.gunicorn.org/en/stable/settings.html#on-starting
(Note for future me and readers :D )
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974298598
Test failures are
```
tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_access_log_format: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
which has been fixed by making the tests better in #19712
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] dstandish commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753279342
##########
File path: airflow/cli/commands/webserver_command.py
##########
@@ -446,6 +441,11 @@ def monitor_gunicorn(gunicorn_master_pid: int):
).start()
if args.daemon:
+ # This makes possible errors get reported before demonization
Review comment:
```suggestion
# This makes possible errors get reported before daemonization
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974204827
D'oh thanks. I simplified it and didn't test it again after making that change.
I suspect it's cos I'm calling things once in the parent, and then once again in the fork.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753389416
##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
"""
old_title = setproctitle.getproctitle()
setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
+ from airflow.providers_manager import ProvidersManager
+
+ # Load providers before forking workers
+ ProvidersManager().connection_form_widgets
Review comment:
This all started because I was ignored at the warning log messages because my venv was missing about 47 deps 🤣
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974265700
Tested this last version . Starts visibly faster ! Good job @ashb
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974100194
This might help the failures on Kube tests on main a bit too -- It looks like we had some failures just caused by webserver being slow to start up.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974131895
ah nice - will take a look shortly!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on a change in pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#discussion_r753377207
##########
File path: airflow/www/gunicorn_config.py
##########
@@ -30,3 +30,10 @@ def post_worker_init(_):
"""
old_title = setproctitle.getproctitle()
setproctitle.setproctitle(settings.GUNICORN_WORKER_READY_PREFIX + old_title)
+
+
+def on_starting(server):
+ from airflow.providers_manager import ProvidersManager
+
+ # Load providers before forking workers
+ ProvidersManager().connection_form_widgets
Review comment:
Fantastic idea to load the connections only once :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb merged pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb merged pull request #19709:
URL: https://github.com/apache/airflow/pull/19709
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ashb commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974258078
Fixed in 249eeeb @dstandish -- `add_columns` and `edit_columns` were the same list object 🤦🏻
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #19709: Speed up webserver boot time by delaying provider initialization
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19709:
URL: https://github.com/apache/airflow/pull/19709#issuecomment-974162333
The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org