Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/04 18:39:34 UTC
[GitHub] [airflow] mtraynham edited a comment on issue #21265: Scheduler crashes after dag processing timeout
mtraynham edited a comment on issue #21265:
URL: https://github.com/apache/airflow/issues/21265#issuecomment-1030250466
We are seeing similar issues after upgrading from 2.1.4 to 2.2.3. On our scheduler, this has been somewhat mitigated by increasing both `AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL` and `AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL` from 5 seconds to 10 minutes.
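For anyone who wants to try the same mitigation, here is a minimal sketch of how those two overrides could be built in Python before launching the scheduler. Both options take a value in seconds, so 10 minutes becomes `600`. The helper names `scheduler_overrides` and `scheduler_env` are made up for illustration and are not part of Airflow.

```python
import os

# 10 minutes, expressed in seconds because both options are configured in seconds.
TEN_MINUTES_IN_SECONDS = 10 * 60


def scheduler_overrides(interval_seconds=TEN_MINUTES_IN_SECONDS):
    """Return the two environment overrides mentioned above (hypothetical helper)."""
    value = str(interval_seconds)
    return {
        "AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL": value,
        "AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL": value,
    }


def scheduler_env(base_env=None):
    """Copy an environment (os.environ by default) and apply the overrides."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(scheduler_overrides())
    return env


if __name__ == "__main__":
    # Print the overrides in KEY=VALUE form, e.g. for pasting into an env file.
    for key, value in scheduler_overrides().items():
        print(f"{key}={value}")
```

The resulting dictionary could be passed as `env=` to `subprocess.run(["airflow", "scheduler"], ...)`, or the printed lines copied into whatever sets the container environment. Longer intervals also mean new or changed DAG files are picked up more slowly.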
However, I am still seeing it in our Celery worker, where it effectively blocks tasks from running for 80+ seconds before it times out.
I suspect our error is related to `Initializing Providers Manager[import_all_hooks]`, which is something new with 2.2.X, and it seems to fail nearly every time with `Exception when importing 'airflow.providers.docker.hooks.docker.DockerHook' from 'apache-airflow-providers-docker' package`. Our log is below.
I have downgraded to 2.1.4 to see if that resolves our issue.
@dcardinha @WattsInABox, does either of you have the option to turn on DEBUG logging, to check whether you see something similar?
```
2022-02-04T08:14:30.745502939Z [2022-02-04 08:14:30,745: INFO/ForkPoolWorker-15] Filling up the DagBag from /home/airflow/.local/lib/python3.8/site-packages/foobar/runner/__init__.py
2022-02-04T08:14:30.746292606Z [2022-02-04 08:14:30,746: DEBUG/ForkPoolWorker-15] Importing /home/airflow/.local/lib/python3.8/site-packages/foobar/runner/__init__.py
2022-02-04T08:14:31.084507410Z [2022-02-04 08:14:31,084: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[hooks]
2022-02-04T08:14:31.084556312Z [2022-02-04 08:14:31,084: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[list]
2022-02-04T08:14:31.095732961Z [2022-02-04 08:14:31,095: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.grpc.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-grpc
2022-02-04T08:14:31.097942644Z [2022-02-04 08:14:31,097: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.docker.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-docker
2022-02-04T08:14:31.100267595Z [2022-02-04 08:14:31,100: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.google.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-google
2022-02-04T08:14:31.104712785Z [2022-02-04 08:14:31,104: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.cncf.kubernetes.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-cncf-kubernetes
2022-02-04T08:14:31.106863937Z [2022-02-04 08:14:31,106: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.http.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-http
2022-02-04T08:14:31.109014206Z [2022-02-04 08:14:31,108: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis
2022-02-04T08:14:31.111201541Z [2022-02-04 08:14:31,110: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.mysql.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-mysql
2022-02-04T08:14:31.113902110Z [2022-02-04 08:14:31,113: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.amazon.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-amazon
2022-02-04T08:14:31.116398514Z [2022-02-04 08:14:31,116: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.elasticsearch.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-elasticsearch
2022-02-04T08:14:31.118252974Z [2022-02-04 08:14:31,118: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.ftp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ftp
2022-02-04T08:14:31.119932342Z [2022-02-04 08:14:31,119: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.sftp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sftp
2022-02-04T08:14:31.121772776Z [2022-02-04 08:14:31,121: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.ssh.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ssh
2022-02-04T08:14:31.123516831Z [2022-02-04 08:14:31,123: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.sendgrid.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sendgrid
2022-02-04T08:14:31.125036267Z [2022-02-04 08:14:31,124: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.hashicorp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-hashicorp
2022-02-04T08:14:31.127102702Z [2022-02-04 08:14:31,126: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.microsoft.azure.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-microsoft-azure
2022-02-04T08:14:31.131129199Z [2022-02-04 08:14:31,130: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.slack.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-slack
2022-02-04T08:14:31.135302807Z [2022-02-04 08:14:31,135: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.celery.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-celery
2022-02-04T08:14:31.136855733Z [2022-02-04 08:14:31,136: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.imap.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-imap
2022-02-04T08:14:31.138380123Z [2022-02-04 08:14:31,138: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.sqlite.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sqlite
2022-02-04T08:14:31.140693159Z [2022-02-04 08:14:31,140: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.odbc.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-odbc
2022-02-04T08:14:31.142356555Z [2022-02-04 08:14:31,142: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.postgres.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-postgres
2022-02-04T08:14:31.146554304Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[list] took 0.06 seconds
2022-02-04T08:14:31.146788306Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[hooks] took 0.06 seconds
2022-02-04T08:14:31.146835613Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[import_all_hooks]
2022-02-04T08:16:06.682682831Z [2022-02-04 08:16:06,681: ERROR/ForkPoolWorker-15] Process timed out, PID: 9579
2022-02-04T08:16:06.683022858Z [2022-02-04 08:16:06,682: WARNING/ForkPoolWorker-15] Exception when importing 'airflow.providers.docker.hooks.docker.DockerHook' from 'apache-airflow-providers-docker' package: DagBag import timeout for /home/airflow/.local/lib/python3.8/site-packages/foobar/runner/__init__.py after 30.0s.
2022-02-04T08:16:06.683046483Z Please take a look at these docs to improve your DAG import time:
2022-02-04T08:16:06.683052855Z * https://airflow.apache.org/docs/apache-airflow/2.2.3/best-practices.html#top-level-python-code
2022-02-04T08:16:06.683058750Z * https://airflow.apache.org/docs/apache-airflow/2.2.3/best-practices.html#reducing-dag-complexity, PID: 9579
2022-02-04T08:16:07.332734255Z [2022-02-04 08:16:07,332: DEBUG/ForkPoolWorker-15] Exception when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'plyvel'
2022-02-04T08:16:07.761798052Z [2022-02-04 08:16:07,761: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[import_all_hooks] took 96.61 seconds
```
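To make the stall easier to spot in a long worker log, the following is a rough sketch that scans lines in the format shown above and reports any pair of consecutive entries whose bracketed timestamps are far apart. The function name `find_gaps`, the regular expression, and the 5-second threshold are assumptions chosen for this example; they are not part of Airflow or Celery.

```python
import re
from datetime import datetime

# Matches the bracketed Celery prefix, e.g.
# "[2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] message"
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): (\w+)/([\w-]+)\] (.*)"
)


def find_gaps(lines, threshold_seconds=5.0):
    """Return (gap_seconds, message_before_gap) for consecutive parsed log
    lines whose timestamps are more than threshold_seconds apart."""
    gaps = []
    previous = None  # (timestamp, message) of the last parsed line
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # continuation lines have no bracketed timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if gap > threshold_seconds:
                gaps.append((gap, previous[1]))
        previous = (stamp, match.group(4))
    return gaps


if __name__ == "__main__":
    # Three lines copied from the log above.
    sample = [
        "2022-02-04T08:14:31.146788306Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[hooks] took 0.06 seconds",
        "2022-02-04T08:14:31.146835613Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[import_all_hooks]",
        "2022-02-04T08:16:06.682682831Z [2022-02-04 08:16:06,681: ERROR/ForkPoolWorker-15] Process timed out, PID: 9579",
    ]
    for gap, message in find_gaps(sample):
        print(f"{gap:.3f}s of silence after: {message}")
```

Run against the three sample lines, this flags a single gap of roughly 95.5 seconds directly after `Initializing Providers Manager[import_all_hooks]`, which accounts for most of the 96.61 seconds reported on the last log line.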