Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/04 18:39:34 UTC

[GitHub] [airflow] mtraynham edited a comment on issue #21265: Scheduler crashes after dag processing timeout

mtraynham edited a comment on issue #21265:
URL: https://github.com/apache/airflow/issues/21265#issuecomment-1030250466


   We are seeing similar issues after upgrading from 2.1.4 to 2.2.3. On our scheduler, we somewhat mitigated this by increasing both `AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL` and `AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL` from 5 seconds to 10 minutes.
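
For reference, a minimal sketch of that override, assuming both settings take values in seconds and follow Airflow's `AIRFLOW__{SECTION}__{KEY}` environment-variable convention. The helper name `airflow_env_name` is hypothetical and only used for illustration.

```python
import os


def airflow_env_name(section: str, key: str) -> str:
    """Build an AIRFLOW__{SECTION}__{KEY} variable name (hypothetical helper)."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# 10 minutes, written in seconds because both settings are assumed to use seconds.
TEN_MINUTES = 10 * 60

overrides = {
    airflow_env_name("scheduler", "dag_dir_list_interval"): str(TEN_MINUTES),
    airflow_env_name("scheduler", "min_file_process_interval"): str(TEN_MINUTES),
}

# This only changes the current process and its children; in a real
# deployment the variables belong in the scheduler's own environment.
os.environ.update(overrides)
```

Setting the variables this way only helps if the scheduler is launched from the same process; in a container deployment they would more likely go into the service's environment definition.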
   
   However, I am still seeing it on our Celery worker, where it blocks tasks from running for 80+ seconds before it times out.
   
   I suspect our error is related to `Initializing Providers Manager[import_all_hooks]`, which is new in 2.2.X, and it seems to almost always fail with `Exception when importing 'airflow.providers.docker.hooks.docker.DockerHook' from 'apache-airflow-providers-docker' package`. Our log is below.
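
One way to check whether the hook import itself is slow, independent of the worker, is to time it in a fresh interpreter. The following is a rough sketch rather than anything taken from Airflow; `time_import` is a hypothetical helper, and the module path is the one named in the warning.

```python
import importlib
import time


def time_import(module_path: str) -> float:
    """Return the wall-clock seconds spent importing module_path.

    Hypothetical helper. A module that is already in sys.modules returns
    almost immediately, so run this in a fresh interpreter for a cold timing.
    """
    start = time.perf_counter()
    importlib.import_module(module_path)
    return time.perf_counter() - start
```

On a worker host with `apache-airflow-providers-docker` installed, calling `time_import("airflow.providers.docker.hooks.docker")` would show whether that single import accounts for most of the delay.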
   
   I have downgraded to 2.1.4 to see if that resolves our issue.
   
   @dcardinha @WattsInABox, does either of you have the option to turn on DEBUG logging, to see whether you get something similar?
   
   ```
   2022-02-04T08:14:30.745502939Z [2022-02-04 08:14:30,745: INFO/ForkPoolWorker-15] Filling up the DagBag from /home/airflow/.local/lib/python3.8/site-packages/foobar/runner/__init__.py
   2022-02-04T08:14:30.746292606Z [2022-02-04 08:14:30,746: DEBUG/ForkPoolWorker-15] Importing /home/airflow/.local/lib/python3.8/site-packages/foobar/runner/__init__.py
   2022-02-04T08:14:31.084507410Z [2022-02-04 08:14:31,084: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[hooks]
   2022-02-04T08:14:31.084556312Z [2022-02-04 08:14:31,084: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[list]
   2022-02-04T08:14:31.095732961Z [2022-02-04 08:14:31,095: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.grpc.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-grpc
   2022-02-04T08:14:31.097942644Z [2022-02-04 08:14:31,097: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.docker.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-docker
   2022-02-04T08:14:31.100267595Z [2022-02-04 08:14:31,100: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.google.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-google
   2022-02-04T08:14:31.104712785Z [2022-02-04 08:14:31,104: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.cncf.kubernetes.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-cncf-kubernetes
   2022-02-04T08:14:31.106863937Z [2022-02-04 08:14:31,106: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.http.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-http
   2022-02-04T08:14:31.109014206Z [2022-02-04 08:14:31,108: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis
   2022-02-04T08:14:31.111201541Z [2022-02-04 08:14:31,110: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.mysql.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-mysql
   2022-02-04T08:14:31.113902110Z [2022-02-04 08:14:31,113: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.amazon.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-amazon
   2022-02-04T08:14:31.116398514Z [2022-02-04 08:14:31,116: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.elasticsearch.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-elasticsearch
   2022-02-04T08:14:31.118252974Z [2022-02-04 08:14:31,118: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.ftp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ftp
   2022-02-04T08:14:31.119932342Z [2022-02-04 08:14:31,119: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.sftp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sftp
   2022-02-04T08:14:31.121772776Z [2022-02-04 08:14:31,121: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.ssh.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ssh
   2022-02-04T08:14:31.123516831Z [2022-02-04 08:14:31,123: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.sendgrid.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sendgrid
   2022-02-04T08:14:31.125036267Z [2022-02-04 08:14:31,124: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.hashicorp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-hashicorp
   2022-02-04T08:14:31.127102702Z [2022-02-04 08:14:31,126: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.microsoft.azure.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-microsoft-azure
   2022-02-04T08:14:31.131129199Z [2022-02-04 08:14:31,130: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.slack.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-slack
   2022-02-04T08:14:31.135302807Z [2022-02-04 08:14:31,135: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.celery.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-celery
   2022-02-04T08:14:31.136855733Z [2022-02-04 08:14:31,136: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.imap.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-imap
   2022-02-04T08:14:31.138380123Z [2022-02-04 08:14:31,138: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.sqlite.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sqlite
   2022-02-04T08:14:31.140693159Z [2022-02-04 08:14:31,140: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.odbc.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-odbc
   2022-02-04T08:14:31.142356555Z [2022-02-04 08:14:31,142: DEBUG/ForkPoolWorker-15] Loading EntryPoint(name='provider_info', value='airflow.providers.postgres.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-postgres
   2022-02-04T08:14:31.146554304Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[list] took 0.06 seconds
   2022-02-04T08:14:31.146788306Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[hooks] took 0.06 seconds
   2022-02-04T08:14:31.146835613Z [2022-02-04 08:14:31,146: DEBUG/ForkPoolWorker-15] Initializing Providers Manager[import_all_hooks]
   2022-02-04T08:16:06.682682831Z [2022-02-04 08:16:06,681: ERROR/ForkPoolWorker-15] Process timed out, PID: 9579
   2022-02-04T08:16:06.683022858Z [2022-02-04 08:16:06,682: WARNING/ForkPoolWorker-15] Exception when importing 'airflow.providers.docker.hooks.docker.DockerHook' from 'apache-airflow-providers-docker' package: DagBag import timeout for /home/airflow/.local/lib/python3.8/site-packages/foobar/runner/__init__.py after 30.0s.
   2022-02-04T08:16:06.683046483Z Please take a look at these docs to improve your DAG import time:
   2022-02-04T08:16:06.683052855Z * https://airflow.apache.org/docs/apache-airflow/2.2.3/best-practices.html#top-level-python-code
   2022-02-04T08:16:06.683058750Z * https://airflow.apache.org/docs/apache-airflow/2.2.3/best-practices.html#reducing-dag-complexity, PID: 9579
   2022-02-04T08:16:07.332734255Z [2022-02-04 08:16:07,332: DEBUG/ForkPoolWorker-15] Exception when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'plyvel'
   2022-02-04T08:16:07.761798052Z [2022-02-04 08:16:07,761: DEBUG/ForkPoolWorker-15] Initialization of Providers Manager[import_all_hooks] took 96.61 seconds
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org