Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/24 13:00:42 UTC

[GitHub] [airflow] avinovarov opened a new issue #22504: Random timeouts on creating connections in kubernetes executors

avinovarov opened a new issue #22504:
URL: https://github.com/apache/airflow/issues/22504


   ### Apache Airflow version
   
   2.2.3
   
   ### What happened
   
   **The problem**
   
   - Under load, with hundreds of DAGs running in parallel, Airflow executors RANDOMLY throw errors when creating connections:
   
   ```
   (some connections successfully created)
   ...
   creating: raw/pg_services/folder/connection_name
   [2022-03-23 02:39:43,102] {connection.py:404} ERROR - Unable to retrieve connection from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
   ```
   
   The error occurs on random connections and affects about 25-50% of Airflow workers; the remaining workers create their connections successfully.
   
   **These timeouts happen only when we have dozens of DAGs running in parallel.**
   
   ### What you think should happen instead
   
   We'd expect connections to be created reliably =)
   
   ### How to reproduce
   
   - Deploy Airflow to k8s and add connections to multiple Postgres databases (we have 75)
   - Run dozens of DAGs in parallel.
   
   ### Operating System
   
   k8s via rancher, on CentOS 7
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-postgres==2.4.0
   
   ### Deployment
   
   Other 3rd-party Helm chart
   
   ### Deployment details
   
   **Our setup**
   
   - Airflow on Kubernetes, with KubernetesExecutor, installed with the [user community Helm chart](https://github.com/airflow-helm/charts/blob/main/charts/airflow/values.yaml)
   - 75 connections to various sources, mainly Postgres databases, specified in helm chart values, like this:
   ```
   # this is how we add connections with credentials in helm chart values
     connections: 
       - id: pg_connection
         type: postgres
         host: database.domain.com
         login: $PG_LOGIN
         password: $PG_PASSWORD
         port: 5432
         schema: database
   
   # and specify credentials with secrets below
     connectionsTemplates: 
       PG_LOGIN:
         kind: secret
         name: airflow-secrets
         key: PG_LOGIN
   ```
   
   Of course we have k8s secrets deployed in our `airflow` namespace, and as long as we run individual DAGs we observe no errors.
   
   ### Anything else
   
   **As long as we run individual DAGs we observe no errors.**
   
   Based on the timeout error, we assume the issue lies in obtaining credentials (the lookup apparently falls back to a secondary credentials provider), not in connecting to the Postgres databases themselves, but this is just our guess. We also don't observe any overload on our Postgres databases.
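
To make the fallback behaviour behind the log line easier to reason about, here is a minimal sketch of a "try each secrets backend in order" lookup. It is a simplified model, not Airflow's actual code: `FlakyMetastoreBackend`, `EmptyBackend` and `get_connection_from_backends` are hypothetical names invented for illustration, and the returned connection string is a placeholder. The assumption is that a backend that raises (for example on a timeout) is logged and skipped, so the lookup fails only when no backend returns a connection. As far as we understand, `MetastoreBackend` is the backend that reads connections from Airflow's metadata database, so the failing call may be the metadata-database query rather than the Kubernetes secret itself; that reading is a guess as well.

```python
import logging
from typing import Optional, Sequence

log = logging.getLogger("connection_lookup_sketch")


class EmptyBackend:
    """Hypothetical backend that does not know the requested connection."""

    def get_connection(self, conn_id: str) -> Optional[str]:
        return None


class FlakyMetastoreBackend:
    """Hypothetical stand-in for a backend that queries a database."""

    def __init__(self, fail: bool) -> None:
        self.fail = fail

    def get_connection(self, conn_id: str) -> Optional[str]:
        if self.fail:
            # Simulates a query that times out under load.
            raise TimeoutError("database did not respond in time")
        # Placeholder value; a real backend would return a connection object.
        return f"postgres://database.domain.com:5432/database#{conn_id}"


def get_connection_from_backends(conn_id: str, backends: Sequence) -> str:
    """Return the first connection found, skipping backends that raise."""
    for backend in backends:
        try:
            conn = backend.get_connection(conn_id)
            if conn:
                return conn
        except Exception:
            # Mirrors the wording of the log line in the report above.
            log.exception(
                "Unable to retrieve connection from secrets backend (%s). "
                "Checking subsequent secrets backend.",
                type(backend).__name__,
            )
    # Every backend either returned nothing or raised.
    raise LookupError(f"The conn_id `{conn_id}` isn't defined")
```

In this model, a failing last backend turns into a "connection not defined" error even though the connection exists, which would match a failure that appears only under parallel load.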
   
   Googling the error didn't help much, so we'd be grateful for any advice.
   Thanks!
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #22504: Random timeouts on creating connections in kubernetes executors

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #22504:
URL: https://github.com/apache/airflow/issues/22504#issuecomment-1077602243


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

