Posted to commits@airflow.apache.org by po...@apache.org on 2021/09/23 12:52:46 UTC

[airflow] 03/04: Add PGBouncer recommendation in "setup-database' doc. (#18399)

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v2-1-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit c94f69519b409df373d9103ef1b306c2c8dc607f
Author: Jarek Potiuk <ja...@potiuk.com>
AuthorDate: Tue Sep 21 09:35:21 2021 +0200

    Add PGBouncer recommendation in "setup-database' doc. (#18399)
    
    We have been recommending PGBouncer for all Postgres installations
    for quite some time, at least verbally, and also in the Helm Chart
    documentation. However, this recommendation was missing from the
    general Postgres area of the "Setting Up the database" doc.
    
    This PR adds a note that we can refer to when explaining
    connection and stability problems to users who run
    Postgres without a PGBouncer proxy (which is known to help
    in such cases).
    
    (cherry picked from commit e8667b6ac1d5409acddf1ba2a382ed30ee12f53d)
---
 docs/apache-airflow/howto/set-up-database.rst | 13 +++++++++++++
 docs/helm-chart/production-guide.rst          |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/docs/apache-airflow/howto/set-up-database.rst b/docs/apache-airflow/howto/set-up-database.rst
index 6ed2982..216eeb3 100644
--- a/docs/apache-airflow/howto/set-up-database.rst
+++ b/docs/apache-airflow/howto/set-up-database.rst
@@ -218,6 +218,19 @@ want to set a default schema for your role with a SQL statement similar to ``ALT
 
 For more information regarding setup of the PostgresSQL connection, see `PostgreSQL dialect <https://docs.sqlalchemy.org/en/13/dialects/postgresql.html>`__ in SQLAlchemy documentation.
 
+.. note::
+
+   Airflow is known, especially in high-performance setups, to open many connections to the metadata database. This can cause problems
+   with Postgres resource usage, because Postgres creates a new process for each connection, which makes it resource-hungry when many
+   connections are open. We therefore recommend using `PGBouncer <https://www.pgbouncer.org/>`_ as a database proxy for all Postgres
+   production installations. PGBouncer pools connections from multiple components, and if you have a remote database with
+   potentially unstable connectivity, it also makes your DB connectivity much more resilient to temporary network problems.
+   An example PGBouncer deployment can be found in the :doc:`helm-chart:index`, where you can enable a pre-configured
+   PGBouncer instance by flipping a boolean flag. You can take a look at the approach taken there and use it as
+   inspiration when you prepare your own deployment, even if you do not use the official Helm Chart.
+
+   See also :ref:`Helm Chart production guide <production-guide:pgbouncer>`
+
 .. spelling::
 
      hba
diff --git a/docs/helm-chart/production-guide.rst b/docs/helm-chart/production-guide.rst
index bd61808..0c2a2b5 100644
--- a/docs/helm-chart/production-guide.rst
+++ b/docs/helm-chart/production-guide.rst
@@ -43,6 +43,8 @@ found on the :doc:`Set up a Database Backend <apache-airflow:howto/set-up-databa
       port: ...
       db: ...
 
+.. _production-guide:pgbouncer:
+
 PgBouncer
 ---------