Posted to commits@airflow.apache.org by jh...@apache.org on 2021/08/11 23:07:49 UTC

[airflow] 11/11: Better diagnostics and self-healing of docker-compose (#17484)

This is an automated email from the ASF dual-hosted git repository.

jhtimmins pushed a commit to branch v2-1-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 20ed40b4ba2f754c205a2d193319cd99d8b144f7
Author: Jarek Potiuk <ja...@potiuk.com>
AuthorDate: Mon Aug 9 10:41:57 2021 +0200

    Better diagnostics and self-healing of docker-compose (#17484)
    
    There are several ways people might get the quick-start
    docker-compose setup into a broken state (especially on Linux):

    1) they do not run the initialization steps and run docker-compose up straight away
    2) they do not run the docker-compose init step first

    Also, on MacOS/Windows the default memory/disk settings are not
    enough to run Airflow via docker-compose, and people report
    "Airflow not working" when they simply have not allocated enough
    resources.
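As a rough illustration (not part of this commit), the kind of resource probing the updated compose file performs can be run by hand in any Linux shell; the 4000 MiB threshold below mirrors the one in the diff:

```shell
# Probe available memory, CPUs and disk the same way the init script does.
one_meg=1048576
mem_available=$(( $(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE) / one_meg ))
cpus_available=$(grep -cE 'cpu[0-9]+' /proc/stat)
disk_available=$(df / | tail -1 | awk '{print $4}')   # in KiB
echo "memory: ${mem_available} MiB, cpus: ${cpus_available}, disk: ${disk_available} KiB"
if (( mem_available < 4000 )); then
  echo "WARNING: less than 4GB of memory available for Docker."
fi
```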
    
    Finally, the docker-compose file does not support all versions of
    Airflow, and various problems might occur when it is used with an
    old version of Airflow.
    
    This change adds the following improvements:
    
    * automated check of minimum version of airflow supported
    * mkdir -p in the directories creation in instructions
    * automated checking if AIRFLOW_UID has been set (and printing
      error and instruction link in case it is not)
    * prints warning about too-low memory, cpu, disk allocation
      and instruction link where to read about it
    * automated fixing of ownership of the directories created in
      case they were not created initially and ended up owned by
      root user
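The version guard added in the diff below relies on padding each dotted version component to four digits, so that versions compare as plain numbers. A standalone sketch of the same trick (without the compose file's `$$` escaping; the base-10 prefix in the comparison is an addition here, guarding against the padded zeros being read as octal):

```shell
# "2.1.0" -> 0002 0001 0000 0000; missing components pad to 0000.
ver() {
  printf "%04d%04d%04d%04d" ${1//./ }
}
v_new=$(ver 2.1.0)
v_old=$(ver 1.10.15)
# Force base-10 so the leading zeros are not interpreted as octal.
if (( 10#${v_new} > 10#${v_old} )); then
  echo "2.1.0 is newer than 1.10.15"
fi
```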
    
    (cherry picked from commit 763860c8109fd3326fba92abc52fb8ffb1d39834)
---
 docs/apache-airflow/start/docker-compose.yaml | 57 ++++++++++++++++++++++++++-
 docs/apache-airflow/start/docker.rst          | 23 +++++++++--
 2 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/docs/apache-airflow/start/docker-compose.yaml b/docs/apache-airflow/start/docker-compose.yaml
index 06991e7..b74f5ca 100644
--- a/docs/apache-airflow/start/docker-compose.yaml
+++ b/docs/apache-airflow/start/docker-compose.yaml
@@ -28,7 +28,7 @@
 # AIRFLOW_UID                  - User ID in Airflow containers
 #                                Default: 50000
 # AIRFLOW_GID                  - Group ID in Airflow containers
-#                                Default: 50000
+#                                Default: 0
 #
 # Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
 #
@@ -133,13 +133,66 @@ services:
 
   airflow-init:
     <<: *airflow-common
-    command: version
+    entrypoint: /bin/bash
+    command:
+      - -c
+      - |
+        function ver() {
+          printf "%04d%04d%04d%04d" $${1//./ }
+        }
+        airflow_version=$$(gosu airflow airflow version)
+        airflow_version_comparable=$$(ver $${airflow_version})
+        min_airflow_version=2.1.0
+        min_airflow_version_comparable=$$(ver $${min_airflow_version})
+        if (( airflow_version_comparable < min_airflow_version_comparable )); then
+          echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
+          echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
+          exit 1
+        fi
+        if [[ -z "${AIRFLOW_UID}" ]]; then
+          echo -e "\033[1;31mERROR!!!: AIRFLOW_UID not set!\e[0m"
+          echo "Please follow these instructions to set AIRFLOW_UID and AIRFLOW_GID environment variables:
+            https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#initializing-environment"
+          exit 1
+        fi
+        one_meg=1048576
+        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
+        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
+        disk_available=$$(df / | tail -1 | awk '{print $$4}')
+        warning_resources="false"
+        if (( mem_available < 4000 )) ; then
+          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
+          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
+          warning_resources="true"
+        fi
+        if (( cpus_available < 2 )); then
+          echo -e "\033[1;33mWARNING!!!: Not enough CPUs available for Docker.\e[0m"
+          echo "At least 2 CPUs recommended. You have $${cpus_available}"
+          warning_resources="true"
+        fi
+        if (( disk_available < one_meg * 10 )); then
+          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
+          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
+          warning_resources="true"
+        fi
+        if [[ $${warning_resources} == "true" ]]; then
+          echo
+          echo -e "\033[1;33mWARNING!!!: You do not have enough resources to run Airflow (see above)!\e[0m"
+          echo "Please follow the instructions to increase amount of resources available:"
+          echo "   https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#before-you-begin"
+        fi
+        mkdir -p /sources/logs /sources/dags /sources/plugins
+        chown -R "${AIRFLOW_UID}:${AIRFLOW_GID}" /sources/{logs,dags,plugins}
+        exec /entrypoint airflow version
     environment:
       <<: *airflow-common-env
       _AIRFLOW_DB_UPGRADE: 'true'
       _AIRFLOW_WWW_USER_CREATE: 'true'
       _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
       _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
+    user: "0:${AIRFLOW_GID:-0}"
+    volumes:
+      - .:/sources
 
   flower:
     <<: *airflow-common
diff --git a/docs/apache-airflow/start/docker.rst b/docs/apache-airflow/start/docker.rst
index 78d5c60..d37f33b 100644
--- a/docs/apache-airflow/start/docker.rst
+++ b/docs/apache-airflow/start/docker.rst
@@ -65,9 +65,6 @@ Some directories in the container are mounted, which means that their contents a
 This file uses the latest Airflow image (`apache/airflow <https://hub.docker.com/r/apache/airflow>`__).
 If you need to install a new Python library or system library, you can :doc:`build your image <docker-stack:index>`.
 
-.. _initializing_docker_compose_environment:
-
-
 Using custom images
 ===================
 
@@ -80,6 +77,8 @@ to rebuild the images on-the-fly when you run other ``docker-compose`` commands.
 Examples of how you can extend the image with custom providers, python packages,
 apt packages and more can be found in :doc:`Building the image <docker-stack:build>`.
 
+.. _initializing_docker_compose_environment:
+
 Initializing Environment
 ========================
 
@@ -89,7 +88,7 @@ On **Linux**, the mounted volumes in container use the native Linux filesystem u
 
 .. code-block:: bash
 
-    mkdir ./dags ./logs ./plugins
+    mkdir -p ./dags ./logs ./plugins
     echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
 
 See :ref:`Docker Compose environment variables <docker-compose-env-variables>`
@@ -111,6 +110,22 @@ After initialization is complete, you should see a message like below.
 
 The account created has the login ``airflow`` and the password ``airflow``.
 
+Cleaning-up the environment
+===========================
+
+The docker-compose file we prepare here is a "quick-start" one. It is not intended for production
+use, and it has a number of caveats - one of them being that the best way to recover from any
+problem is to clean it up and restart from scratch.
+
+The best way to do this is to:
+
+* Run ``docker-compose down --volumes --remove-orphans`` in the directory where you downloaded the
+  ``docker-compose.yaml`` file
+* remove the whole directory where you downloaded the ``docker-compose.yaml`` file:
+  ``rm -rf '<DIRECTORY>'``
+* re-download the ``docker-compose.yaml`` file
+* restart following the instructions in this guide from the very beginning
+
 Running Airflow
 ===============
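The clean-up procedure described in the docs change above can be sketched as a script. The directory name `airflow-quickstart` is a hypothetical stand-in for wherever `docker-compose.yaml` was downloaded, and the docker-compose step is skipped when the tool is unavailable:

```shell
# Simulate the quick-start directory in a scratch location.
cd "$(mktemp -d)"
mkdir -p airflow-quickstart && cd airflow-quickstart
touch docker-compose.yaml    # stand-in for the downloaded file
# 1. Stop containers and remove volumes (no-op here without a running setup):
command -v docker-compose >/dev/null && docker-compose down --volumes --remove-orphans || true
# 2. Remove the whole directory; then re-download and start over per the guide:
cd .. && rm -rf airflow-quickstart
```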