Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/03/13 23:40:23 UTC

[GitHub] [airflow] mik-laj opened a new pull request #14765: Docker image docs

mik-laj opened a new pull request #14765:
URL: https://github.com/apache/airflow/pull/14765


   Depends on: https://github.com/apache/airflow/pull/14762
   Close:  https://github.com/apache/airflow/issues/14644#issuecomment-794613029
   
   I am extracting the documentation for the Docker image into a new documentation package, to clearly separate the documentation for the apache-airflow pip package and the Docker image. Even though they are related, they are not the same. The Docker image is built on top of the apache-airflow package, and likewise the Helm Chart is built on top of the Docker image.
   
   This package has no versioning as its content is applicable to both Airflow 1.10 and Airflow 2.0. 
   
   For now, I have tried to limit changes to the content itself, to make it easy to verify that everything was successfully migrated to the new package.
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   





[GitHub] [airflow] mik-laj commented on a change in pull request #14765: Docker image docs

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r593816024



##########
File path: docs/docker-stack/entrypoint.rst
##########
@@ -0,0 +1,201 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Entrypoint
+==========
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+The image entrypoint works as follows:

Review comment:
       This section seems to have a lot of repetition with the following sections, but I will address that in the next PR.







[GitHub] [airflow] kaxil commented on a change in pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r595474630



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -118,852 +118,7 @@ To mitigate these issues, make sure you have a :doc:`health check </logging-moni
 Production Container Images
 ===========================
 
-Production-ready reference Image
---------------------------------
-
-For the ease of deployment in production, the community releases a production-ready reference container
-image.
-
-The Docker image provided (as a convenience binary package) in the
-`Apache Airflow DockerHub <https://hub.docker.com/r/apache/airflow>`_ is a bare image
-that has only a few external dependencies and extras installed.
-
-The Apache Airflow image provided as a convenience package is optimized for size, so
-it includes just a minimal set of extras and dependencies, and in most cases
-you will want to either extend or customize the image. You can see all possible extras in
-:doc:`extra-packages-ref`. The set of extras used in the Airflow production image is available in the
-`Dockerfile <https://github.com/apache/airflow/blob/2c6c7fdb2308de98e142618836bdf414df9768c8/Dockerfile#L39>`_.
-
-The production images are built on DockerHub from released versions and release candidates. There
-are also images published from branches, but they are used mainly for development and testing purposes.
-See `Airflow Git Branching <https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#airflow-git-branches>`_
-for details.
-
-
-Customizing or extending the Production Image
----------------------------------------------
-
-Before you dive deeply into how the Airflow image is built and named, and why we are doing it the
-way we do, you might want to know very quickly how you can extend or customize the existing image
-for Apache Airflow. This chapter gives you a short answer to those questions.
-
-Airflow Summit 2020's `Production Docker Image <https://youtu.be/wDr3Y7q2XoI>`_ talk provides more
-details about the context, architecture and customization/extension methods for the Production Image.
-
-Extending the image
-...................
-
-Extending the image is easiest if you just need to add some dependencies that do not require
-compiling. The compilation framework of Linux (the so-called ``build-essential``) is pretty big, and
-for the production images, size is a really important factor to optimize for, so our production image
-does not contain ``build-essential``. If you need a compiler like gcc or g++, or tools like make/cmake,
-they are not found in the image, and it is recommended that you follow the "customize" route instead.
-
-Extending the image works the way you are most likely familiar with: simply
-build a new image using Dockerfile's ``FROM`` directive and add whatever you need. You can then add your
-Debian dependencies with ``apt``, PyPI dependencies with ``pip install``, or anything else you need.
-
-You should be aware of a few things:
-
-* The production image of Airflow uses the "airflow" user, so if you want to add some of the tools
-  as the ``root`` user, you need to switch to it with the ``USER`` directive of the Dockerfile. Also you
-  should remember to follow the
-  `best practices of Dockerfiles <https://docs.docker.com/develop/develop-images/dockerfile_best-practices/>`_
-  to make sure your image is lean and small.
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  USER root
-  RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-           my-awesome-apt-dependency-to-add \
-    && apt-get autoremove -yqq --purge \
-    && apt-get clean \
-    && rm -rf /var/lib/apt/lists/*
-  USER airflow
-
-
-* PyPI dependencies in Apache Airflow are installed in the user library of the "airflow" user, so
-  you need to install them with the ``--user`` flag and WITHOUT switching to the ``root`` user. Note also
-  that using ``--no-cache-dir`` is a good idea that can help to make your image smaller.
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
-
-* As of the 2.0.1 image, the ``--user`` flag is turned on by default by setting the ``PIP_USER``
-  environment variable to ``true``. This can be disabled by unsetting the variable or by setting
-  it to ``false``, as in the sketch below.
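-
-A minimal sketch of disabling this default (assuming the ``PIP_USER`` behaviour described above):
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  # Disable the default --user installation behaviour of the 2.0.1 image
-  ENV PIP_USER=false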
-
-
-* If your apt or PyPI dependencies require some of the build essentials, then your best choice is
-  to follow the "Customizing the image" route. However, it requires checking out the sources of
-  Apache Airflow, so you might still choose to add the build essentials to your image, even though
-  your image will be significantly bigger.
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  USER root
-  RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-           build-essential my-awesome-apt-dependency-to-add \
-    && apt-get autoremove -yqq --purge \
-    && apt-get clean \
-    && rm -rf /var/lib/apt/lists/*
-  USER airflow
-  RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
-
-
-* You can also embed your DAGs in the image by simply adding them with the ``COPY`` directive of the
-  Dockerfile, as in the sketch below. The DAGs in the production image are in the ``/opt/airflow/dags`` folder.
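-
-A minimal sketch of embedding DAGs (the ``my-dags/`` directory name is just an example):
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  # Copy local DAG files into the default DAGs folder of the image
-  COPY my-dags/ /opt/airflow/dags/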
-
-Customizing the image
-.....................
-
-Customizing the image is an alternative way of adding your own dependencies to the image - better
-suited to preparing optimized production images.
-
-The advantage of this method is that it produces an optimized image even if you need some compile-time
-dependencies that are not needed in the final image. You need to use the Airflow sources to build such images -
-either from the `official distribution folder of Apache Airflow <https://downloads.apache.org/airflow/>`_ for the
-released versions, or checked out from the GitHub project if you happen to build from git sources.
-
-The easiest way to build the image is to use the ``breeze`` script, but you can also build such a customized
-image by running an appropriately crafted ``docker build`` command in which you specify all the ``build-args``
-that you need to customize it. You can read about all the args and ways you can build the image
-in the `<#production-image-build-arguments>`_ chapter below.
-
-Just a few examples are presented here, which should give you a general understanding of what you can customize.
-
-This builds the production image with Python 3.7, with additional Airflow extras from the 2.0.1 PyPI package and
-additional apt dev and runtime dependencies.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="jdbc" \
-    --build-arg ADDITIONAL_PYTHON_DEPS="pandas" \
-    --build-arg ADDITIONAL_DEV_APT_DEPS="gcc g++" \
-    --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless" \
-    --tag my-image
-
-
-The same image can be built using ``breeze`` (which supports auto-completion of the options):
-
-.. code-block:: bash
-
-  ./breeze build-image \
-      --production-image  --python 3.7 --install-airflow-version=2.0.1 \
-      --additional-extras=jdbc --additional-python-deps="pandas" \
-      --additional-dev-apt-deps="gcc g++" --additional-runtime-apt-deps="default-jre-headless"
-
-
-You can customize more aspects of the image - such as additional commands executed before apt dependencies
-are installed, or adding extra sources to install your dependencies from. You can see all the arguments
-described below, but here is an example of a rather complex command to customize the image,
-based on the example in `this comment <https://github.com/apache/airflow/issues/8605#issuecomment-690065621>`_:
-
-.. code-block:: bash
-
-  docker build . -f Dockerfile \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="slack" \
-    --build-arg ADDITIONAL_PYTHON_DEPS="apache-airflow-backport-providers-odbc \
-        apache-airflow-backport-providers-odbc \
-        azure-storage-blob \
-        sshtunnel \
-        google-api-python-client \
-        oauth2client \
-        beautifulsoup4 \
-        dateparser \
-        rocketchat_API \
-        typeform" \
-    --build-arg ADDITIONAL_DEV_APT_DEPS="msodbcsql17 unixodbc-dev g++" \
-    --build-arg ADDITIONAL_DEV_APT_COMMAND="curl https://packages.microsoft.com/keys/microsoft.asc | \
-    apt-key add --no-tty - && \
-    curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list" \
-    --build-arg ADDITIONAL_DEV_ENV_VARS="ACCEPT_EULA=Y" \
-    --build-arg ADDITIONAL_RUNTIME_APT_COMMAND="curl https://packages.microsoft.com/keys/microsoft.asc | \
-    apt-key add --no-tty - && \
-    curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list" \
-    --build-arg ADDITIONAL_RUNTIME_APT_DEPS="msodbcsql17 unixodbc git procps vim" \
-    --build-arg ADDITIONAL_RUNTIME_ENV_VARS="ACCEPT_EULA=Y" \
-    --tag my-image
-
-Customizing images in high security restricted environments
-...........................................................
-
-You can also make sure your image is built using only a local constraint file and locally downloaded
-wheel files. This is often useful in enterprise environments where the binary files are verified and
-vetted by the security teams.
-
-The build below builds the production image with Python 3.7, with packages and constraints taken from the local
-``docker-context-files`` rather than installed from PyPI or GitHub. It also disables MySQL client
-installation, as the client is normally installed using an external installation method.
-
-Note that, as a prerequisite, you need to have downloaded the wheel files. In the example below we
-first download the constraint file locally and then use ``pip download`` to get the needed .whl files,
-but in the most likely scenario those wheel files should be copied from an internal repository of such .whl
-files. Note that ``AIRFLOW_VERSION_SPECIFICATION`` is only there for reference; the apache-airflow .whl file
-in the right version is among the .whl files downloaded.
-
-Note that ``pip download`` will only work on a Linux host, as some of the packages need to be compiled from
-sources and you cannot download them by providing the ``--platform`` switch. They also need to be downloaded
-using the same Python version as the target image.
-
-The ``pip download`` might happen in a separate environment. The files can be committed to a separate
-binary repository and vetted/verified by the security team, and used subsequently to build images
-of Airflow when needed on an air-gapped system.
-
-Preparing the constraint files and wheel files:
-
-.. code-block:: bash
-
-  rm docker-context-files/*.whl docker-context-files/*.txt
-
-  curl -Lo "docker-context-files/constraints-2-0.txt" \
-    https://raw.githubusercontent.com/apache/airflow/constraints-2-0/constraints-3.7.txt
-
-  pip download --dest docker-context-files \
-    --constraint docker-context-files/constraints-2-0.txt  \
-    apache-airflow[async,aws,azure,celery,dask,elasticsearch,gcp,kubernetes,mysql,postgres,redis,slack,ssh,statsd,virtualenv]==2.0.1
-
-Since apache-airflow .whl packages are treated differently by the Docker image build, you need to rename the
-downloaded apache-airflow* files, for example:
-
-.. code-block:: bash
-
-   pushd docker-context-files
-   for file in apache?airflow*
-   do
-     mv ${file} _${file}
-   done
-   popd
-
-Building the image:
-
-.. code-block:: bash
-
-  ./breeze build-image \
-      --production-image --python 3.7 --install-airflow-version=2.0.1 \
-      --disable-mysql-client-installation --disable-pip-cache --install-from-local-files-when-building \
-      --constraints-location="/docker-context-files/constraints-2-0.txt"
-
-or
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg INSTALL_MYSQL_CLIENT="false" \
-    --build-arg AIRFLOW_PRE_CACHED_PIP_PACKAGES="false" \
-    --build-arg INSTALL_FROM_DOCKER_CONTEXT_FILES="true" \
-    --build-arg AIRFLOW_CONSTRAINTS_LOCATION="/docker-context-files/constraints-2-0.txt"
-
-
-Customizing & extending the image together
-..........................................
-
-You can combine both - customizing & extending the image. You can first build the image using the
-"customize" method (either with the docker command or with ``breeze``), and then "extend"
-the resulting image using ``FROM``, adding any dependencies you want, as in the sketch below.
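-
-A minimal sketch (assuming the customized image was built and tagged as ``my-image``, as in the examples above):
-
-.. code-block:: dockerfile
-
-  FROM my-image
-  # Extend the customized image with an extra PyPI dependency
-  RUN pip install --no-cache-dir --user my-extra-pip-dependency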
-
-Customizing PyPI installation
-.............................
-
-You can customize the PyPI sources used during the image build by adding a ``docker-context-files/.pypirc`` file.
-This ``.pypirc`` will never be committed to the repository and will not be present in the final production image;
-it is added and used only in the build segment of the image, so it is never copied to the final image.
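-
-A minimal sketch of such a ``.pypirc`` file (the index name, URL, and credentials below are placeholders):
-
-.. code-block:: ini
-
-  [distutils]
-  index-servers = company-internal
-
-  [company-internal]
-  repository = https://pypi.example.com
-  username = <your-username>
-  password = <your-password>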
-
-External sources for dependencies
----------------------------------
-
-In corporate environments, there is often the need to build your container images using
-sources of dependencies other than the defaults. The Dockerfile uses standard sources (such as
-the Debian apt repositories or the PyPI repository). However, in corporate environments, the dependencies
-can often only be installed from internal, vetted repositories that are reviewed and
-approved by the internal security teams. In those cases, you might need to use those different
-sources.
-
-This is rather easy if you extend the image - you simply write your extension commands
-using the right sources - either by adding/replacing the sources in the apt configuration or by
-specifying the source repository in the ``pip install`` command, as in the sketch below.
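-
-A minimal sketch of the "extend" route, using a hypothetical internal PyPI mirror:
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  # Install a dependency from an internal, vetted index instead of the default PyPI
-  RUN pip install --no-cache-dir --user \
-      --index-url https://pypi.example.com/simple/ \
-      my-internal-pip-dependency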
-
-It's a bit more involved in the case of customizing the image. We do not yet have the capability
-of changing the sources via build args (but we are working on it). However, since the builds use the
-Dockerfile, which is a source file, you can quite easily modify the file manually and
-specify different sources to be used by either of the commands.
-
-
-Comparing extending and customizing the image
----------------------------------------------
-
-Here is the comparison of the two types of building images.
-
-+----------------------------------------------------+---------------------+-----------------------+
-|                                                    | Extending the image | Customizing the image |
-+====================================================+=====================+=======================+
-| Produces optimized image                           | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Use Airflow Dockerfile sources to build the image  | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Requires Airflow sources                           | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| You can build it with Breeze                       | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Allows using non-default sources for dependencies  | Yes                 | No [1]                |
-+----------------------------------------------------+---------------------+-----------------------+
-
-[1] When you combine customizing and extending the image, you can use external sources
-in the "extend" part. There are plans to add an external sources option to image
-customization. You can also modify the Dockerfile manually if you want to
-use non-default sources for dependencies.
-
-Using the production image
---------------------------
-
-The PROD image entrypoint works as follows:
-
-* In case the user is not "airflow" (i.e. with an arbitrary user id) and the group id of the user is set to 0 (root),
-  then the user is dynamically added to /etc/passwd at entry, using the USER_NAME variable to define the user name.
-  This is done in order to accommodate the
-  `OpenShift Guidelines <https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html>`_
-
-* ``AIRFLOW_HOME`` is set by default to ``/opt/airflow/`` - this means that DAGs
-  are by default in the ``/opt/airflow/dags`` folder and logs are in the ``/opt/airflow/logs`` folder.
-
-* The working directory is ``/opt/airflow`` by default.
-
-* If the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is passed to the container and it is either a mysql or
-  postgres SQLAlchemy connection, then the connection is checked and the script waits until the database is
-  reachable. If the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD`` variable is passed to the container, it is evaluated
-  as a command to execute, and the result of this evaluation is used as ``AIRFLOW__CORE__SQL_ALCHEMY_CONN``. The
-  ``_CMD`` variable takes precedence over the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable.
-
-* If no ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is set, then a SQLite database is created in
-  ${AIRFLOW_HOME}/airflow.db and a db reset is executed.
-
-* If first argument equals to "bash" - you are dropped to a bash shell or you can executes bash command
-  if you specify extra arguments. For example:
-
-.. code-block:: bash
-
-  docker run -it apache/airflow:master-python3.6 bash -c "ls -la"
-  total 16
-  drwxr-xr-x 4 airflow root 4096 Jun  5 18:12 .
-  drwxr-xr-x 1 root    root 4096 Jun  5 18:12 ..
-  drwxr-xr-x 2 airflow root 4096 Jun  5 18:12 dags
-  drwxr-xr-x 2 airflow root 4096 Jun  5 18:12 logs
-
-* If the first argument equals "python" - you are dropped into a Python shell, or Python commands are executed
-  if you pass extra parameters. For example:
-
-.. code-block:: bash
-
-  > docker run -it apache/airflow:master-python3.6 python -c "print('test')"
-  test
-
-* If first argument equals to "airflow" - the rest of the arguments is treated as an airflow command
-  to execute. Example:
-
-.. code-block:: bash
-
-   docker run -it apache/airflow:master-python3.6 airflow webserver
-
-* If there are any other arguments - they are simply passed to the "airflow" command.
-
-.. code-block:: bash
-
-  > docker run -it apache/airflow:master-python3.6 version
-  2.1.0.dev0
-
-* If the ``AIRFLOW__CELERY__BROKER_URL`` variable is passed, and the airflow command used is the
-  scheduler, worker or flower command, then the script checks the broker connection
-  and waits until the Celery broker database is reachable (see the sketch after this list).
-  If the ``AIRFLOW__CELERY__BROKER_URL_CMD`` variable is passed to the container, it is evaluated as a
-  command to execute, and the result of this evaluation is used as ``AIRFLOW__CELERY__BROKER_URL``. The
-  ``_CMD`` variable takes precedence over the ``AIRFLOW__CELERY__BROKER_URL`` variable.
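-
-A minimal sketch of the ``_CMD`` variant (the secret path below is just an example):
-
-.. code-block:: bash
-
-  # Read the broker URL from a mounted secret when the container starts
-  docker run -it \
-    --env "AIRFLOW__CELERY__BROKER_URL_CMD=cat /run/secrets/broker_url" \
-    apache/airflow:master-python3.6 worker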
-
-Production image build arguments
---------------------------------
-
-The following build arguments (``--build-arg`` in docker build command) can be used for production images:
-
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| Build argument                           | Default value                            | Description                              |
-+==========================================+==========================================+==========================================+
-| ``PYTHON_BASE_IMAGE``                    | ``python:3.6-slim-buster``               | Base python image.                       |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``PYTHON_MAJOR_MINOR_VERSION``           | ``3.6``                                  | major/minor version of Python (should    |
-|                                          |                                          | match base image).                       |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_VERSION``                      | ``2.0.1.dev0``                           | version of Airflow.                      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_REPO``                         | ``apache/airflow``                       | the repository from which PIP            |
-|                                          |                                          | dependencies are pre-installed.          |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_BRANCH``                       | ``master``                               | the branch from which PIP dependencies   |
-|                                          |                                          | are pre-installed initially.             |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_CONSTRAINTS_LOCATION``         |                                          | If not empty, it will override the       |
-|                                          |                                          | source of the constraints with the       |
-|                                          |                                          | specified URL or file. Note that the     |
-|                                          |                                          | file has to be in docker context so      |
-|                                          |                                          | it's best to place such file in          |
-|                                          |                                          | one of the folders included in           |
-|                                          |                                          | .dockerignore.                           |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_CONSTRAINTS_REFERENCE``        | ``constraints-master``                   | Reference (branch or tag) from GitHub    |
-|                                          |                                          | where the constraints file is taken      |
-|                                          |                                          | from. It can be ``constraints-master``,  |
-|                                          |                                          | but it can also be ``constraints-1-10``  |
-|                                          |                                          | for 1.10.* installations. When building  |
-|                                          |                                          | a specific version, you want to point    |
-|                                          |                                          | it to a specific tag, for example        |
-|                                          |                                          | ``constraints-1.10.14``.                 |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_PROVIDERS_FROM_SOURCES``       | ``false``                                | If set to ``true`` and image is built    |
-|                                          |                                          | from sources, all provider packages are  |
-|                                          |                                          | installed from sources rather than from  |
-|                                          |                                          | packages. It has no effect when          |
-|                                          |                                          | installing from PyPI or GitHub repo.     |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_EXTRAS``                       | (see Dockerfile)                         | Default extras with which airflow is     |
-|                                          |                                          | installed.                               |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_FROM_PYPI``                    | ``true``                                 | If set to true, Airflow is installed     |
-|                                          |                                          | from PyPI. If you want to install        |
-|                                          |                                          | Airflow from a self-built package        |
-|                                          |                                          | you can set it to false, put package in  |
-|                                          |                                          | ``docker-context-files`` and set         |
-|                                          |                                          | ``INSTALL_FROM_DOCKER_CONTEXT_FILES`` to |
-|                                          |                                          | ``true``. For this you have to also keep |
-|                                          |                                          | ``AIRFLOW_PRE_CACHED_PIP_PACKAGES`` flag |
-|                                          |                                          | set to ``false``.                        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_PRE_CACHED_PIP_PACKAGES``      | ``false``                                | Allows pre-caching airflow PIP packages  |
-|                                          |                                          | from the Apache Airflow GitHub           |
-|                                          |                                          | repository. This helps to optimize       |
-|                                          |                                          | iterations for image builds and speeds   |
-|                                          |                                          | up CI builds.                            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_FROM_DOCKER_CONTEXT_FILES``    | ``false``                                | If set to true, Airflow, providers and   |
-|                                          |                                          | all dependencies are installed from      |
-|                                          |                                          | locally built/downloaded .whl and        |
-|                                          |                                          | .tar.gz files placed in the              |
-|                                          |                                          | ``docker-context-files``. In certain     |
-|                                          |                                          | corporate environments, this is required |
-|                                          |                                          | to install airflow from such pre-vetted  |
-|                                          |                                          | packages rather than from PyPI. For this |
-|                                          |                                          | to work, also set ``INSTALL_FROM_PYPI``  |
-|                                          |                                          | to ``false``. Note that packages         |
-|                                          |                                          | starting with the ``apache?airflow``     |
-|                                          |                                          | glob are treated differently than other  |
-|                                          |                                          | packages. All                            |
-|                                          |                                          | ``apache?airflow`` packages are          |
-|                                          |                                          | installed with dependencies limited by   |
-|                                          |                                          | airflow constraints. All other packages  |
-|                                          |                                          | are installed without dependencies       |
-|                                          |                                          | 'as-is'. If you wish to install airflow  |
-|                                          |                                          | via 'pip download' with all dependencies |
-|                                          |                                          | downloaded, you have to rename the       |
-|                                          |                                          | apache airflow and provider packages to  |
-|                                          |                                          | not start with ``apache?airflow`` glob.  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``UPGRADE_TO_NEWER_DEPENDENCIES``        | ``false``                                | If set to true, the dependencies are     |
-|                                          |                                          | upgraded to newer versions matching      |
-|                                          |                                          | setup.py before installation.            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``CONTINUE_ON_PIP_CHECK_FAILURE``        | ``false``                                | By default the image build fails if pip  |
-|                                          |                                          | check fails for it. This is good for     |
-|                                          |                                          | interactive building but on CI the       |
-|                                          |                                          | image should be built regardless - we    |
-|                                          |                                          | have a separate step to verify image.    |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_AIRFLOW_EXTRAS``            |                                          | Optional additional extras with which    |
-|                                          |                                          | airflow is installed.                    |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_PYTHON_DEPS``               |                                          | Optional python packages to extend       |
-|                                          |                                          | the image with some extra dependencies.  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``DEV_APT_COMMAND``                      | (see Dockerfile)                         | Dev apt command executed before dev deps |
-|                                          |                                          | are installed in the Build image.        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_DEV_APT_COMMAND``           |                                          | Additional Dev apt command executed      |
-|                                          |                                          | before dev dep are installed             |
-|                                          |                                          | in the Build image. Should start with    |
-|                                          |                                          | ``&&``.                                  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``DEV_APT_DEPS``                         | (see Dockerfile)                         | Dev APT dependencies installed           |
-|                                          |                                          | in the Build image.                      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_DEV_APT_DEPS``              |                                          | Additional apt dev dependencies          |
-|                                          |                                          | installed in the Build image.            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_DEV_APT_ENV``               |                                          | Additional env variables defined         |
-|                                          |                                          | when installing dev deps.                |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``RUNTIME_APT_COMMAND``                  | (see Dockerfile)                         | Runtime apt command executed before deps |
-|                                          |                                          | are installed in the Main image.         |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_RUNTIME_APT_COMMAND``       |                                          | Additional Runtime apt command executed  |
-|                                          |                                          | before runtime dep are installed         |
-|                                          |                                          | in the Main image. Should start with     |
-|                                          |                                          | ``&&``.                                  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``RUNTIME_APT_DEPS``                     | (see Dockerfile)                         | Runtime APT dependencies installed       |
-|                                          |                                          | in the Main image.                       |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_RUNTIME_APT_DEPS``          |                                          | Additional apt runtime dependencies      |
-|                                          |                                          | installed in the Main image.             |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_RUNTIME_APT_ENV``           |                                          | Additional env variables defined         |
-|                                          |                                          | when installing runtime deps.            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_HOME``                         | ``/opt/airflow``                         | Airflow’s HOME (that’s where logs and    |
-|                                          |                                          | SQLite databases are stored).            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_UID``                          | ``50000``                                | Airflow user UID.                        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_GID``                          | ``50000``                                | Airflow group GID. Note that most files  |
-|                                          |                                          | created on behalf of airflow user belong |
-|                                          |                                          | to the ``root`` group (0) to keep        |
-|                                          |                                          | OpenShift Guidelines compatibility.      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_USER_HOME_DIR``                | ``/home/airflow``                        | Home directory of the Airflow user.      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``CASS_DRIVER_BUILD_CONCURRENCY``        | ``8``                                    | Number of processors to use for          |
-|                                          |                                          | cassandra PIP install (speeds up         |
-|                                          |                                          | installing in case cassandra extra is    |
-|                                          |                                          | used).                                   |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_MYSQL_CLIENT``                 | ``true``                                 | Whether MySQL client should be installed |
-|                                          |                                          | The mysql extra is removed from extras   |
-|                                          |                                          | if the client is not installed.          |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_PIP_VERSION``                  | ``20.2.4``                               | PIP version used.                        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``PIP_PROGRESS_BAR``                     | ``on``                                   | Progress bar for PIP installation        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-
-There are build arguments that determine the installation mechanism of Apache Airflow for the
-production image. There are three types of build:
-
-* From local sources (the default, for example when you use ``docker build .``)
-* From a released PyPI airflow package (used to build the official Docker image)
-* From any version in the GitHub repository (this is used mostly for system testing)
-
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| Build argument                    | Default                | What to specify                                                                   |
-+===================================+========================+===================================================================================+
-| ``AIRFLOW_INSTALLATION_METHOD``   | ``apache-airflow``     | Should point to the installation method of Apache Airflow. It can be              |
-|                                   |                        | ``apache-airflow`` for installation from packages and URL to installation from    |
-|                                   |                        | GitHub repository tag or branch or "." to install from sources.                   |
-|                                   |                        | Note that installing from local sources requires appropriate values of the        |
-|                                   |                        | ``AIRFLOW_SOURCES_FROM`` and ``AIRFLOW_SOURCES_TO`` variables as described below. |
-|                                   |                        | Only used when ``INSTALL_FROM_PYPI`` is set to ``true``.                          |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_VERSION_SPECIFICATION`` |                        | Optional - might be used for package installation of a different Airflow         |
-|                                   |                        | version, for example "==2.0.1". For consistency, you should also set             |
-|                                   |                        | ``AIRFLOW_VERSION`` to the same value; AIRFLOW_VERSION is resolved as a label    |
-|                                   |                        | in the image created.                                                             |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_CONSTRAINTS_REFERENCE`` | ``constraints-master`` | Reference (branch or tag) from GitHub where constraints file is taken from.       |
-|                                   |                        | It can be ``constraints-master``, but it can also be ``constraints-1-10`` for     |
-|                                   |                        | 1.10.* installations. When building a specific version,                           |
-|                                   |                        | you want to point it to a specific tag, for example ``constraints-2.0.1``.        |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_WWW``                   | ``www``                | In case of Airflow 2.0 it should be "www", in case of Airflow 1.10                |
-|                                   |                        | series it should be "www_rbac".                                                   |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_SOURCES_FROM``          | ``empty``              | Sources of Airflow. Set it to "." when you install airflow from                   |
-|                                   |                        | local sources.                                                                    |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_SOURCES_TO``            | ``/empty``             | Target for Airflow sources. Set to "/opt/airflow" when                            |
-|                                   |                        | you want to install airflow from local sources.                                   |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-
-This builds the production image with Python 3.6 and default extras from the local sources (currently the
-master version of 2.0):
-
-.. code-block:: bash
-
-  docker build .
-
-This builds the production image with Python 3.7 and default extras, from the 2.0.1 tag, with
-constraints taken from the constraints-2-0 branch in GitHub.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="https://github.com/apache/airflow/archive/2.0.1.tar.gz#egg=apache-airflow" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_BRANCH="v1-10-test" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty"
-
-This builds the production image with Python 3.7 and default extras from the 2.0.1 PyPI package, with
-constraints taken from the 2.0.1 tag in GitHub and pip dependencies pre-installed from the top
-of the v1-10-test branch.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_BRANCH="v1-10-test" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2.0.1" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty"
-
-This builds the production image with Python 3.7, with additional Airflow extras from the 2.0.1 PyPI package,
-additional Python dependencies, and pip dependencies pre-installed using the 2.0.1-tagged constraints.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_BRANCH="v1-10-test" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2.0.1" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="mssql,hdfs" \
-    --build-arg ADDITIONAL_PYTHON_DEPS="sshtunnel oauth2client"
-
-This builds the production image with Python 3.7, with additional Airflow extras from the 2.0.1 PyPI package and
-additional apt dev and runtime dependencies.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="jdbc" \
-    --build-arg ADDITIONAL_DEV_APT_DEPS="gcc g++" \
-    --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
-
-
-Actions executed at image start
--------------------------------
-
-If you are using the default entrypoint of the production image,
-there are a few actions that are automatically performed when the container starts.
-In some cases, you can pass environment variables to the image to trigger some of that behaviour.
-
-The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
-from the variables used to build the image starting with ``AIRFLOW``.
-
-Creating system user
-....................
-
-The Airflow image is OpenShift compatible, which means that you can start it with a random user ID and group id 0.
-Airflow will automatically create such a user and make its home directory point to ``/home/airflow``,
-as in the sketch below. You can read more about it in the "Support arbitrary user ids" chapter in the
-`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
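-
-A minimal sketch of running with an arbitrary user ID (``5000`` is an arbitrary example):
-
-.. code-block:: bash
-
-  # Start the image with a random UID and group 0 (root), as OpenShift would
-  docker run -it --user "5000:0" apache/airflow:master-python3.6 bash -c "whoami"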
-
-Waits for Airflow DB connection
-...............................
-
-In case a Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
-available. This always happens when you use the default entrypoint.
-
-The script detects the backend type based on the URL scheme and assigns default port numbers if they are not
-specified in the URL. Then it loops until a connection to the specified host/port can be established.
-It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks.
-To disable the check, set ``CONNECTION_CHECK_MAX_COUNT=0`` (see the sketch after the list of supported schemes).
-
-Supported schemes:
-
-* ``postgres://`` - default port 5432
-* ``mysql://``    - default port 3306
-* ``sqlite://``
-
-In case of SQLite backend, there is no connection to establish and waiting is skipped.
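-
-A minimal sketch of disabling the wait (assuming the ``CONNECTION_CHECK_MAX_COUNT`` variable described above):
-
-.. code-block:: bash
-
-  # Skip waiting for the database, e.g. when an init container has already verified it
-  docker run -it \
-    --env "CONNECTION_CHECK_MAX_COUNT=0" \
-    apache/airflow:master-python3.6 webserver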
-
-Upgrading Airflow DB
-....................
-
-If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will run
-the ``airflow db upgrade`` command right after verifying the connection. You can also use this
-when you are running airflow with the internal SQLite database (the default) to upgrade the db and create
-an admin user at the entrypoint, so that you can start the webserver immediately. Note - using SQLite is
-intended only for testing purposes; never use SQLite in production, as it has severe limitations when it
-comes to concurrency.
-
-
-Creating admin user
-...................
-
-The entrypoint can also create a webserver user automatically when the container starts. You need to set
-``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
-production; it is only useful if you would like to run a quick test with the production image.
-You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
-``_AIRFLOW_WWW_USER_PASSWORD_CMD``. Similarly to other ``*_CMD`` variables, the content of
-the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
-
-User creation will fail if none of the ``PASSWORD`` variables is set - there is no default
-password, for security reasons.
-
-+-----------+--------------------------+----------------------------------------------------------------------+
-| Parameter | Default                  | Environment variable                                                 |
-+===========+==========================+======================================================================+
-| username  | admin                    | ``_AIRFLOW_WWW_USER_USERNAME``                                       |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| password  |                          | ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or ``_AIRFLOW_WWW_USER_PASSWORD`` |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| firstname | Airflow                  | ``_AIRFLOW_WWW_USER_FIRSTNAME``                                      |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| lastname  | Admin                    | ``_AIRFLOW_WWW_USER_LASTNAME``                                       |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| email     | airflowadmin@example.com | ``_AIRFLOW_WWW_USER_EMAIL``                                          |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| role      | Admin                    | ``_AIRFLOW_WWW_USER_ROLE``                                           |
-+-----------+--------------------------+----------------------------------------------------------------------+
-
-If the password is specified, the entrypoint will attempt to create the user, but it will not fail
-if the attempt does not succeed (this accounts for the case where the user has already been created).
-
-You can, for example, start the webserver in the production image, initializing the internal SQLite
-database and creating an ``admin/admin`` Admin user, with the following command:
-
-.. code-block:: bash
-
-  docker run -it -p 8080:8080 \
-    --env "_AIRFLOW_DB_UPGRADE=true" \
-    --env "_AIRFLOW_WWW_USER_CREATE=true" \
-    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
-      apache/airflow:master-python3.8 webserver
-
-
-.. code-block:: bash
-
-  docker run -it -p 8080:8080 \
-    --env "_AIRFLOW_DB_UPGRADE=true" \
-    --env "_AIRFLOW_WWW_USER_CREATE=true" \
-    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
-      apache/airflow:master-python3.8 webserver
-
-The commands above initialize the SQLite database and create an ``admin`` user with the ``admin`` password
-and the ``Admin`` role. They also forward local port ``8080`` to the webserver port and finally start the webserver.
-
-
-Waits for celery broker connection
-..................................
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, ``celery``, ``worker``, or ``flower``
-commands is used, the entrypoint will wait until the Celery broker connection is available.
-
-The script detects the backend type based on the URL scheme and assigns default port numbers if not specified
-in the URL. Then it loops until a connection to the specified host/port can be established.
-It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` seconds between checks.
-To disable the check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``               - default port 6379
-* ``postgres://``            - default port 5432
-* ``mysql://``               - default port 3306
-* ``sqlite://``
-
-In case of SQLite backend, there is no connection to establish and waiting is skipped.
-
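-For example, a sketch of a worker that waits for a Redis broker before starting (the ``redis`` host name is an assumption about your network setup):
-
-.. code-block:: bash
-
-  docker run -it \
-    --env "AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0" \
-      apache/airflow:2.0.1 celery worker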
-
-Recipes
--------
-
-Users sometimes share interesting ways of using the Docker images. We encourage users to contribute these
-recipes to the documentation by submitting a pull request, in case they prove useful to other members
-of the community. The sections below capture this knowledge.
-
-Google Cloud SDK installation
-.............................
-
-Some operators, such as :class:`airflow.providers.google.cloud.operators.kubernetes_engine.GKEStartPodOperator`
-and :class:`airflow.providers.google.cloud.operators.dataflow.DataflowStartSqlJobOperator`, require
-the installation of the `Google Cloud SDK <https://cloud.google.com/sdk>`__ (which includes ``gcloud``).
-You can also run these commands with the BashOperator.
-
-Create a new Dockerfile like the one shown below.
-
-.. exampleinclude:: /docker-images-recipes/gcloud.Dockerfile
-    :language: dockerfile
-
-Then build a new image.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg BASE_AIRFLOW_IMAGE="apache/airflow:2.0.1" \
-    -t my-airflow-image
-
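-You can then verify that ``gcloud`` is available in the resulting image, for example:
-
-.. code-block:: bash
-
-  docker run --rm my-airflow-image bash -c "gcloud version"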
-
-Apache Hadoop Stack installation
-................................
-
-Airflow is often used to run tasks on a Hadoop cluster. This requires the Java Runtime Environment (JRE).
-The recipe below installs tools that are frequently used in the Hadoop world:
-
-- Java Runtime Environment (JRE)
-- Apache Hadoop
-- Apache Hive
-- `Cloud Storage connector for Apache Hadoop <https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage>`__
-
-
-Create a new Dockerfile like the one shown below.
-
-.. exampleinclude:: /docker-images-recipes/hadoop.Dockerfile
-    :language: dockerfile
-
-Then build a new image.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg BASE_AIRFLOW_IMAGE="apache/airflow:2.0.1" \
-    -t my-airflow-image
-
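-You can verify the Java runtime inside the new image, for example:
-
-.. code-block:: bash
-
-  docker run --rm my-airflow-image bash -c "java -version"
-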
-More details about the images
------------------------------
-
-You can read more details about the images - the context, their parameters and internal structure in the
-`IMAGES.rst <https://github.com/apache/airflow/blob/master/IMAGES.rst>`_ document.
+We provides :doc:`a Docker Image (OCI) for Apache Airflow <docker-stack:index>` for use in a containerized environment. Consider using it to guarantees that software will always run the same no matter where it’s deployed.

Review comment:
       ```suggestion
   We provide :doc:`a Docker Image (OCI) for Apache Airflow <docker-stack:index>` for use in a containerized environment. Consider using it to guarantee that software will always run the same no matter where it’s deployed.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #14765: Docker image docs

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r593816132



##########
File path: docs/docker-stack/entrypoint.rst
##########
@@ -0,0 +1,201 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Entrypoint
+==========
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image, which start with ``AIRFLOW``.
+
+The image entrypoint works as follows:
+
+* In case the user is not "airflow" (with undefined user id) and the group id of the user is set to ``0`` (root),
+  then the user is dynamically added to ``/etc/passwd`` at entry, using the ``USER_NAME`` variable to define the user name.
+  This is in order to accommodate the
+  `OpenShift Guidelines <https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html>`_.
+
+* The ``AIRFLOW_HOME`` is set by default to ``/opt/airflow/`` - this means that DAGs
+  are by default in the ``/opt/airflow/dags`` folder and logs are in the ``/opt/airflow/logs`` folder.
+
+* The working directory is ``/opt/airflow`` by default.
+
+* If the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is passed to the container and it is either a MySQL or Postgres
+  SQLAlchemy connection, then the connection is checked and the script waits until the database is reachable.
+  If the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD`` variable is passed to the container, it is evaluated as a
+  command to execute and the result of this evaluation is used as ``AIRFLOW__CORE__SQL_ALCHEMY_CONN``. The
+  ``_CMD`` variable takes precedence over the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable.
+
+* If no ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is set, then a SQLite database is created in
+  ``${AIRFLOW_HOME}/airflow.db`` and a database reset is executed.
+
+* If the first argument equals ``bash``, you are dropped into a bash shell, or a bash command is executed
+  if you specify extra arguments. For example:
+
+  .. code-block:: bash
+
+    docker run -it apache/airflow:master-python3.6 bash -c "ls -la"
+    total 16
+    drwxr-xr-x 4 airflow root 4096 Jun  5 18:12 .
+    drwxr-xr-x 1 root    root 4096 Jun  5 18:12 ..
+    drwxr-xr-x 2 airflow root 4096 Jun  5 18:12 dags
+    drwxr-xr-x 2 airflow root 4096 Jun  5 18:12 logs
+
+* If the first argument equals ``python``, you are dropped into a Python shell, or Python commands are executed if
+  you pass extra parameters. For example:
+
+  .. code-block:: bash
+
+    > docker run -it apache/airflow:master-python3.6 python -c "print('test')"
+    test
+
+* If the first argument equals ``airflow``, the rest of the arguments are treated as an airflow command
+  to execute. Example:
+
+  .. code-block:: bash
+
+     docker run -it apache/airflow:master-python3.6 airflow webserver
+
+* If there are any other arguments, they are simply passed to the ``airflow`` command. For example:
+
+  .. code-block:: bash
+
+    > docker run -it apache/airflow:master-python3.6 version
+    2.1.0.dev0
+
+* If the ``AIRFLOW__CELERY__BROKER_URL`` variable is passed and an airflow command with the
+  ``scheduler``, ``worker`` or ``flower`` command is used, then the script checks the broker connection
+  and waits until the Celery broker database is reachable.
+  If the ``AIRFLOW__CELERY__BROKER_URL_CMD`` variable is passed to the container, it is evaluated as a
+  command to execute and the result of this evaluation is used as ``AIRFLOW__CELERY__BROKER_URL``. The
+  ``_CMD`` variable takes precedence over the ``AIRFLOW__CELERY__BROKER_URL`` variable.
+
+Creating system user
+--------------------
+
+Airflow image is OpenShift compatible, which means that you can start it with a random user ID and the group id ``0``.
+Airflow will automatically create such a user and make its home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+-------------------------------
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the Airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+The script detects the backend type based on the URL scheme and assigns default port numbers if not specified
+in the URL. Then it loops until a connection to the specified host/port can be established.
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` seconds between checks.
+To disable the check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
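+For example, a sketch of tuning the check to 20 attempts with 5 seconds between them:
+
+.. code-block:: bash
+
+  docker run -it \
+    --env "CONNECTION_CHECK_MAX_COUNT=20" \
+    --env "CONNECTION_CHECK_SLEEP_TIME=5" \
+      apache/airflow:master-python3.8 webserver
+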
+Upgrading Airflow DB
+--------------------
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will run
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running Airflow with the internal SQLite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purposes; never use SQLite in production, as it has severe limitations when it
+comes to concurrency.
+
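+For example, a minimal sketch that upgrades the database and then starts the scheduler:
+
+.. code-block:: bash
+
+  docker run -it \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+      apache/airflow:master-python3.8 scheduler
+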
+Creating admin user
+-------------------
+
+The entrypoint can also create a webserver user automatically when the container starts. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you would like to run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. Similarly to the other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable is evaluated as a shell command and its output is set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
+password, for security reasons.
+
++-----------+--------------------------+----------------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                                 |
++===========+==========================+======================================================================+
+| username  | admin                    | ``_AIRFLOW_WWW_USER_USERNAME``                                       |
++-----------+--------------------------+----------------------------------------------------------------------+
+| password  |                          | ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or ``_AIRFLOW_WWW_USER_PASSWORD`` |
++-----------+--------------------------+----------------------------------------------------------------------+
+| firstname | Airflow                  | ``_AIRFLOW_WWW_USER_FIRSTNAME``                                      |
++-----------+--------------------------+----------------------------------------------------------------------+
+| lastname  | Admin                    | ``_AIRFLOW_WWW_USER_LASTNAME``                                       |
++-----------+--------------------------+----------------------------------------------------------------------+
+| email     | airflowadmin@example.com | ``_AIRFLOW_WWW_USER_EMAIL``                                          |
++-----------+--------------------------+----------------------------------------------------------------------+
+| role      | Admin                    | ``_AIRFLOW_WWW_USER_ROLE``                                           |
++-----------+--------------------------+----------------------------------------------------------------------+
+
+If the password is specified, the entrypoint will attempt to create the user, but it will not fail
+if the attempt does not succeed (this accounts for the case where the user has already been created).
+
+You can, for example, start the webserver in the production image, initializing the internal SQLite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
+      apache/airflow:master-python3.8 webserver
+
+The commands above initialize the SQLite database and create an ``admin`` user with the ``admin`` password
+and the ``Admin`` role. They also forward local port ``8080`` to the webserver port and finally start the webserver.
+
+Waits for celery broker connection
+----------------------------------
+
+In case Postgres or MySQL DB is used, and one of the ``scheduler``, ``celery``, ``worker``, or ``flower``
+commands is used, the entrypoint will wait until the Celery broker connection is available.
+
+The script detects the backend type based on the URL scheme and assigns default port numbers if not specified
+in the URL. Then it loops until a connection to the specified host/port can be established.
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` seconds between checks.
+To disable the check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
+
+Supported schemes:
+
+* ``amqp(s)://``  (rabbitmq) - default port 5672
+* ``redis://``               - default port 6379
+* ``postgres://``            - default port 5432
+* ``mysql://``               - default port 3306
+* ``sqlite://``

Review comment:
       It is not supported. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jwitz commented on pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
jwitz commented on pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#issuecomment-814298820


   @mik-laj Is this published on the web yet? I can't seem to find it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r595474479



##########
File path: docs/apache-airflow/installation.rst
##########
@@ -27,7 +27,7 @@ installation with other tools as well.
 
 .. note::
 
-    Airflow is also distributed as a Docker image (OCI Image). For more information, see: :ref:`docker_image`
+    Airflow is also distributed as a Docker image (OCI Image). Consider using it to guarantees that software will always run the same no matter where it’s deployed. For more information, see: :doc:`docker-stack:index`.

Review comment:
       ```suggestion
       Airflow is also distributed as a Docker image (OCI Image). Consider using it to guarantee that software will always run the same no matter where it is deployed. For more information, see: :doc:`docker-stack:index`.
   ```

##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -118,852 +118,7 @@ To mitigate these issues, make sure you have a :doc:`health check </logging-moni
 Production Container Images
 ===========================
 
-Production-ready reference Image
---------------------------------
-
-For the ease of deployment in production, the community releases a production-ready reference container
-image.
-
-The docker image provided (as a convenience binary package) in the
-`Apache Airflow DockerHub <https://hub.docker.com/r/apache/airflow>`_ is a bare image
-that has a few external dependencies and extras installed.
-
-The Apache Airflow image provided as a convenience package is optimized for size, so
-it provides just a minimal set of extras and dependencies installed, and in most cases
-you will want to either extend or customize the image. You can see all possible extras in
-:doc:`extra-packages-ref`. The set of extras used in the Airflow Production image is available in the
-`Dockerfile <https://github.com/apache/airflow/blob/2c6c7fdb2308de98e142618836bdf414df9768c8/Dockerfile#L39>`_.
-
-The production images are built in DockerHub from released versions and release candidates. There
-are also images published from branches, but they are used mainly for development and testing purposes.
-See `Airflow Git Branching <https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#airflow-git-branches>`_
-for details.
-
-
-Customizing or extending the Production Image
----------------------------------------------
-
-Before you dive deeply into the way the Airflow image is built and named, and why we are doing it the
-way we do, you might want to know very quickly how you can extend or customize the existing image
-for Apache Airflow. This chapter gives you a short answer to those questions.
-
-Airflow Summit 2020's `Production Docker Image <https://youtu.be/wDr3Y7q2XoI>`_ talk provides more
-details about the context, architecture and customization/extension methods for the Production Image.
-
-Extending the image
-...................
-
-Extending the image is easiest if you just need to add some dependencies that do not require
-compiling. The compilation framework of Linux (the so-called ``build-essential``) is pretty big, and
-for the production images, size is a really important factor to optimize for, so our Production Image
-does not contain ``build-essential``. If you need a compiler like gcc or g++, or make/cmake etc., those
-are not found in the image and it is recommended that you follow the "customize" route instead.
-
-Extending the image is something you are most likely familiar with - simply
-build a new image using the Dockerfile's ``FROM`` directive and add whatever you need. Then you can add your
-Debian dependencies with ``apt``, PyPI dependencies with ``pip install``, or anything else you need.
-
-You should be aware of a few things:
-
-* The production image of Airflow uses the ``airflow`` user, so if you want to add some of the tools
-  as the ``root`` user, you need to switch to it with the ``USER`` directive of the Dockerfile. Also, you
-  should remember to follow the
-  `best practices of Dockerfiles <https://docs.docker.com/develop/develop-images/dockerfile_best-practices/>`_
-  to make sure your image is lean and small.
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  USER root
-  RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-           my-awesome-apt-dependency-to-add \
-    && apt-get autoremove -yqq --purge \
-    && apt-get clean \
-    && rm -rf /var/lib/apt/lists/*
-  USER airflow
-
-
-* PyPI dependencies in Apache Airflow are installed in the user library of the ``airflow`` user, so
-  you need to install them with the ``--user`` flag and WITHOUT switching users. Note also
-  that using ``--no-cache-dir`` is a good idea, as it can help to make your image smaller.
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
-
-* As of the 2.0.1 image, the ``--user`` flag is turned on by default by setting the ``PIP_USER`` environment
-  variable to ``true``. This can be disabled by unsetting the variable or by setting it to ``false``.
-
-
-* If your apt or PyPI dependencies require some of the build essentials, then your best choice is
-  to follow the "Customizing the image" route. However, it requires checking out the sources of Apache Airflow,
-  so you might still choose to add build essentials to your image, even though your image will
-  be significantly bigger.
-
-.. code-block:: dockerfile
-
-  FROM apache/airflow:2.0.1
-  USER root
-  RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-           build-essential my-awesome-apt-dependency-to-add \
-    && apt-get autoremove -yqq --purge \
-    && apt-get clean \
-    && rm -rf /var/lib/apt/lists/*
-  USER airflow
-  RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
-
-
-* You can also embed your DAGs in the image by simply adding them with the ``COPY`` directive of the Dockerfile.
-  The DAGs in the production image are in the ``/opt/airflow/dags`` folder - see the sketch below.
-
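-A minimal sketch of embedding local DAGs (assuming they live in a ``dags/`` folder next to the generated Dockerfile):
-
-.. code-block:: bash
-
-  cat > Dockerfile <<'EOF'
-  FROM apache/airflow:2.0.1
-  COPY dags/ /opt/airflow/dags/
-  EOF
-  docker build . -t my-airflow-with-dags
-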
-Customizing the image
-.....................
-
-Customizing the image is an alternative way of adding your own dependencies to the image - better
-suited to preparing optimized production images.
-
-The advantage of this method is that it produces an optimized image even if you need some compile-time
-dependencies that are not needed in the final image. You need to use Airflow sources to build such images,
-from the `official distribution folder of Apache Airflow <https://downloads.apache.org/airflow/>`_ for the
-released versions, or checked out from the GitHub project if you happen to build from git sources.
-
-The easiest way to build the image is to use the ``breeze`` script, but you can also build such a customized
-image by running an appropriately crafted ``docker build`` in which you specify all the ``build-args``
-that you need to customize it. You can read about all the args and ways you can build the image
-in the `<#production-image-build-arguments>`_ chapter below.
-
-Here, just a few examples are presented to give you a general understanding of what you can customize.
-
-This builds the production image with Python 3.7, with additional airflow extras from the 2.0.1 PyPI package, and
-with additional apt dev and runtime dependencies.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="jdbc" \
-    --build-arg ADDITIONAL_PYTHON_DEPS="pandas" \
-    --build-arg ADDITIONAL_DEV_APT_DEPS="gcc g++" \
-    --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless" \
-    --tag my-image
-
-
-The same image can be built using ``breeze`` (it supports auto-completion of the options):
-
-.. code-block:: bash
-
-  ./breeze build-image \
-      --production-image  --python 3.7 --install-airflow-version=2.0.1 \
-      --additional-extras=jdbc --additional-python-deps="pandas" \
-      --additional-dev-apt-deps="gcc g++" --additional-runtime-apt-deps="default-jre-headless"
-
-
-You can customize more aspects of the image - such as additional commands executed before apt dependencies
-are installed, or extra sources to install your dependencies from. You can see all the arguments
-described below, but here is an example of a rather complex command to customize the image,
-based on the example in `this comment <https://github.com/apache/airflow/issues/8605#issuecomment-690065621>`_:
-
-.. code-block:: bash
-
-  docker build . -f Dockerfile \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="slack" \
-    --build-arg ADDITIONAL_PYTHON_DEPS="apache-airflow-backport-providers-odbc \
-        apache-airflow-backport-providers-odbc \
-        azure-storage-blob \
-        sshtunnel \
-        google-api-python-client \
-        oauth2client \
-        beautifulsoup4 \
-        dateparser \
-        rocketchat_API \
-        typeform" \
-    --build-arg ADDITIONAL_DEV_APT_DEPS="msodbcsql17 unixodbc-dev g++" \
-    --build-arg ADDITIONAL_DEV_APT_COMMAND="curl https://packages.microsoft.com/keys/microsoft.asc | \
-    apt-key add --no-tty - && \
-    curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list" \
-    --build-arg ADDITIONAL_DEV_ENV_VARS="ACCEPT_EULA=Y" \
-    --build-arg ADDITIONAL_RUNTIME_APT_COMMAND="curl https://packages.microsoft.com/keys/microsoft.asc | \
-    apt-key add --no-tty - && \
-    curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list" \
-    --build-arg ADDITIONAL_RUNTIME_APT_DEPS="msodbcsql17 unixodbc git procps vim" \
-    --build-arg ADDITIONAL_RUNTIME_ENV_VARS="ACCEPT_EULA=Y" \
-    --tag my-image
-
-Customizing images in high security restricted environments
-...........................................................
-
-You can also make sure your image is only built using a local constraint file and locally downloaded
-wheel files. This is often useful in enterprise environments where the binary files are verified and
-vetted by the security teams.
-
-The build below produces the production image with Python 3.7, with packages and constraints taken from the local
-``docker-context-files`` rather than installed from PyPI or GitHub. It also disables MySQL client
-installation, as it uses an external installation method.
-
-Note that, as a prerequisite, you need to have downloaded the wheel files. In the example below we
-first download such a constraint file locally and then use ``pip download`` to get the needed .whl files,
-but in the most likely scenario, those wheel files should be copied from an internal repository of such .whl
-files. Note that ``AIRFLOW_VERSION_SPECIFICATION`` is only there for reference; the apache airflow .whl file
-in the right version is part of the .whl files downloaded.
-
-Note that ``pip download`` will only work on a Linux host, as some of the packages need to be compiled from
-sources and you cannot install them by providing the ``--platform`` switch. They also need to be downloaded using
-the same Python version as the target image.
-
-The ``pip download`` might happen in a separate environment. The files can be committed to a separate
-binary repository and vetted/verified by the security team and used subsequently to build images
-of Airflow when needed on an air-gapped system.
-
-Preparing the constraint files and wheel files:
-
-.. code-block:: bash
-
-  rm docker-context-files/*.whl docker-context-files/*.txt
-
-  curl -Lo "docker-context-files/constraints-2-0.txt" \
-    https://raw.githubusercontent.com/apache/airflow/constraints-2-0/constraints-3.7.txt
-
-  pip download --dest docker-context-files \
-    --constraint docker-context-files/constraints-2-0.txt  \
-    apache-airflow[async,aws,azure,celery,dask,elasticsearch,gcp,kubernetes,mysql,postgres,redis,slack,ssh,statsd,virtualenv]==2.0.1
-
-Since apache-airflow .whl packages are treated differently by the docker image, you need to rename the
-downloaded apache-airflow* files, for example:
-
-.. code-block:: bash
-
-   pushd docker-context-files
-   for file in apache?airflow*
-   do
-     mv ${file} _${file}
-   done
-   popd
-
-Building the image:
-
-.. code-block:: bash
-
-  ./breeze build-image \
-      --production-image --python 3.7 --install-airflow-version=2.0.1 \
-      --disable-mysql-client-installation --disable-pip-cache --install-from-local-files-when-building \
-      --constraints-location="/docker-context-files/constraints-2-0.txt"
-
-or
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg INSTALL_MYSQL_CLIENT="false" \
-    --build-arg AIRFLOW_PRE_CACHED_PIP_PACKAGES="false" \
-    --build-arg INSTALL_FROM_DOCKER_CONTEXT_FILES="true" \
-    --build-arg AIRFLOW_CONSTRAINTS_LOCATION="/docker-context-files/constraints-2-0.txt"
-
-
-Customizing & extending the image together
-..........................................
-
-You can combine both - customizing & extending the image. You can build the image first using the
-customize method (either with the docker command or with ``breeze``) and then you can extend
-the resulting image using ``FROM`` plus any dependencies you want, as in the sketch below.
-
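-A minimal sketch (assuming the customized image was tagged ``my-image``, as in the examples above):
-
-.. code-block:: bash
-
-  cat > Dockerfile.extend <<'EOF'
-  FROM my-image
-  RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
-  EOF
-  docker build . -f Dockerfile.extend -t my-extended-image
-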
-Customizing PYPI installation
-.............................
-
-You can customize the PyPI sources used during image build by adding a ``docker-context-files/.pypirc`` file.
-This ``.pypirc`` will never be committed to the repository and will not be present in the final production image.
-It is added and used only in the build segment of the image, so it is never copied to the final image.
-
-External sources for dependencies
----------------------------------
-
-In corporate environments, there is often a need to build your container images using
-sources of dependencies other than the defaults. The Dockerfile uses standard sources (such as
-Debian apt repositories or the PyPI repository). However, in corporate environments, the dependencies
-can often only be installed from internal, vetted repositories that are reviewed and
-approved by the internal security teams. In those cases, you might need to use those different
-sources.
-
-This is rather easy if you extend the image - you simply write your extension commands
-using the right sources - either by adding/replacing the sources in the apt configuration or by
-specifying the source repository in the ``pip install`` command.
-
-It's a bit more involved in the case of customizing the image. We do not yet have (but we are working
-on it) the capability of changing the sources via build args. However, since the builds use the
-Dockerfile, which is a source file, you can rather easily modify the file manually and
-specify different sources to be used by either of the commands.
-
-
-Comparing extending and customizing the image
----------------------------------------------
-
-Here is a comparison of the two ways of building images.
-
-+----------------------------------------------------+---------------------+-----------------------+
-|                                                    | Extending the image | Customizing the image |
-+====================================================+=====================+=======================+
-| Produces optimized image                           | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Use Airflow Dockerfile sources to build the image  | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Requires Airflow sources                           | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| You can build it with Breeze                       | No                  | Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Allows to use non-default sources for dependencies | Yes                 | No [1]                |
-+----------------------------------------------------+---------------------+-----------------------+
-
-[1] When you combine customizing and extending the image, you can use external sources
-in the "extend" part. There are plans to add an external sources option to image
-customization. You can also modify the Dockerfile manually if you want to
-use non-default sources for dependencies.
-
-Using the production image
---------------------------
-
-The PROD image entrypoint works as follows:
-
-* In case the user is not "airflow" (with undefined user id) and the group id of the user is set to ``0`` (root),
-  then the user is dynamically added to ``/etc/passwd`` at entry, using the ``USER_NAME`` variable to define the user name.
-  This is in order to accommodate the
-  `OpenShift Guidelines <https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html>`_.
-
-* The ``AIRFLOW_HOME`` is set by default to ``/opt/airflow/`` - this means that DAGs
-  are by default in the ``/opt/airflow/dags`` folder and logs are in the ``/opt/airflow/logs`` folder.
-
-* The working directory is ``/opt/airflow`` by default.
-
-* If the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is passed to the container and it is either a MySQL or Postgres
-  SQLAlchemy connection, then the connection is checked and the script waits until the database is reachable.
-  If the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD`` variable is passed to the container, it is evaluated as a
-  command to execute and the result of this evaluation is used as ``AIRFLOW__CORE__SQL_ALCHEMY_CONN``. The
-  ``_CMD`` variable takes precedence over the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable.
-
-* If no ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is set, then a SQLite database is created in
-  ``${AIRFLOW_HOME}/airflow.db`` and a database reset is executed.
-
-* If the first argument equals ``bash``, you are dropped into a bash shell, or a bash command is executed
-  if you specify extra arguments. For example:
-
-.. code-block:: bash
-
-  docker run -it apache/airflow:master-python3.6 bash -c "ls -la"
-  total 16
-  drwxr-xr-x 4 airflow root 4096 Jun  5 18:12 .
-  drwxr-xr-x 1 root    root 4096 Jun  5 18:12 ..
-  drwxr-xr-x 2 airflow root 4096 Jun  5 18:12 dags
-  drwxr-xr-x 2 airflow root 4096 Jun  5 18:12 logs
-
-* If the first argument equals ``python``, you are dropped into a Python shell, or Python commands are executed if
-  you pass extra parameters. For example:
-
-.. code-block:: bash
-
-  > docker run -it apache/airflow:master-python3.6 python -c "print('test')"
-  test
-
-* If the first argument equals ``airflow``, the rest of the arguments are treated as an airflow command
-  to execute. Example:
-
-.. code-block:: bash
-
-   docker run -it apache/airflow:master-python3.6 airflow webserver
-
-* If there are any other arguments, they are simply passed to the ``airflow`` command. For example:
-
-.. code-block:: bash
-
-  > docker run -it apache/airflow:master-python3.6 version
-  2.1.0.dev0
-
-* If the ``AIRFLOW__CELERY__BROKER_URL`` variable is passed and an airflow command with the
-  ``scheduler``, ``worker`` or ``flower`` command is used, then the script checks the broker connection
-  and waits until the Celery broker database is reachable.
-  If the ``AIRFLOW__CELERY__BROKER_URL_CMD`` variable is passed to the container, it is evaluated as a
-  command to execute and the result of this evaluation is used as ``AIRFLOW__CELERY__BROKER_URL``. The
-  ``_CMD`` variable takes precedence over the ``AIRFLOW__CELERY__BROKER_URL`` variable.
-
-Production image build arguments
---------------------------------
-
-The following build arguments (``--build-arg`` in docker build command) can be used for production images:
-
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| Build argument                           | Default value                            | Description                              |
-+==========================================+==========================================+==========================================+
-| ``PYTHON_BASE_IMAGE``                    | ``python:3.6-slim-buster``               | Base python image.                       |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``PYTHON_MAJOR_MINOR_VERSION``           | ``3.6``                                  | major/minor version of Python (should    |
-|                                          |                                          | match base image).                       |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_VERSION``                      | ``2.0.1.dev0``                           | version of Airflow.                      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_REPO``                         | ``apache/airflow``                       | the repository from which PIP            |
-|                                          |                                          | dependencies are pre-installed.          |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_BRANCH``                       | ``master``                               | the branch from which PIP dependencies   |
-|                                          |                                          | are pre-installed initially.             |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_CONSTRAINTS_LOCATION``         |                                          | If not empty, it will override the       |
-|                                          |                                          | source of the constraints with the       |
-|                                          |                                          | specified URL or file. Note that the     |
-|                                          |                                          | file has to be in docker context so      |
-|                                          |                                          | it's best to place such file in          |
-|                                          |                                          | one of the folders included in           |
-|                                          |                                          | .dockerignore.                           |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_CONSTRAINTS_REFERENCE``        | ``constraints-master``                   | Reference (branch or tag) from GitHub    |
-|                                          |                                          | where constraints file is taken from     |
-|                                          |                                          | It can be ``constraints-master`` but     |
-|                                          |                                          | also can be ``constraints-1-10`` for     |
-|                                          |                                          | 1.10.* installation. In case of building |
-|                                          |                                          | specific version you want to point it    |
-|                                          |                                          | to specific tag, for example             |
-|                                          |                                          | ``constraints-1.10.14``.                 |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_PROVIDERS_FROM_SOURCES``       | ``false``                                | If set to ``true`` and image is built    |
-|                                          |                                          | from sources, all provider packages are  |
-|                                          |                                          | installed from sources rather than from  |
-|                                          |                                          | packages. It has no effect when          |
-|                                          |                                          | installing from PyPI or GitHub repo.     |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_EXTRAS``                       | (see Dockerfile)                         | Default extras with which airflow is     |
-|                                          |                                          | installed.                               |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_FROM_PYPI``                    | ``true``                                 | If set to true, Airflow is installed     |
-|                                          |                                          | from PyPI. if you want to install        |
-|                                          |                                          | Airflow from self-build package          |
-|                                          |                                          | you can set it to false, put package in  |
-|                                          |                                          | ``docker-context-files`` and set         |
-|                                          |                                          | ``INSTALL_FROM_DOCKER_CONTEXT_FILES`` to |
-|                                          |                                          | ``true``. For this you have to also keep |
-|                                          |                                          | ``AIRFLOW_PRE_CACHED_PIP_PACKAGES`` flag |
-|                                          |                                          | set to ``false``.                        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_PRE_CACHED_PIP_PACKAGES``      | ``false``                                | Allows to pre-cache airflow PIP packages |
-|                                          |                                          | from the GitHub of Apache Airflow        |
-|                                          |                                          | This allows to optimize iterations for   |
-|                                          |                                          | Image builds and speeds up CI builds.    |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_FROM_DOCKER_CONTEXT_FILES``    | ``false``                                | If set to true, Airflow, providers and   |
-|                                          |                                          | all dependencies are installed from      |
-|                                          |                                          | locally built/downloaded                 |
-|                                          |                                          | .whl and .tar.gz files placed in the     |
-|                                          |                                          | ``docker-context-files``. In certain     |
-|                                          |                                          | corporate environments, this is required |
-|                                          |                                          | to install airflow from such pre-vetted  |
-|                                          |                                          | packages rather than from PyPI. For this |
-|                                          |                                          | to work, also set ``INSTALL_FROM_PYPI``. |
-|                                          |                                          | Note that packages starting with         |
-|                                          |                                          | ``apache?airflow`` glob are treated      |
-|                                          |                                          | differently than other packages. All     |
-|                                          |                                          | ``apache?airflow`` packages are          |
-|                                          |                                          | installed with dependencies limited by   |
-|                                          |                                          | airflow constraints. All other packages  |
-|                                          |                                          | are installed without dependencies       |
-|                                          |                                          | 'as-is'. If you wish to install airflow  |
-|                                          |                                          | via 'pip download' with all dependencies |
-|                                          |                                          | downloaded, you have to rename the       |
-|                                          |                                          | apache airflow and provider packages to  |
-|                                          |                                          | not start with ``apache?airflow`` glob.  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``UPGRADE_TO_NEWER_DEPENDENCIES``        | ``false``                                | If set to true, the dependencies are     |
-|                                          |                                          | upgraded to newer versions matching      |
-|                                          |                                          | setup.py before installation.            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``CONTINUE_ON_PIP_CHECK_FAILURE``        | ``false``                                | By default the image build fails if pip  |
-|                                          |                                          | check fails for it. This is good for     |
-|                                          |                                          | interactive building but on CI the       |
-|                                          |                                          | image should be built regardless - we    |
-|                                          |                                          | have a separate step to verify image.    |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_AIRFLOW_EXTRAS``            |                                          | Optional additional extras with which    |
-|                                          |                                          | airflow is installed.                    |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_PYTHON_DEPS``               |                                          | Optional python packages to extend       |
-|                                          |                                          | the image with some extra dependencies.  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``DEV_APT_COMMAND``                      | (see Dockerfile)                         | Dev apt command executed before dev deps |
-|                                          |                                          | are installed in the Build image.        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_DEV_APT_COMMAND``           |                                          | Additional Dev apt command executed      |
-|                                          |                                          | before dev dep are installed             |
-|                                          |                                          | in the Build image. Should start with    |
-|                                          |                                          | ``&&``.                                  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``DEV_APT_DEPS``                         | (see Dockerfile)                         | Dev APT dependencies installed           |
-|                                          |                                          | in the Build image.                      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_DEV_APT_DEPS``              |                                          | Additional apt dev dependencies          |
-|                                          |                                          | installed in the Build image.            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_DEV_APT_ENV``               |                                          | Additional env variables defined         |
-|                                          |                                          | when installing dev deps.                |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``RUNTIME_APT_COMMAND``                  | (see Dockerfile)                         | Runtime apt command executed before deps |
-|                                          |                                          | are installed in the Main image.         |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_RUNTIME_APT_COMMAND``       |                                          | Additional Runtime apt command executed  |
-|                                          |                                          | before runtime dep are installed         |
-|                                          |                                          | in the Main image. Should start with     |
-|                                          |                                          | ``&&``.                                  |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``RUNTIME_APT_DEPS``                     | (see Dockerfile)                         | Runtime APT dependencies installed       |
-|                                          |                                          | in the Main image.                       |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_RUNTIME_APT_DEPS``          |                                          | Additional apt runtime dependencies      |
-|                                          |                                          | installed in the Main image.             |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``ADDITIONAL_RUNTIME_APT_ENV``           |                                          | Additional env variables defined         |
-|                                          |                                          | when installing runtime deps.            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_HOME``                         | ``/opt/airflow``                         | Airflow’s HOME (that’s where logs and    |
-|                                          |                                          | SQLite databases are stored).            |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_UID``                          | ``50000``                                | Airflow user UID.                        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_GID``                          | ``50000``                                | Airflow group GID. Note that most files  |
-|                                          |                                          | created on behalf of airflow user belong |
-|                                          |                                          | to the ``root`` group (0) to keep        |
-|                                          |                                          | OpenShift Guidelines compatibility.      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_USER_HOME_DIR``                | ``/home/airflow``                        | Home directory of the Airflow user.      |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``CASS_DRIVER_BUILD_CONCURRENCY``        | ``8``                                    | Number of processors to use for          |
-|                                          |                                          | cassandra PIP install (speeds up         |
-|                                          |                                          | installing in case cassandra extra is    |
-|                                          |                                          | used).                                   |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``INSTALL_MYSQL_CLIENT``                 | ``true``                                 | Whether MySQL client is installed. The   |
-|                                          |                                          | mysql extra is removed from extras       |
-|                                          |                                          | if the client is not installed.          |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``AIRFLOW_PIP_VERSION``                  | ``20.2.4``                               | PIP version used.                        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-| ``PIP_PROGRESS_BAR``                     | ``on``                                   | Progress bar for PIP installation        |
-+------------------------------------------+------------------------------------------+------------------------------------------+
-
-There are build arguments that determine the installation mechanism of Apache Airflow for the
-production image. There are three types of build:
-
-* From local sources (the default, for example when you run ``docker build .``)
-* From a released PyPI airflow package (used to build the official Docker image)
-* From any version in the GitHub repository (this is used mostly for system testing).
-
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| Build argument                    | Default                | What to specify                                                                   |
-+===================================+========================+===================================================================================+
-| ``AIRFLOW_INSTALLATION_METHOD``   | ``apache-airflow``     | Should point to the installation method of Apache Airflow. It can be              |
-|                                   |                        | ``apache-airflow`` for installation from packages, a URL for installation from a  |
-|                                   |                        | GitHub repository tag or branch, or "." to install from sources.                  |
-|                                   |                        | Note that installing from local sources requires appropriate values of the        |
-|                                   |                        | ``AIRFLOW_SOURCES_FROM`` and ``AIRFLOW_SOURCES_TO`` variables as described below. |
-|                                   |                        | Only used when ``INSTALL_FROM_PYPI`` is set to ``true``.                          |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_VERSION_SPECIFICATION`` |                        | Optional - might be used for package installation of a different Airflow version, |
-|                                   |                        | for example "==2.0.1". For consistency, you should also set ``AIRFLOW_VERSION``   |
-|                                   |                        | to the same value. ``AIRFLOW_VERSION`` is resolved as a label in the built image. |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_CONSTRAINTS_REFERENCE`` | ``constraints-master`` | Reference (branch or tag) from GitHub where the constraints file is taken from.   |
-|                                   |                        | It can be ``constraints-master`` but also can be ``constraints-1-10`` for         |
-|                                   |                        | 1.10.* installations. In case of building a specific version,                     |
-|                                   |                        | you want to point it to a specific tag, for example ``constraints-2.0.1``.        |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_WWW``                   | ``www``                | In case of Airflow 2.0 it should be "www", in case of Airflow 1.10                |
-|                                   |                        | series it should be "www_rbac".                                                   |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_SOURCES_FROM``          | ``empty``              | Sources of Airflow. Set it to "." when you install airflow from                   |
-|                                   |                        | local sources.                                                                    |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-| ``AIRFLOW_SOURCES_TO``            | ``/empty``             | Target for Airflow sources. Set to "/opt/airflow" when                            |
-|                                   |                        | you want to install airflow from local sources.                                   |
-+-----------------------------------+------------------------+-----------------------------------------------------------------------------------+
-
-This builds the production image with Python 3.6 and default extras from the local sources
-(currently the master version of 2.0):
-
-.. code-block:: bash
-
-  docker build .
-
-This builds the production image with Python 3.7 and default extras from the 2.0.1 tag, with
-constraints taken from the constraints-2-0 branch in GitHub.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="https://github.com/apache/airflow/archive/2.0.1.tar.gz#egg=apache-airflow" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_BRANCH="v1-10-test" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty"
-
-This builds the production image with Python 3.7 and default extras from the 2.0.1 PyPI package,
-with constraints taken from the 2.0.1 tag in GitHub and pip dependencies pre-installed from the top
-of the v1-10-test branch.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_BRANCH="v1-10-test" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2.0.1" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty"
-
-This builds the production image with Python 3.7, additional airflow extras from the 2.0.1 PyPI
-package, and additional python dependencies, with pip dependencies pre-installed using the 2.0.1 tagged constraints.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_BRANCH="v1-10-test" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2.0.1" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="mssql,hdfs" \
-    --build-arg ADDITIONAL_PYTHON_DEPS="sshtunnel oauth2client"
-
-This builds the production image with Python 3.7, additional airflow extras from the 2.0.1 PyPI
-package, and additional apt dev and runtime dependencies.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg ADDITIONAL_AIRFLOW_EXTRAS="jdbc" \
-    --build-arg ADDITIONAL_DEV_APT_DEPS="gcc g++" \
-    --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
-
-
-Actions executed at image start
--------------------------------
-
-If you are using the default entrypoint of the production image,
-there are a few actions that are automatically performed when the container starts.
-In some cases, you can pass environment variables to the image to trigger some of that behaviour.
-
-The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
-from the variables used to build the image starting with ``AIRFLOW``.
-
-Creating system user
-....................
-
-The Airflow image is OpenShift compatible, which means that you can start it with a random user ID
-and the group ID ``0`` (``root``). Airflow will automatically create such a user and make its home
-directory point to ``/home/airflow``.
-You can read more about it in the "Support arbitrary user ids" chapter in the
-`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
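-
-For example, the following starts a container with an arbitrary UID and the ``root`` group and
-prints the identity of the dynamically created user. This is a minimal sketch - the UID ``5000``
-and the image tag are arbitrary choices for illustration, and it assumes the default entrypoint,
-which passes ``bash`` commands through:
-
-.. code-block:: bash
-
-  docker run -it --rm --user "5000:0" \
-      apache/airflow:master-python3.8 bash -c "whoami && id"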
-
-Waits for Airflow DB connection
-...............................
-
-In case a Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection
-becomes available. This always happens when you use the default entrypoint.
-
-The script detects the backend type depending on the URL scheme and assigns default port numbers if
-they are not specified in the URL. Then it loops until a connection to the specified host/port can be
-established. It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME``
-seconds between checks. To disable the check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``postgres://`` - default port 5432
-* ``mysql://``    - default port 3306
-* ``sqlite://``
-
-In case of SQLite backend, there is no connection to establish and waiting is skipped.
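-
-For example, you can shorten the wait so that a misconfigured deployment fails fast. This is a
-sketch only - the connection string and the check values are illustrative, not recommendations:
-
-.. code-block:: bash
-
-  docker run -it --rm \
-    --env "AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgres://airflow:airflow@postgres:5432/airflow" \
-    --env "CONNECTION_CHECK_MAX_COUNT=20" \
-    --env "CONNECTION_CHECK_SLEEP_TIME=3" \
-      apache/airflow:master-python3.8 webserver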
-
-Upgrading Airflow DB
-....................
-
-If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will run
-the ``airflow db upgrade`` command right after verifying the connection. You can also use this
-when you are running airflow with the internal SQLite database (the default) to upgrade the db and
-create admin users at entrypoint, so that you can start the webserver immediately. Note - using
-SQLite is intended only for testing purposes; never use SQLite in production as it has severe
-limitations when it comes to concurrency.
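-
-A minimal sketch that triggers only the database upgrade and then exits (the image tag is
-illustrative; ``version`` is used as a harmless command to run after the upgrade):
-
-.. code-block:: bash
-
-  docker run -it --rm \
-    --env "_AIRFLOW_DB_UPGRADE=true" \
-      apache/airflow:master-python3.8 version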
-
-
-Creating admin user
-...................
-
-The entrypoint can also create a webserver user automatically when the container starts. To enable
-it, set ``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value. This is not intended for
-production; it is only useful if you would like to run a quick test with the production image.
-To create such a user, you need to pass at least a password via ``_AIRFLOW_WWW_USER_PASSWORD`` or
-``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with other ``*_CMD`` variables, the content of
-the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
-
-User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
-password, for security reasons.
-
-+-----------+--------------------------+----------------------------------------------------------------------+
-| Parameter | Default                  | Environment variable                                                 |
-+===========+==========================+======================================================================+
-| username  | admin                    | ``_AIRFLOW_WWW_USER_USERNAME``                                       |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| password  |                          | ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or ``_AIRFLOW_WWW_USER_PASSWORD`` |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| firstname | Airflow                  | ``_AIRFLOW_WWW_USER_FIRSTNAME``                                      |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| lastname  | Admin                    | ``_AIRFLOW_WWW_USER_LASTNAME``                                       |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| email     | airflowadmin@example.com | ``_AIRFLOW_WWW_USER_EMAIL``                                          |
-+-----------+--------------------------+----------------------------------------------------------------------+
-| role      | Admin                    | ``_AIRFLOW_WWW_USER_ROLE``                                           |
-+-----------+--------------------------+----------------------------------------------------------------------+
-
-If the password is specified, the entrypoint will attempt to create the user, but it will not
-fail if the attempt fails (this accounts for the case where the user has already been created).
-
-You can, for example, start the webserver in the production image, initializing the internal SQLite
-database and creating an ``admin/admin`` Admin user, with the following command:
-
-.. code-block:: bash
-
-  docker run -it -p 8080:8080 \
-    --env "_AIRFLOW_DB_UPGRADE=true" \
-    --env "_AIRFLOW_WWW_USER_CREATE=true" \
-    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
-      apache/airflow:master-python3.8 webserver
-
-
-.. code-block:: bash
-
-  docker run -it -p 8080:8080 \
-    --env "_AIRFLOW_DB_UPGRADE=true" \
-    --env "_AIRFLOW_WWW_USER_CREATE=true" \
-    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
-      apache/airflow:master-python3.8 webserver
-
-The commands above initialize the SQLite database and create an admin user with the ``admin``
-password and the ``Admin`` role. They also forward local port ``8080`` to the webserver port
-and finally start the webserver.
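-
-If you keep the admin password in a file instead - for example a mounted secret - the same
-``*_CMD`` mechanism can read it at start. The host path and mount point below are purely
-illustrative:
-
-.. code-block:: bash
-
-  docker run -it -p 8080:8080 \
-    --env "_AIRFLOW_DB_UPGRADE=true" \
-    --env "_AIRFLOW_WWW_USER_CREATE=true" \
-    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=cat /run/secrets/admin_password" \
-    -v /path/to/admin_password:/run/secrets/admin_password:ro \
-      apache/airflow:master-python3.8 webserver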
-
-
-Waits for celery broker connection
-..................................
-
-When one of the ``scheduler``, ``celery``, ``worker``, or ``flower`` commands is used, the
-entrypoint will wait until the Celery broker connection becomes available.
-
-The script detects the broker type depending on the URL scheme and assigns default port numbers if
-they are not specified in the URL. Then it loops until a connection to the specified host/port can be
-established. It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME``
-seconds between checks. To disable the check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``               - default port 6379
-* ``postgres://``            - default port 5432
-* ``mysql://``               - default port 3306
-* ``sqlite://``
-
-In case of SQLite backend, there is no connection to establish and waiting is skipped.
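-
-For example, to skip the broker check entirely when starting a worker - a sketch only, assuming the
-broker URL and the rest of the Celery configuration are provided elsewhere:
-
-.. code-block:: bash
-
-  docker run -it --rm \
-    --env "AIRFLOW__CORE__EXECUTOR=CeleryExecutor" \
-    --env "AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0" \
-    --env "CONNECTION_CHECK_MAX_COUNT=0" \
-      apache/airflow:master-python3.8 worker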
-
-
-Recipes
--------
-
-Users sometimes share interesting ways of using the Docker images. We encourage users to contribute
-these recipes to the documentation by submitting a pull request, in case they prove useful to other
-members of the community. The sections below capture this knowledge.
-
-Google Cloud SDK installation
-.............................
-
-Some operators, such as :class:`airflow.providers.google.cloud.operators.kubernetes_engine.GKEStartPodOperator`
-and :class:`airflow.providers.google.cloud.operators.dataflow.DataflowStartSqlJobOperator`, require
-the installation of the `Google Cloud SDK <https://cloud.google.com/sdk>`__ (which includes ``gcloud``).
-You can also run these commands with the BashOperator.
-
-Create a new Dockerfile like the one shown below.
-
-.. exampleinclude:: /docker-images-recipes/gcloud.Dockerfile
-    :language: dockerfile
-
-Then build a new image.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg BASE_AIRFLOW_IMAGE="apache/airflow:2.0.1" \
-    -t my-airflow-image
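-
-You can then verify that ``gcloud`` is available in the new image. This assumes the default
-entrypoint, which passes ``bash`` commands through:
-
-.. code-block:: bash
-
-  docker run -it --rm my-airflow-image bash -c "gcloud version"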
-
-
-Apache Hadoop Stack installation
-................................
-
-Airflow is often used to run tasks on a Hadoop cluster. It requires a Java Runtime Environment (JRE)
-to run. Below are the steps to install tools that are frequently used in the Hadoop world:
-
-- Java Runtime Environment (JRE)
-- Apache Hadoop
-- Apache Hive
-- `Cloud Storage connector for Apache Hadoop <https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage>`__
-
-
-Create a new Dockerfile like the one shown below.
-
-.. exampleinclude:: /docker-images-recipes/hadoop.Dockerfile
-    :language: dockerfile
-
-Then build a new image.
-
-.. code-block:: bash
-
-  docker build . \
-    --build-arg BASE_AIRFLOW_IMAGE="apache/airflow:2.0.1" \
-    -t my-airflow-image
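-
-As with the previous recipe, you can verify that the tools landed in the image - a sketch, assuming
-the default entrypoint and that the installed binaries are on ``PATH``:
-
-.. code-block:: bash
-
-  docker run -it --rm my-airflow-image bash -c "java -version && hadoop version"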
-
-More details about the images
------------------------------
-
-You can read more details about the images - the context, their parameters and internal structure in the
-`IMAGES.rst <https://github.com/apache/airflow/blob/master/IMAGES.rst>`_ document.
+We provides :doc:`a Docker Image (OCI) for Apache Airflow <docker-stack:index>` for use in a containerized environment. Consider using it to guarantees that software will always run the same no matter where it’s deployed.

Review comment:
       ```suggestion
   We provide :doc:`a Docker Image (OCI) for Apache Airflow <docker-stack:index>` for use in a containerized environment. Consider using it to guarantees that software will always run the same no matter where it’s deployed.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #14765: Docker image docs

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r593816087



##########
File path: docs/docker-stack/entrypoint.rst
##########
@@ -0,0 +1,201 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Entrypoint
+==========
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+The image entrypoint works as follows:
+
+* In case the user is not "airflow" (with undefined user id) and the group id of the user is set to ``0`` (root),
+  then the user is dynamically added to /etc/passwd at entry using ``USER_NAME`` variable to define the user name.
+  This is in order to accommodate the
+  `OpenShift Guidelines <https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html>`_
+
+* The ``AIRFLOW_HOME`` is set by default to ``/opt/airflow/`` - this means that DAGs
+  are by default in the ``/opt/airflow/dags`` folder and logs are in the ``/opt/airflow/logs``
+
+* The working directory is ``/opt/airflow`` by default.
+
+* If ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` variable is passed to the container and it is either mysql or postgres

Review comment:
       This is not true for the image for Airflow 2.0. For Airflow 2.0, we use the `airflow db check` command.
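
       For reference, the check can be exercised directly against the image (the tag is illustrative):
       ```bash
       docker run -it --rm apache/airflow:2.0.1 db check
       ```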




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#issuecomment-814354133


   > @mik-laj Is this published on the web yet? I can't seem to find it.
   
   It is at http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/docker-stack/index.html
   
   You can find the latest docs (docs from Master) at s.apache.org/airflow-docs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#issuecomment-800544328


   The PR is likely ready to be merged. No tests are needed as no important environment files, nor python files were modified by it. However, committers might decide that full test matrix is needed and add the 'full tests needed' label. Then you should rebase it to the latest master or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r595483852



##########
File path: docs/docker-stack/entrypoint.rst
##########
@@ -0,0 +1,201 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Entrypoint
+==========
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+The image entrypoint works as follows:
+
+* In case the user is not "airflow" (with undefined user id) and the group id of the user is set to ``0`` (root),
+  then the user is dynamically added to /etc/passwd at entry using ``USER_NAME`` variable to define the user name.

Review comment:
       ```suggestion
     then the user is dynamically added to ``/etc/passwd`` at entry using ``USER_NAME`` variable to define the user name.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj merged pull request #14765: Create a documentation package for Docker image

Posted by GitBox <gi...@apache.org>.
mik-laj merged pull request #14765:
URL: https://github.com/apache/airflow/pull/14765


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #14765: Docker image docs

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r593817226



##########
File path: docs/docker-stack/entrypoint.rst
##########
@@ -0,0 +1,201 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Entrypoint

Review comment:
       This page contains a combination of content from several sections, including "Using the production image" and "Actions executed at image start". It's not the best idea, but it seems sufficient for now. I would like to write the "Using the production image" section a little differently, so that the user can actually run Airflow after reading this page and there would not be so much duplicated information.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #14765: Docker image docs

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #14765:
URL: https://github.com/apache/airflow/pull/14765#discussion_r593816550



##########
File path: docs/docker-stack/build.rst
##########
@@ -0,0 +1,380 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Building the image
+==================
+
+Before you dive deeply into the way the Airflow image is built and named, and why we are doing it the
+way we do, you might want to know very quickly how you can extend or customize the existing image
+for Apache Airflow. This chapter gives you a short answer to those questions.
+
+Airflow Summit 2020's `Production Docker Image <https://youtu.be/wDr3Y7q2XoI>`_ talk provides more
+details about the context, architecture and customization/extension methods for the Production Image.
+
+Extending the image
+-------------------
+
+Extending the image is easiest if you just need to add some dependencies that do not require
+compiling. The compilation framework of Linux (the so-called ``build-essential``) is pretty big, and
+for the production images, size is a really important factor to optimize for, so our Production Image
+does not contain ``build-essential``. If you need a compiler like gcc or g++, or make/cmake etc. - those
+are not found in the image and it is recommended that you follow the "customize" route instead.
+
+Extending the image is something you are most likely familiar with - simply
+build a new image using the Dockerfile's ``FROM`` directive and add whatever you need. Then you can add your
+Debian dependencies with ``apt`` or PyPI dependencies with ``pip install`` or anything else you need.
+
+You should be aware of a few things:
+
+* The production image of airflow uses the "airflow" user, so if you want to add some of the tools
+  as the ``root`` user, you need to switch to it with the ``USER`` directive of the Dockerfile. Also you
+  should remember to follow the
+  `best practices of Dockerfiles <https://docs.docker.com/develop/develop-images/dockerfile_best-practices/>`_
+  to make sure your image is lean and small.
+
+  .. code-block:: dockerfile
+
+    FROM apache/airflow:2.0.1
+    USER root
+    RUN apt-get update \
+      && apt-get install -y --no-install-recommends \
+             my-awesome-apt-dependency-to-add \
+      && apt-get autoremove -yqq --purge \
+      && apt-get clean \
+      && rm -rf /var/lib/apt/lists/*
+    USER airflow
+
+
+* PyPI dependencies in Apache Airflow are installed in the user library of the "airflow" user, so
+  you need to install them with the ``--user`` flag and WITHOUT switching to the airflow user. Note also
+  that using ``--no-cache-dir`` is a good idea that can help to make your image smaller.
+
+  .. code-block:: dockerfile
+
+    FROM apache/airflow:2.0.1
+    RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
+
+* As of 2.0.1 image the ``--user`` flag is turned on by default by setting ``PIP_USER`` environment variable
+  to ``true``. This can be disabled by un-setting the variable or by setting it to ``false``.
+
+
+* If your apt or PyPI dependencies require some of the build-essentials, then your best choice is
+  to follow the "Customize the image" route. However, it requires checking out the sources of Apache
+  Airflow, so you might still want to choose to add build essentials to your image, even if your
+  image will be significantly bigger.
+
+  .. code-block:: dockerfile
+
+    FROM apache/airflow:2.0.1
+    USER root
+    RUN apt-get update \
+      && apt-get install -y --no-install-recommends \
+             build-essential my-awesome-apt-dependency-to-add \
+      && apt-get autoremove -yqq --purge \
+      && apt-get clean \
+      && rm -rf /var/lib/apt/lists/*
+    USER airflow
+    RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
+
+* You can also embed your DAGs in the image by simply adding them with the ``COPY`` directive.
+  The DAGs in the production image are in the ``/opt/airflow/dags`` folder.
+
+Customizing the image
+---------------------
+
+Customizing the image is an alternative way of adding your own dependencies to the image - better
+suited to preparing optimized production images.
+
+The advantage of this method is that it produces an optimized image even if you need some compile-time
+dependencies that are not needed in the final image. You need to use Airflow sources to build such images,
+taken from the `official distribution folder of Apache Airflow <https://downloads.apache.org/airflow/>`_ for
+released versions, or checked out from the GitHub project if you happen to build from git sources.
+
+The easiest way to build the image is to use the ``breeze`` script, but you can also build such a customized
+image by running an appropriately crafted docker build in which you specify all the ``build-args``
+that you need to customize it. You can read about all the args and ways you can build the image
+in :doc:`build-arg-ref`.
+
+Here just a few examples are presented, which should give you a general understanding of what you can customize.
+
+This builds the production image with Python 3.6 and default extras from the local sources (master version

Review comment:
       We had these examples repeated twice in the documentation - once in the "Customizing the image" section and the second time under the tables in the "Production image build arguments" section. Some examples were common and some were different. Now we have one place with examples.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org