Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/17 17:45:02 UTC

[GitHub] [airflow] potiuk opened a new pull request #13728: Adds automated user creation in production image

potiuk opened a new pull request #13728:
URL: https://github.com/apache/airflow/pull/13728


   This PR implements automated user creation for the production image,
   controlled by environment variables.
   
   This is a solution for anyone who would like to run a quick test
   with the production image and would like to:
   
   * init/upgrade the db automatically
   * create a user
   
   This is particularly useful for initializing the internal SQLite db,
   but it can also be used to create the user in docker-compose
   or similar cases where there is no equivalent of the init containers
   that are usually used to perform the initialization.
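   
   For example (a sketch - assuming the `apache/airflow:master-python3.8` tag
   used in the docs changes below), a quick local test could look like:
   
   ```bash
   docker run -it -p 8080:8080 \
     --env "_AIRFLOW_DB_UPGRADE=true" \
     --env "_AIRFLOW_WWW_USER_CREATE=true" \
     --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
       apache/airflow:master-python3.8 webserver
   ```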
   
   Closes #8606
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559757428



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+The script detects backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until connection to the host/port specified can be established
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal SQLite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purpose, never use SQLite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set

Review comment:
       ```suggestion
   The entrypoint can also create an admin user automatically for the Webserver when you enter it. you need to set
   ```

##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+The script detects backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until connection to the host/port specified can be established
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal SQLite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purpose, never use SQLite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set

Review comment:
       ```suggestion
   The entrypoint can also create an admin user automatically for the Webserver when you enter it. You need to set
   ```







[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559455647



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.

Review comment:
       ```suggestion
   Airflow will automatically create such a user and make it's home directory point to ``/home/airflow``.
   ```







[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559486925



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -109,51 +109,116 @@ function verify_db_connection {
     fi
 }
 
-if ! whoami &> /dev/null; then
-  if [[ -w /etc/passwd ]]; then
-    echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
-        >> /etc/passwd
-  fi
-  export HOME="${AIRFLOW_USER_HOME_DIR}"
-fi
-
 # Warning: command environment variables (*_CMD) have priority over usual configuration variables
 # for configuration parameters that require sensitive information. This is the case for the SQL database
 # and the broker backend in this entrypoint script.
 
 
-if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
-    verify_db_connection "$(eval "$AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD")"
-else
-    # if no DB configured - use sqlite db by default
-    AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
-    verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
-fi
+function create_www_user() {
+    local local_password=""
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!
+        echo
+        exit 1
+    fi
 
+    airflow users create \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" ||
+    airflow create_user \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" || true
+}
 
-# The Bash and python commands still should verify the basic connections so they are run after the
-# DB check but before the broker check
-if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
-   shift
-   exec "/bin/bash" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
-   shift
-   exec "python" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
-   AIRFLOW_COMMAND="${2}"
-   shift
-fi
+function create_system_user_if_missing() {
+    # This is needed in case of OpenShift-compatible container execution. In case of OpenShift random
+    # User id is used when starting the image, however group 0 is kept as the user group. Our production
+    # Image is OpenShift compatible, so all permissions on all folders are set so that 0 group can exercise
+    # the same privileges as the default "airflow" user, this code checks if the user is already
+    # present in /etc/passwd and will create the system user dynamically, including setting its
+    # HOME directory to the /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
+    # Installed can be automatically added to PYTHONPATH
+    if ! whoami &> /dev/null; then
+      if [[ -w /etc/passwd ]]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
+            >> /etc/passwd
+      fi
+      export HOME="${AIRFLOW_USER_HOME_DIR}"
+    fi
+}
 
-# Note: the broker backend configuration concerns only a subset of Airflow components
-if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
-    if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
-        verify_db_connection "$(eval "$AIRFLOW__CELERY__BROKER_URL_CMD")"
+function verify_connections() {
+    if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
+        verify_db_connection "$(eval "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD}")"
     else
-        AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
-        if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
-            verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"
+        # if no DB configured - use sqlite db by default
+        AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
+        verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
+    fi
+
+}
+
+function upgrade_db() {
+    airflow db upgrade || airflow upgradedb || true
+}
+
+function verify_celery_connection_if_needed() {
+    # Note: the broker backend configuration concerns only a subset of Airflow components
+    if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
+        if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
+            verify_db_connection "$(eval "${AIRFLOW__CELERY__BROKER_URL_CMD}")"
+        else
+            AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
+            if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
+                verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"
+            fi
         fi
     fi
+}
+
+function execute_bash_or_python_command_if_specified() {
+    # The Bash and python commands still should verify the basic connections so they are run after the

Review comment:
    added `bash` and `python` instead, as those are the commands to use when running the image.
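    
    A minimal sketch of what that enables (assuming the `apache/airflow:master-python3.8` tag used elsewhere in these docs):
    
    ```bash
    # Start an interactive shell inside the production image
    docker run -it apache/airflow:master-python3.8 bash
    
    # Or run an ad-hoc python snippet in the image's environment
    docker run -it apache/airflow:master-python3.8 python -c "import airflow; print(airflow.__version__)"
    ```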








[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559418778



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. Similarly to the other ``*_CMD`` variables, the content of
+the ``*_CMD`` one will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+In case the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user has already been created).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
    I think this issue is specific to .env files. The quotes when you pass environment variables via the -e switch are not necessary when the value has no spaces (they are always removed by the shell). They are absolutely necessary, though, when the value contains spaces (otherwise the shell treats what follows as another parameter). This is standard "shell" expansion, not an .env file standard, and quoting variables there is really good practice.
    
    For example:
    
    `-e _AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin`
    
    will set the `_AIRFLOW_WWW_USER_PASSWORD_CMD` variable to `echo`, and Docker will try to run an image called "admin".
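    
    A minimal illustration of the difference (hypothetical command lines):
    
    ```bash
    # Unquoted: the shell splits on the space - the variable is set to "echo"
    # and docker treats "admin" as the image name
    docker run -e _AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin
    
    # Quoted: the whole value (including the space) reaches the container
    docker run -e "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" apache/airflow:master-python3.8 webserver
    ```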
   
   







[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559457126



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. Similarly to the other ``*_CMD`` variables, the content of
+the ``*_CMD`` one will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+In case the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user has already been created).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
+      apache/airflow:master-python3.8 webserver
+
+The commands above perform initialization of the sqlite database, create admin user with admin password
+and Admin role. They also forward local port 8080 to the webserver port and finally start the webserver.

Review comment:
       ```suggestion
   and Admin role. They also forward local port ``8080`` to the webserver port and finally start the webserver.
   ```







[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559756892



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+The script detects backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until connection to the host/port specified can be established

Review comment:
       ```suggestion
   in the URL. Then it loops until the connection to the host/port specified can be established.
   ```







[GitHub] [airflow] potiuk merged pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #13728:
URL: https://github.com/apache/airflow/pull/13728


   





[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559468238



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -109,51 +109,116 @@ function verify_db_connection {
     fi
 }
 
-if ! whoami &> /dev/null; then
-  if [[ -w /etc/passwd ]]; then
-    echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
-        >> /etc/passwd
-  fi
-  export HOME="${AIRFLOW_USER_HOME_DIR}"
-fi
-
 # Warning: command environment variables (*_CMD) have priority over usual configuration variables
 # for configuration parameters that require sensitive information. This is the case for the SQL database
 # and the broker backend in this entrypoint script.
 
 
-if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
-    verify_db_connection "$(eval "$AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD")"
-else
-    # if no DB configured - use sqlite db by default
-    AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
-    verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
-fi
+function create_www_user() {
+    local local_password=""
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!
+        echo
+        exit 1
+    fi
 
+    airflow users create \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" ||
+    airflow create_user \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" || true
+}
 
-# The Bash and python commands still should verify the basic connections so they are run after the
-# DB check but before the broker check
-if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
-   shift
-   exec "/bin/bash" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
-   shift
-   exec "python" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
-   AIRFLOW_COMMAND="${2}"
-   shift
-fi
+function create_system_user_if_missing() {
+    # This is needed in case of OpenShift-compatible container execution. In case of OpenShift random
+    # User id is used when starting the image, however group 0 is kept as the user group. Our production
+    # Image is OpenShift compatible, so all permissions on all folders are set so that 0 group can exercise
+    # the same privileges as the default "airflow" user, this code checks if the user is already
+    # present in /etc/passwd and will create the system user dynamically, including setting its
+    # HOME directory to the /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
+    # Installed can be automatically added to PYTHONPATH
+    if ! whoami &> /dev/null; then
+      if [[ -w /etc/passwd ]]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
+            >> /etc/passwd
+      fi
+      export HOME="${AIRFLOW_USER_HOME_DIR}"
+    fi
+}
 
-# Note: the broker backend configuration concerns only a subset of Airflow components
-if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
-    if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
-        verify_db_connection "$(eval "$AIRFLOW__CELERY__BROKER_URL_CMD")"
+function verify_connections() {
+    if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
+        verify_db_connection "$(eval "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD}")"
     else
-        AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
-        if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
-            verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"
+        # if no DB configured - use sqlite db by default
+        AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
+        verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
+    fi
+
+}
+
+function upgrade_db() {
+    airflow db upgrade || airflow upgradedb || true
+}
+
+function verify_celery_connection_if_needed() {
+    # Note: the broker backend configuration concerns only a subset of Airflow components
+    if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
+        if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
+            verify_db_connection "$(eval "${AIRFLOW__CELERY__BROKER_URL_CMD}")"
+        else
+            AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
+            if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
+                verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"

Review comment:
       Does the function correctly detect the port number for Redis/SQS/ or other backend if it is not specified in the URL?







[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559756260



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -18,62 +18,70 @@
 Production Deployment
 ^^^^^^^^^^^^^^^^^^^^^
 
-It is time to deploy your DAG in production. To do this, first, you need to make sure that the Airflow is itself production-ready.
-Let's see what precautions you need to take.
+It is time to deploy your DAG in production. To do this, first, you need to make sure that the Airflow
+is itself production-ready. Let's see what precautions you need to take.
 
 Database backend
 ================
 
-Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external database.
-However, such a setup is meant to be used for testing purposes only; running the default setup in production can lead to data loss in multiple scenarios.
-If you want to run production-grade Airflow, make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database such as PostgreSQL or MySQL.
+Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external
+database. However, such a setup is meant to be used for testing purposes only; running the default setup
+in production can lead to data loss in multiple scenarios. If you want to run production-grade Airflow,
+make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database
+such as PostgreSQL or MySQL.
 
 You can change the backend using the following config
 
 .. code-block:: ini
 
- [core]
- sql_alchemy_conn = my_conn_string
+    [core]
+    sql_alchemy_conn = my_conn_string
 
 Once you have changed the backend, airflow needs to create all the tables required for operation.
 Create an empty DB and give airflow's user the permission to ``CREATE/ALTER`` it.
 Once that is done, you can run -
 
 .. code-block:: bash
 
- airflow db upgrade
+    airflow db upgrade
 
 ``upgrade`` keeps track of migrations already applied, so it's safe to run as often as you need.
 
 .. note::
 
- Do not use ``airflow db init`` as it can create a lot of default connections, charts, etc. which are not required in production DB.
+    Do not use ``airflow db init`` as it can create a lot of default connections, charts, etc. which are not
+    required in production DB.
 
 
 Multi-Node Cluster
 ==================
 
-Airflow uses :class:`airflow.executors.sequential_executor.SequentialExecutor` by default. However, by its nature, the user is limited to executing at most
-one task at a time. ``Sequential Executor`` also pauses the scheduler when it runs a task, hence not recommended in a production setup.
-You should use the :class:`Local executor <airflow.executors.local_executor.LocalExecutor>` for a single machine.
-For a multi-node setup, you should use the :doc:`Kubernetes executor <../executor/kubernetes>` or the :doc:`Celery executor <../executor/celery>`.
+Airflow uses :class:`airflow.executors.sequential_executor.SequentialExecutor` by default. However, by its
+nature, the user is limited to executing at most one task at a time. ``Sequential Executor`` also pauses
+the scheduler when it runs a task, hence not recommended in a production setup. You should use the
+:class:`Local executor <airflow.executors.local_executor.LocalExecutor>` for a single machine.

Review comment:
       ```suggestion
   :class:`~airflow.executors.local_executor.LocalExecutor` for a single machine.
   ```







[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559376800



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform

Review comment:
    The main reason is that it is a "behaviour" change, and it is not image "build"-time behaviour but container "execution" behaviour. All the other AIRFLOW_* variables affect how the image is built, while these have no impact on the image itself and only change how the container behaves. So I wanted them clearly separated so that they are not easily confused.
    
    Naming is the most difficult part (as usual) - so I'm happy to change it (better idea?) or to add some explanation of why they are named like that.
    
    WDYT @feluelle ?
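    
    A minimal illustration of the split (a sketch; the build-arg comes from the docs above, the runtime variable from this PR):
    
    ```bash
    # Build time: build args shape the image itself (example taken from the docs above)
    docker build . --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
    
    # Run time: _AIRFLOW_* variables only change how the container's entrypoint behaves
    docker run --env "_AIRFLOW_DB_UPGRADE=true" apache/airflow:master-python3.8 webserver
    ```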







[GitHub] [airflow] github-actions[bot] commented on pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#issuecomment-762821089


   The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559422654



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. Similarly to the other ``*_CMD`` variables, the content of
+the ``*_CMD`` one will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+In case the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user has already been created).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
       But indeed probably better to quote all arguments in this case 







[GitHub] [airflow] feluelle commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559387433



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. Similarly to the other ``*_CMD`` variables, the content of
+the ``*_CMD`` one will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+If the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user already exists).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
       https://dev.to/tvanantwerp/don-t-quote-environment-variables-in-docker-268h --> is pretty much the issue I faced, too. But don't know if this also can happen to env variables directly passed into the container.
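
    To illustrate the difference with a runnable sketch (the image and variable used here are only examples): Docker's `--env-file` parser takes each line verbatim and does no quote stripping, while with `-e` the surrounding shell removes the quotes first.

    ```bash
    # --env-file: docker takes each line verbatim, so the quotes become part of the value
    echo '_AIRFLOW_WWW_USER_PASSWORD="admin"' > env.list
    docker run --rm --env-file env.list bash:5 \
        bash -c 'echo "password=${_AIRFLOW_WWW_USER_PASSWORD}"'   # password="admin" (quotes kept)

    # -e: the host shell strips the quotes before docker ever sees the value
    docker run --rm -e _AIRFLOW_WWW_USER_PASSWORD="admin" bash:5 \
        bash -c 'echo "password=${_AIRFLOW_WWW_USER_PASSWORD}"'   # password=admin
    ```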

##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with the internal sqlite database (default) to upgrade the db and create
+the admin user at the entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purposes; never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create a www admin user automatically when you enter it. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
+password, for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+If the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user already exists).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
       ```suggestion
   .. code-block:: bash
   
     docker run -it -p 8080:8080 \
       -e _AIRFLOW_DB_UPGRADE=true \
       -e _AIRFLOW_WWW_USER_CREATE=true \
       -e _AIRFLOW_WWW_USER_PASSWORD=admin \
         apache/airflow:master-python3.8 webserver
   
   
   .. code-block:: bash
   
     docker run -it -p 8080:8080 \
       -e _AIRFLOW_DB_UPGRADE=true \
       -e _AIRFLOW_WWW_USER_CREATE=true \
       -e _AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin \
         apache/airflow:master-python3.8 webserver
   ```
   
   From the [docs](https://docs.docker.com/engine/reference/commandline/run/#set-environment-variables--e---env---env-file) I would say we remove the `"`. For example I had some issues - not in docker `-e` but in `--env-file` using `"`.
   
   WDYT?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559757155



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+The script detects the backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until a connection to the host/port specified can be established.
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks.
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform

Review comment:
       ```suggestion
   If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will run
   ```
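
    As a side note on the quoted "Waits for Airflow DB connection" section: the retry behaviour is tunable via the two variables it names. A minimal illustrative sketch - the Postgres URL, hostname and values are assumptions, not defaults:

    ```bash
    # wait up to 60 checks, 2s apart, for the (assumed) postgres host to come up
    docker run -it -p 8080:8080 \
        -e AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgres://airflow:airflow@postgres:5432/airflow" \
        -e CONNECTION_CHECK_MAX_COUNT="60" \
        -e CONNECTION_CHECK_SLEEP_TIME="2" \
          apache/airflow:master-python3.8 webserver
    ```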




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559418778



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with the internal sqlite database (default) to upgrade the db and create
+the admin user at the entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purposes; never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create a www admin user automatically when you enter it. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
+password, for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+If the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user already exists).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
    I think this is an issue specific to the .env file; the quotes when you pass environment variables via the -e switch are not necessary when you have no spaces (but they are always removed). They are absolutely necessary though when you have spaces (otherwise the shell treats what follows as another parameter). This is a "shell" expansion standard, not an .env file standard, and quoting variables there is really good practice.
   
   For example: 
   
   `-e _AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin`
   
    will set the `_AIRFLOW_WWW_USER_PASSWORD_CMD` variable to `echo` and docker will try to run the "admin" image.
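
    A quick self-contained way to see that splitting (no Docker involved; `VAR` is just a stand-in name):

    ```bash
    # simulate the argument vector the shell builds in each case
    set -- -e VAR=echo admin             # unquoted: splits into three words
    printf '<%s> ' "$@"; echo            # <-e> <VAR=echo> <admin> -- "admin" becomes a separate argument

    set -- -e "VAR=echo admin"           # quoted: the value stays a single word
    printf '<%s> ' "$@"; echo            # <-e> <VAR=echo admin>
    ```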
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559883674



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+The script detects the backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until a connection to the host/port specified can be established.
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks.
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with the internal SQLite database (default) to upgrade the db and create
+the admin user at the entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purposes; never use SQLite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create a www admin user automatically when you enter it. You need to set

Review comment:
       I changed it to "webserver user" (to distinguish it from 'system user')




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559756162



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -18,62 +18,70 @@
 Production Deployment
 ^^^^^^^^^^^^^^^^^^^^^
 
-It is time to deploy your DAG in production. To do this, first, you need to make sure that the Airflow is itself production-ready.
-Let's see what precautions you need to take.
+It is time to deploy your DAG in production. To do this, first, you need to make sure that the Airflow
+is itself production-ready. Let's see what precautions you need to take.
 
 Database backend
 ================
 
-Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external database.
-However, such a setup is meant to be used for testing purposes only; running the default setup in production can lead to data loss in multiple scenarios.
-If you want to run production-grade Airflow, make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database such as PostgreSQL or MySQL.
+Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external
+database. However, such a setup is meant to be used for testing purposes only; running the default setup
+in production can lead to data loss in multiple scenarios. If you want to run production-grade Airflow,
+make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database
+such as PostgreSQL or MySQL.
 
 You can change the backend using the following config
 
 .. code-block:: ini
 
- [core]
- sql_alchemy_conn = my_conn_string
+    [core]
+    sql_alchemy_conn = my_conn_string
 
 Once you have changed the backend, airflow needs to create all the tables required for operation.
 Create an empty DB and give airflow's user the permission to ``CREATE/ALTER`` it.
 Once that is done, you can run -
 
 .. code-block:: bash
 
- airflow db upgrade
+    airflow db upgrade
 
 ``upgrade`` keeps track of migrations already applied, so it's safe to run as often as you need.
 
 .. note::
 
- Do not use ``airflow db init`` as it can create a lot of default connections, charts, etc. which are not required in production DB.
+    Do not use ``airflow db init`` as it can create a lot of default connections, charts, etc. which are not
+    required in production DB.
 
 
 Multi-Node Cluster
 ==================
 
-Airflow uses :class:`airflow.executors.sequential_executor.SequentialExecutor` by default. However, by its nature, the user is limited to executing at most
-one task at a time. ``Sequential Executor`` also pauses the scheduler when it runs a task, hence not recommended in a production setup.
-You should use the :class:`Local executor <airflow.executors.local_executor.LocalExecutor>` for a single machine.
-For a multi-node setup, you should use the :doc:`Kubernetes executor <../executor/kubernetes>` or the :doc:`Celery executor <../executor/celery>`.
+Airflow uses :class:`airflow.executors.sequential_executor.SequentialExecutor` by default. However, by it

Review comment:
       ```suggestion
   Airflow uses :class:`~airflow.executors.sequential_executor.SequentialExecutor` by default. However, by it
   ```
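
    (For context: the leading ``~`` makes Sphinx render only the last component of the target, so the link text becomes just ``SequentialExecutor`` instead of the full dotted path.)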




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559460216



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -109,51 +109,116 @@ function verify_db_connection {
     fi
 }
 
-if ! whoami &> /dev/null; then
-  if [[ -w /etc/passwd ]]; then
-    echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
-        >> /etc/passwd
-  fi
-  export HOME="${AIRFLOW_USER_HOME_DIR}"
-fi
-
 # Warning: command environment variables (*_CMD) have priority over usual configuration variables
 # for configuration parameters that require sensitive information. This is the case for the SQL database
 # and the broker backend in this entrypoint script.
 
 
-if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
-    verify_db_connection "$(eval "$AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD")"
-else
-    # if no DB configured - use sqlite db by default
-    AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
-    verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
-fi
+function create_www_user() {
+    local local_password=""
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!
+        echo
+        exit 1
+    fi
 
+    airflow users create \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" ||
+    airflow create_user \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" || true
+}
 
-# The Bash and python commands still should verify the basic connections so they are run after the
-# DB check but before the broker check
-if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
-   shift
-   exec "/bin/bash" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
-   shift
-   exec "python" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
-   AIRFLOW_COMMAND="${2}"
-   shift
-fi
+function create_system_user_if_missing() {
+    # This is needed in case of OpenShift-compatible container execution. In case of OpenShift a random
+    # user id is used when starting the image, however group 0 is kept as the user group. Our production
+    # image is OpenShift compatible, so all permissions on all folders are set so that the 0 group can exercise
+    # the same privileges as the default "airflow" user. This code checks if the user is already
+    # present in /etc/passwd and will create the system user dynamically, including setting its
+    # HOME directory to /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
+    # installed can be automatically added to PYTHONPATH
+    if ! whoami &> /dev/null; then
+      if [[ -w /etc/passwd ]]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
+            >> /etc/passwd
+      fi
+      export HOME="${AIRFLOW_USER_HOME_DIR}"
+    fi
+}
 
-# Note: the broker backend configuration concerns only a subset of Airflow components
-if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
-    if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
-        verify_db_connection "$(eval "$AIRFLOW__CELERY__BROKER_URL_CMD")"
+function verify_connections() {

Review comment:
       Does this function name describe its contents well? It seems to me that the singular should be used.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559376800



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform

Review comment:
    The main reason is that it is a "behaviour" change, and it is not "AIRFLOW" build-time behaviour but container "execution" behaviour. All the other AIRFLOW_* variables impact the way the image is built, but these have no impact on the image - they change how the container behaves. So I wanted to have them clearly separated so that they are not easily confused.
   
   Naming is the most difficult part (as usual) - so happy to change it (better idea?) or add some explanation why they are named like that.
   
   WDYT @feluelle ?
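
    To make the split concrete, both knobs side by side (illustrative commands; `AIRFLOW_VERSION` here stands in for any of the build args the Dockerfile exposes):

    ```bash
    # build-time knob (AIRFLOW* prefix): changes what gets baked into the image
    docker build . \
        --build-arg AIRFLOW_VERSION="2.0.0" \
        --tag my-airflow:test

    # run-time knob (_AIRFLOW prefix): only changes what the entrypoint does on start
    docker run -it \
        -e _AIRFLOW_DB_UPGRADE="true" \
          my-airflow:test webserver
    ```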




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559512400



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -109,51 +109,116 @@ function verify_db_connection {
     fi
 }
 
-if ! whoami &> /dev/null; then
-  if [[ -w /etc/passwd ]]; then
-    echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
-        >> /etc/passwd
-  fi
-  export HOME="${AIRFLOW_USER_HOME_DIR}"
-fi
-
 # Warning: command environment variables (*_CMD) have priority over usual configuration variables
 # for configuration parameters that require sensitive information. This is the case for the SQL database
 # and the broker backend in this entrypoint script.
 
 
-if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
-    verify_db_connection "$(eval "$AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD")"
-else
-    # if no DB configured - use sqlite db by default
-    AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
-    verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
-fi
+function create_www_user() {
+    local local_password=""
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!
+        echo
+        exit 1
+    fi
 
+    airflow users create \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" ||
+    airflow create_user \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" || true
+}
 
-# The Bash and python commands still should verify the basic connections so they are run after the
-# DB check but before the broker check
-if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
-   shift
-   exec "/bin/bash" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
-   shift
-   exec "python" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
-   AIRFLOW_COMMAND="${2}"
-   shift
-fi
+function create_system_user_if_missing() {
+    # This is needed in case of OpenShift-compatible container execution. In case of OpenShift a random
+    # user id is used when starting the image, however group 0 is kept as the user group. Our production
+    # image is OpenShift compatible, so all permissions on all folders are set so that the 0 group can exercise
+    # the same privileges as the default "airflow" user. This code checks if the user is already
+    # present in /etc/passwd and will create the system user dynamically, including setting its
+    # HOME directory to /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
+    # installed can be automatically added to PYTHONPATH
+    if ! whoami &> /dev/null; then
+      if [[ -w /etc/passwd ]]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
+            >> /etc/passwd
+      fi
+      export HOME="${AIRFLOW_USER_HOME_DIR}"
+    fi
+}
 
-# Note: the broker backend configuration concerns only a subset of Airflow components
-if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
-    if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
-        verify_db_connection "$(eval "$AIRFLOW__CELERY__BROKER_URL_CMD")"
+function verify_connections() {
+    if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
+        verify_db_connection "$(eval "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD}")"
     else
-        AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
-        if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
-            verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"
+        # if no DB configured - use sqlite db by default
+        AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
+        verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
+    fi
+
+}
+
+function upgrade_db() {
+    airflow db upgrade || airflow upgradedb || true
+}
+
+function verify_celery_connection_if_needed() {
+    # Note: the broker backend configuration concerns only a subset of Airflow components
+    if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
+        if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
+            verify_db_connection "$(eval "${AIRFLOW__CELERY__BROKER_URL_CMD}")"
+        else
+            AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
+            if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
+                verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"

Review comment:
       Good point. I reviewed the whole code and refactored it slightly to get the complete picture and to add the missing port numbers. I also added a bit more comprehensive documentation on this behaviour, and used the opportunity to wrap the documentation in `production-deployment.rst` where I could, to make it more readable when you look at the sources.
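
    Incidentally, the arbitrary-uid path handled by `create_system_user_if_missing` can be exercised locally with something like the following (the uid is deliberately random, as OpenShift would assign one):

    ```bash
    # run with a random uid and group 0; the entrypoint should register the user
    # in /etc/passwd and point HOME at the airflow user's home directory
    docker run --rm --user 56789:0 apache/airflow:master-python3.8 \
        bash -c 'whoami && echo "HOME=${HOME}"'
    ```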




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559421369



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with the internal sqlite database (default) to upgrade the db and create
+the admin user at the entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purposes; never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create a www admin user automatically when you enter it. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
+password, for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+If the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user already exists).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
       This is a rather widely used practice.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559384124



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform

Review comment:
       SGTM 👍 Makes sense. Thanks for the explanation.
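
    For deployments without init containers (docker-compose and similar, as the PR description mentions), a hedged sketch of a one-off init run wiring these variables together - the DB URL, secret path and mounted password file are assumptions:

    ```bash
    echo "admin" > admin_password.txt

    # run the entrypoint once so it upgrades the db and creates the user,
    # then exits after printing the version
    docker run --rm \
        -v "$PWD/admin_password.txt:/run/secrets/airflow_admin_password:ro" \
        -e AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgres://airflow:airflow@postgres:5432/airflow" \
        -e _AIRFLOW_DB_UPGRADE="true" \
        -e _AIRFLOW_WWW_USER_CREATE="true" \
        -e _AIRFLOW_WWW_USER_PASSWORD_CMD="cat /run/secrets/airflow_admin_password" \
          apache/airflow:master-python3.8 version
    ```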




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559457236



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This always happens when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with the internal sqlite database (default) to upgrade the db and create
+the admin user at the entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purposes; never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create a www admin user automatically when you enter it. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
+password, for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+If the password is specified, the entrypoint will attempt to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user already exists).
+
+You can, for example, start the webserver in the production image, initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user, with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
+      apache/airflow:master-python3.8 webserver
+
+The commands above perform initialization of the sqlite database, create admin user with admin password

Review comment:
       ```suggestion
    The commands above perform initialization of the SQLite database, create admin user with admin password
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559758211



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make its home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+The script detects the backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until a connection to the host/port specified can be established.
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks.
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set the ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with the internal SQLite database (default) to upgrade the db and create
+the admin user at the entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purposes; never use SQLite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create a www admin user automatically when you enter it. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable will be evaluated as a shell command and its output will be set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default
+password, for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |

Review comment:
       ```suggestion
   | username  | admin                    | ``_AIRFLOW_WWW_USER_USERNAME``                                   |
   +-----------+--------------------------+--------------------------------------------------------------+
   | password  |                          | ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or ``_AIRFLOW_WWW_USER_PASSWORD`` |
   +-----------+--------------------------+--------------------------------------------------------------+
   | firstname | Airflow                  | ``_AIRFLOW_WWW_USER_FIRSTNAME``                                  |
   +-----------+--------------------------+--------------------------------------------------------------+
   | lastname  | Admin                    | ``_AIRFLOW_WWW_USER_LASTNAME``                                   |
   +-----------+--------------------------+--------------------------------------------------------------+
   | email     | airflowadmin@example.com | ``_AIRFLOW_WWW_USER_EMAIL``                                      |
   +-----------+--------------------------+--------------------------------------------------------------+
   | role      | Admin                    | ``_AIRFLOW_WWW_USER_ROLE``                                      |
   ```
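
    For context when reading this table in the archive: a minimal, illustrative
    invocation exercising these variables (values are examples only; the image
    tag and the ``_AIRFLOW_DB_UPGRADE``/``_AIRFLOW_WWW_USER_CREATE`` pattern are
    taken from the docker run examples quoted later in this thread):

    ```bash
    docker run -it -p 8080:8080 \
      -e _AIRFLOW_DB_UPGRADE="true" \
      -e _AIRFLOW_WWW_USER_CREATE="true" \
      -e _AIRFLOW_WWW_USER_USERNAME="admin" \
      -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
      -e _AIRFLOW_WWW_USER_ROLE="Admin" \
        apache/airflow:master-python3.8 webserver
    ```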




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559418778



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at lest password to create such user via ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD`` similarly like for other ``*_CMD`` variables, the content of
+the ``*_CMD`` will be evaluated as shell command and it's output will be set ass password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+In case the password is specified, the user will be attempted to be created, but the entrypoint will
+not fail if the attempt fails (this accounts for the case that the user is already created).
+
+You can, for example start the webserver in the production image with initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
    I think this issue is specific to the .env file. The quotes when you pass environment variables via the -e switch are not needed when you have no spaces, but they are absolutely necessary when you have spaces (otherwise the shell treats them as another parameter). This is shell expansion behaviour, not an .env file standard, and quoting variables there is really good practice.
   
   For example: 
   
   `-e _AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin`
   
    will set the `_AIRFLOW_WWW_USER_PASSWORD_CMD` variable to `echo`, and Docker will then try to run the "admin" image.
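
    A minimal sketch of the word splitting (hypothetical commands, shown only to illustrate the quoting):

    ```bash
    # Unquoted: the shell splits on the space, so docker receives the assignment
    # "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo" and then "admin" as the image name.
    docker run -e _AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin

    # Quoted: the whole assignment is passed to docker as a single argument.
    docker run -e "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
        apache/airflow:master-python3.8 webserver
    ```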
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r560142305



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -39,121 +39,216 @@ function run_nc() {
     nc -zvvn "${ip}" "${port}"
 }
 
-function verify_db_connection {
-    DB_URL="${1}"
+function wait_for_connection {
+    # Waits for Connection to the backend specified via URL passed as first parameter
+    # Detects backend type depending on the URL schema and assigns
+    # default port numbers if not specified in the URL.
+    # Then it loops until connection to the host/port specified can be established
+    # It tries `CONNECTION_CHECK_MAX_COUNT` times and sleeps `CONNECTION_CHECK_SLEEP_TIME` between checks
+    local connection_url
+    connection_url="${1}"
 
-    DB_CHECK_MAX_COUNT=${MAX_DB_CHECK_COUNT:=20}
-    DB_CHECK_SLEEP_TIME=${DB_CHECK_SLEEP_TIME:=3}
+    local detected_backend=""
+    local detected_host=""
+    local detected_port=""
 
-    local DETECTED_DB_BACKEND=""
-    local DETECTED_DB_HOST=""
-    local DETECTED_DB_PORT=""
 
-
-    if [[ ${DB_URL} != sqlite* ]]; then
+    if [[ ${connection_url} != sqlite* ]]; then
         # Auto-detect DB parameters
-        [[ ${DB_URL} =~ ([^:]*)://([^:]*[@.*]?):([^@]*)@?([^/:]*):?([0-9]*)/([^\?]*)\??(.*) ]] && \
-            DETECTED_DB_BACKEND=${BASH_REMATCH[1]} &&
+        [[ ${connection_url} =~ ([^:]*)://([^:]*[@.*]?):([^@]*)@?([^/:]*):?([0-9]*)/([^\?]*)\??(.*) ]] && \
+            detected_backend=${BASH_REMATCH[1]} &&
             # Not used USER match
             # Not used PASSWORD match
-            DETECTED_DB_HOST=${BASH_REMATCH[4]} &&
-            DETECTED_DB_PORT=${BASH_REMATCH[5]} &&
+            detected_host=${BASH_REMATCH[4]} &&
+            detected_port=${BASH_REMATCH[5]} &&
             # Not used SCHEMA match
             # Not used PARAMS match
 
-        echo DB_BACKEND="${DB_BACKEND:=${DETECTED_DB_BACKEND}}"
-
-        if [[ -z "${DETECTED_DB_PORT=}" ]]; then
-            if [[ ${DB_BACKEND} == "postgres"* ]]; then
-                DETECTED_DB_PORT=5432
-            elif [[ ${DB_BACKEND} == "mysql"* ]]; then
-                DETECTED_DB_PORT=3306
+        echo BACKEND="${BACKEND:=${detected_backend}}"
+        readonly BACKEND
+
+        if [[ -z "${detected_port=}" ]]; then
+            if [[ ${BACKEND} == "postgres"* ]]; then
+                detected_port=5432
+            elif [[ ${BACKEND} == "mysql"* ]]; then
+                detected_port=3306
+            elif [[ ${BACKEND} == "redis"* ]]; then
+                detected_port=6379
+            elif [[ ${BACKEND} == "amqp"* ]]; then
+                detected_port=5672
             fi
         fi
 
-        DETECTED_DB_HOST=${DETECTED_DB_HOST:="localhost"}
+        detected_host=${detected_host:="localhost"}
 
         # Allow the DB parameters to be overridden by environment variable
-        echo DB_HOST="${DB_HOST:=${DETECTED_DB_HOST}}"
-        echo DB_PORT="${DB_PORT:=${DETECTED_DB_PORT}}"
+        echo DB_HOST="${DB_HOST:=${detected_host}}"
+        readonly DB_HOST
 
+        echo DB_PORT="${DB_PORT:=${detected_port}}"
+        readonly DB_PORT
+        local countdown
+        countdown="${CONNECTION_CHECK_MAX_COUNT}"
         while true
         do
             set +e
-            LAST_CHECK_RESULT=$(run_nc "${DB_HOST}" "${DB_PORT}" >/dev/null 2>&1)
-            RES=$?
+            local last_check_result
+            local res
+            last_check_result=$(run_nc "${DB_HOST}" "${DB_PORT}" >/dev/null 2>&1)
+            res=$?
             set -e
-            if [[ ${RES} == 0 ]]; then
+            if [[ ${res} == 0 ]]; then
                 echo
                 break
             else
                 echo -n "."
-                DB_CHECK_MAX_COUNT=$((DB_CHECK_MAX_COUNT-1))
+                countdown=$((countdown-1))
             fi
-            if [[ ${DB_CHECK_MAX_COUNT} == 0 ]]; then
+            if [[ ${countdown} == 0 ]]; then
                 echo
-                echo "ERROR! Maximum number of retries (${DB_CHECK_MAX_COUNT}) reached while checking ${DB_BACKEND} db. Exiting"
+                echo "ERROR! Maximum number of retries (${CONNECTION_CHECK_MAX_COUNT}) reached."
+                echo "       while checking ${BACKEND} connection."
                 echo
-                break
+                echo "Last check result:"
+                echo
+                echo "${last_check_result}"
+                echo
+                exit 1
             else
-                sleep "${DB_CHECK_SLEEP_TIME}"
+                sleep "${CONNECTION_CHECK_SLEEP_TIME}"
             fi
         done
-        if [[ ${RES} != 0 ]]; then
-            echo "        ERROR: ${DB_URL} db could not be reached!"
-            echo
-            echo "${LAST_CHECK_RESULT}"
-            echo
-            export EXIT_CODE=${RES}
+    fi
+}
+
+function create_www_user() {
+    local local_password=""
+    # Warning: command environment variables (*_CMD) have priority over usual configuration variables
+    # for configuration parameters that require sensitive information. This is the case for the SQL database
+    # and the broker backend in this entrypoint script.
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!

Review comment:
       ```suggestion
           echo "ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!"
   ```
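
    As an aside for readers of this hunk: `CONNECTION_CHECK_MAX_COUNT` and
    `CONNECTION_CHECK_SLEEP_TIME` are read as plain environment variables in the
    loop above, so (assuming the entrypoint picks them up from the container
    environment, as the code suggests) the wait behaviour can be tuned at run
    time, for example:

    ```bash
    # Hypothetical example: allow up to 60 checks, 5 seconds apart, for the DB.
    docker run \
      -e CONNECTION_CHECK_MAX_COUNT="60" \
      -e CONNECTION_CHECK_SLEEP_TIME="5" \
      -e AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgres://airflow:airflow@postgres:5432/airflow" \
        apache/airflow:master-python3.8 webserver
    ```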




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#issuecomment-761861009


   [The Workflow run](https://github.com/apache/airflow/actions/runs/492035400) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559457711



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at lest password to create such user via ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD`` similarly like for other ``*_CMD`` variables, the content of
+the ``*_CMD`` will be evaluated as shell command and it's output will be set ass password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+In case the password is specified, the user will be attempted to be created, but the entrypoint will
+not fail if the attempt fails (this accounts for the case that the user is already created).
+
+You can, for example start the webserver in the production image with initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    --env "_AIRFLOW_DB_UPGRADE=true" \
+    --env "_AIRFLOW_WWW_USER_CREATE=true" \
+    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=echo admin" \
+      apache/airflow:master-python3.8 webserver
+
+The commands above perform initialization of the sqlite database, create admin user with admin password
+and Admin role. They also forward local port 8080 to the webserver port and finally start the webserver.
+
+
+Verify celery DB connection
+...........................
+
+In case Postgres or MySQL DB is used, and one of the 'scheduler", "celery", "worker", or "flower" commands

Review comment:
       ```suggestion
   In case Postgres or MySQL DB is used, and one of the ``scheduler``, ``celery``, ``worker``, or ``flower`` subcommands
   ```
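
    To make the subcommand condition concrete, a hedged sketch (commands are
    illustrative; the redis default port comes from the entrypoint changes
    quoted elsewhere in this thread):

    ```bash
    # "scheduler" matches ^(scheduler|celery|worker|flower)$, so the entrypoint
    # waits for the broker (here redis, default port 6379) before starting.
    docker run -e AIRFLOW__CELERY__BROKER_URL="redis://redis:6379/0" \
        apache/airflow:master-python3.8 scheduler

    # "webserver" does not match, so no broker wait is performed.
    docker run apache/airflow:master-python3.8 webserver
    ```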




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559423779



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at lest password to create such user via ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD`` similarly like for other ``*_CMD`` variables, the content of
+the ``*_CMD`` will be evaluated as shell command and it's output will be set ass password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+In case the password is specified, the user will be attempted to be created, but the entrypoint will
+not fail if the attempt fails (this accounts for the case that the user is already created).
+
+You can, for example start the webserver in the production image with initializing the internal sqlite
+database and creating an ``admin/admin`` Admin user with the following command:
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD="admin" \
+      apache/airflow:master-python3.8 webserver
+
+
+.. code-block:: bash
+
+  docker run -it -p 8080:8080 \
+    -e _AIRFLOW_DB_UPGRADE="true" \
+    -e _AIRFLOW_WWW_USER_CREATE="true" \
+    -e _AIRFLOW_WWW_USER_PASSWORD_CMD="echo admin" \
+      apache/airflow:master-python3.8 webserver

Review comment:
    Updated to add quotes around the whole "VAR=value".




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559455897



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create

Review comment:
       ```suggestion
   when you are running airflow with internal SQLite database (default) to upgrade the db and create
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559883185



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -18,62 +18,70 @@
 Production Deployment
 ^^^^^^^^^^^^^^^^^^^^^
 
-It is time to deploy your DAG in production. To do this, first, you need to make sure that the Airflow is itself production-ready.
-Let's see what precautions you need to take.
+It is time to deploy your DAG in production. To do this, first, you need to make sure that the Airflow
+is itself production-ready. Let's see what precautions you need to take.
 
 Database backend
 ================
 
-Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external database.
-However, such a setup is meant to be used for testing purposes only; running the default setup in production can lead to data loss in multiple scenarios.
-If you want to run production-grade Airflow, make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database such as PostgreSQL or MySQL.
+Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external
+database. However, such a setup is meant to be used for testing purposes only; running the default setup
+in production can lead to data loss in multiple scenarios. If you want to run production-grade Airflow,
+make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database
+such as PostgreSQL or MySQL.
 
 You can change the backend using the following config
 
 .. code-block:: ini
 
- [core]
- sql_alchemy_conn = my_conn_string
+    [core]
+    sql_alchemy_conn = my_conn_string
 
 Once you have changed the backend, airflow needs to create all the tables required for operation.
 Create an empty DB and give airflow's user the permission to ``CREATE/ALTER`` it.
 Once that is done, you can run -
 
 .. code-block:: bash
 
- airflow db upgrade
+    airflow db upgrade
 
 ``upgrade`` keeps track of migrations already applied, so it's safe to run as often as you need.
 
 .. note::
 
- Do not use ``airflow db init`` as it can create a lot of default connections, charts, etc. which are not required in production DB.
+    Do not use ``airflow db init`` as it can create a lot of default connections, charts, etc. which are not
+    required in production DB.
 
 
 Multi-Node Cluster
 ==================
 
-Airflow uses :class:`airflow.executors.sequential_executor.SequentialExecutor` by default. However, by its nature, the user is limited to executing at most
-one task at a time. ``Sequential Executor`` also pauses the scheduler when it runs a task, hence not recommended in a production setup.
-You should use the :class:`Local executor <airflow.executors.local_executor.LocalExecutor>` for a single machine.
-For a multi-node setup, you should use the :doc:`Kubernetes executor <../executor/kubernetes>` or the :doc:`Celery executor <../executor/celery>`.
+Airflow uses :class:`airflow.executors.sequential_executor.SequentialExecutor` by default. However, by it

Review comment:
       Good catch!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559757655



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+The script detects backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until connection to the host/port specified can be established
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal SQLite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purpose, never use SQLite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.

Review comment:
       ```suggestion
   ``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
   production, it is only useful if you would like to run a quick test with the production image.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559757725



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +762,130 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to ``/home/airflow``.
+You can read more about it in the "Support arbitrary user ids" chapter in the
+`Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_.
+
+Waits for Airflow DB connection
+...............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+The script detects backend type depending on the URL schema and assigns default port numbers if not specified
+in the URL. Then it loops until connection to the host/port specified can be established
+It tries ``CONNECTION_CHECK_MAX_COUNT`` times and sleeps ``CONNECTION_CHECK_SLEEP_TIME`` between checks
+
+Supported schemes:
+
+* ``postgres://`` - default port 5432
+* ``mysql://``    - default port 3306
+* ``sqlite://``
+
+In case of SQLite backend, there is no connection to establish and waiting is skipped.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal SQLite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using SQLite is
+intended only for testing purpose, never use SQLite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create www admin user automatically when you enter it. you need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production, it is only useful if you run a quick test with the production image.
+You need to pass at lest password to create such user via ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or

Review comment:
       ```suggestion
   You need to pass at least password to create such a user via ``_AIRFLOW_WWW_USER_PASSWORD_CMD`` or
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559460767



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -109,51 +109,116 @@ function verify_db_connection {
     fi
 }
 
-if ! whoami &> /dev/null; then
-  if [[ -w /etc/passwd ]]; then
-    echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
-        >> /etc/passwd
-  fi
-  export HOME="${AIRFLOW_USER_HOME_DIR}"
-fi
-
 # Warning: command environment variables (*_CMD) have priority over usual configuration variables
 # for configuration parameters that require sensitive information. This is the case for the SQL database
 # and the broker backend in this entrypoint script.
 
 
-if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
-    verify_db_connection "$(eval "$AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD")"
-else
-    # if no DB configured - use sqlite db by default
-    AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
-    verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
-fi
+function create_www_user() {
+    local local_password=""
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!
+        echo
+        exit 1
+    fi
 
+    airflow users create \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" ||
+    airflow create_user \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" || true
+}
 
-# The Bash and python commands still should verify the basic connections so they are run after the
-# DB check but before the broker check
-if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
-   shift
-   exec "/bin/bash" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
-   shift
-   exec "python" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
-   AIRFLOW_COMMAND="${2}"
-   shift
-fi
+function create_system_user_if_missing() {
+    # This is needed in case of OpenShift-compatible container execution. In case of OpenShift random
+    # User id is used when starting the image, however group 0 is kept as the user group. Our production
+    # Image is OpenShift compatible, so all permissions on all folders are set so that 0 group can exercise
+    # the same privileges as the default "airflow" user, this code checks if the user is already
+    # present in /etc/passwd and will create the system user dynamically, including setting its
+    # HOME directory to the /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
+    # Installed can be automatically added to PYTHONPATH
+    if ! whoami &> /dev/null; then
+      if [[ -w /etc/passwd ]]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
+            >> /etc/passwd
+      fi
+      export HOME="${AIRFLOW_USER_HOME_DIR}"
+    fi
+}
 
-# Note: the broker backend configuration concerns only a subset of Airflow components
-if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
-    if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
-        verify_db_connection "$(eval "$AIRFLOW__CELERY__BROKER_URL_CMD")"
+function verify_connections() {
+    if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
+        verify_db_connection "$(eval "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD}")"
     else
-        AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
-        if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
-            verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"
+        # if no DB configured - use sqlite db by default
+        AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
+        verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
+    fi
+
+}
+
+function upgrade_db() {
+    airflow db upgrade || airflow upgradedb || true
+}
+
+function verify_celery_connection_if_needed() {
+    # Note: the broker backend configuration concerns only a subset of Airflow components
+    if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
+        if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
+            verify_db_connection "$(eval "${AIRFLOW__CELERY__BROKER_URL_CMD}")"
+        else
+            AIRFLOW__CELERY__BROKER_URL=${AIRFLOW__CELERY__BROKER_URL:=}
+            if [[ -n ${AIRFLOW__CELERY__BROKER_URL=} ]]; then
+                verify_db_connection "${AIRFLOW__CELERY__BROKER_URL}"
+            fi
         fi
     fi
+}
+
+function execute_bash_or_python_command_if_specified() {
+    # The Bash and python commands still should verify the basic connections so they are run after the

Review comment:
       ```suggestion
       # The Bash and Python commands still should verify the basic connections so they are run after the
   ```
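
    A small, illustrative way to observe the arbitrary-UID handling described in
    `create_system_user_if_missing` (the UID is arbitrary; exact output depends
    on the image build, and the `bash` subcommand pass-through is taken from the
    quoted entrypoint):

    ```bash
    # Start the image the way OpenShift would: a random UID with group 0.
    # The entrypoint should add a passwd entry for the user and point HOME
    # at ${AIRFLOW_USER_HOME_DIR} (/home/airflow).
    docker run --user 12345:0 apache/airflow:master-python3.8 \
        bash -c 'id; echo "HOME=${HOME}"'
    ```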




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559233005



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directoy point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+ in the "Support arbitrary user ids" chapter.

Review comment:
       ```suggestion
   in the "Support arbitrary user ids" chapter.
   ```




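The "arbitrary user ids" support described in the quoted chapter can be smoke-tested by starting the image with a random, non-existing UID and group 0; a hedged example (the image tag is an assumption):

```bash
# Run the production image as an arbitrary UID with group 0, as OpenShift does.
# The entrypoint should create a passwd entry on the fly and point HOME at
# /home/airflow; 'whoami' would fail without that entry.
docker run --rm --user 123456:0 apache/airflow:2.0.0 \
    bash -c 'whoami && echo "HOME=${HOME}"'
```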



[GitHub] [airflow] kaxil commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559758723



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -845,37 +983,64 @@ This concept is implemented in the development version of the Helm Chart that is
 Secured Server and Service Access on Google Cloud
 =================================================
 
-This section describes techniques and solutions for securely accessing servers and services when your Airflow environment is deployed on Google Cloud, or you connect to Google services, or you are connecting to the Google API.
+This section describes techniques and solutions for securely accessing servers and services when your Airflow
+environment is deployed on Google Cloud, or you connect to Google services, or you are connecting
+to the Google API.
 
 IAM and Service Accounts
 ------------------------
 
-You should do not rely on internal network segmentation or firewalling as our primary security mechanisms. To protect your organization's data, every request you make should contain sender identity. In the case of Google Cloud, the identity is provided by `the IAM and Service account <https://cloud.google.com/iam/docs/service-accounts>`__. Each Compute Engine instance has an associated service account identity. It provides cryptographic credentials that your workload can use to prove its identity when making calls to Google APIs or third-party services. Each instance has access only to short-lived credentials. If you use Google-managed service account keys, then the private key is always held in escrow and is never directly accessible.
+You should do not rely on internal network segmentation or firewalling as our primary security mechanisms.

Review comment:
    ```suggestion
    You should not rely on internal network segmentation or firewalling as our primary security mechanisms.
    ```




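To make the quoted point about short-lived credentials concrete: on Compute Engine the workload obtains tokens from the metadata server instead of reading a private key file. A minimal illustration using the standard GCE metadata endpoint:

```bash
# Fetch a short-lived OAuth2 access token for the instance's default service
# account; the private key stays in Google's escrow and never reaches the VM.
curl -s -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
```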



[GitHub] [airflow] potiuk commented on pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#issuecomment-761851318


   When adding this one, there was already too much logic, with functions mixed with inline code in the entrypoint, so I used the opportunity to follow Google's guidelines and extract the logic into functions that the entrypoint calls one by one - instead of inlining it. It looks much neater, and it is easy to understand what happens step by step. Correspondingly, I also updated the documentation to describe what happens in the prod image entrypoint, so that users are not surprised by the default behaviour and can still (if they really want) try out the image, start the webserver, or initialise the DB inside Airflow just by passing the right variables. Good for quick sanity checks.


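The structure described here - named functions plus a thin body that calls them in order - is the pattern the comment refers to; a minimal sketch with placeholder names (not the actual functions from the PR):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholder steps standing in for the real entrypoint logic.
function wait_for_db()       { echo "step 1: verify DB connection"; }
function maybe_upgrade_db()  { echo "step 2: optional 'airflow db upgrade'"; }
function maybe_create_user() { echo "step 3: optional admin user creation"; }

function main() {
    wait_for_db
    maybe_upgrade_db
    maybe_create_user
    exec airflow "${@}"  # hand over to the requested Airflow command
}

main "${@}"
```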



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559243704



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directoy point to /home/airflow.

Review comment:
    ```suggestion
    Airflow will automatically create such a user and make it's home directory point to /home/airflow.
    ```







[GitHub] [airflow] potiuk commented on pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#issuecomment-762569990


   All points addressed, @kaxil!





[GitHub] [airflow] feluelle commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559362160



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform

Review comment:
       @potiuk Can you tell me why you decided to prefix the airflow variables with an `_`? Are you just not sure whether this is the right way to do it? (i.e. are they not worth being that "public"? :D)
   
   Even if you omit the underscore, they are still distinct from the airflow (config) env variables. I would just like to know :)




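In practice the two variable families discussed above look like this (values are illustrative only):

```bash
# AIRFLOW__{SECTION}__{KEY} variables map to airflow.cfg configuration entries:
export AIRFLOW__CORE__LOAD_EXAMPLES="false"

# The underscore-prefixed _AIRFLOW_* variables do not configure Airflow itself;
# they only steer what the production image entrypoint does at container start.
export _AIRFLOW_DB_UPGRADE="true"
export _AIRFLOW_WWW_USER_CREATE="true"
```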



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559470667



##########
File path: scripts/in_container/prod/entrypoint_prod.sh
##########
@@ -109,51 +109,116 @@ function verify_db_connection {
     fi
 }
 
-if ! whoami &> /dev/null; then
-  if [[ -w /etc/passwd ]]; then
-    echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
-        >> /etc/passwd
-  fi
-  export HOME="${AIRFLOW_USER_HOME_DIR}"
-fi
-
 # Warning: command environment variables (*_CMD) have priority over usual configuration variables
 # for configuration parameters that require sensitive information. This is the case for the SQL database
 # and the broker backend in this entrypoint script.
 
 
-if [[ -n "${AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD=}" ]]; then
-    verify_db_connection "$(eval "$AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD")"
-else
-    # if no DB configured - use sqlite db by default
-    AIRFLOW__CORE__SQL_ALCHEMY_CONN="${AIRFLOW__CORE__SQL_ALCHEMY_CONN:="sqlite:///${AIRFLOW_HOME}/airflow.db"}"
-    verify_db_connection "${AIRFLOW__CORE__SQL_ALCHEMY_CONN}"
-fi
+function create_www_user() {
+    local local_password=""
+    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
+        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
+        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
+    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
+        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
+        unset _AIRFLOW_WWW_USER_PASSWORD
+    fi
+    if [[ -z ${local_password} ]]; then
+        echo
+        echo ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!
+        echo
+        exit 1
+    fi
 
+    airflow users create \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" ||
+    airflow create_user \
+       --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
+       --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
+       --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
+       --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
+       --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
+       --password "${local_password}" || true
+}
 
-# The Bash and python commands still should verify the basic connections so they are run after the
-# DB check but before the broker check
-if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
-   shift
-   exec "/bin/bash" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
-   shift
-   exec "python" "${@}"
-elif [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
-   AIRFLOW_COMMAND="${2}"
-   shift
-fi
+function create_system_user_if_missing() {
+    # This is needed in case of OpenShift-compatible container execution. In case of OpenShift random
+    # User id is used when starting the image, however group 0 is kept as the user group. Our production
+    # Image is OpenShift compatible, so all permissions on all folders are set so that 0 group can exercise
+    # the same privileges as the default "airflow" user, this code checks if the user is already
+    # present in /etc/passwd and will create the system user dynamically, including setting its
+    # HOME directory to the /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
+    # Installed can be automatically added to PYTHONPATH
+    if ! whoami &> /dev/null; then
+      if [[ -w /etc/passwd ]]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
+            >> /etc/passwd
+      fi
+      export HOME="${AIRFLOW_USER_HOME_DIR}"
+    fi
+}
 
-# Note: the broker backend configuration concerns only a subset of Airflow components
-if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery|worker|flower)$ ]]; then
-    if [[ -n "${AIRFLOW__CELERY__BROKER_URL_CMD=}" ]]; then
-        verify_db_connection "$(eval "$AIRFLOW__CELERY__BROKER_URL_CMD")"
+function verify_connections() {

Review comment:
       Right. I wanted to change those names but I forgot. Will do.




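The `*_CMD` pattern in `create_www_user` above means the admin password can come from a command's output rather than sit directly in the environment; a hedged example (the image tag and the secret path are assumptions):

```bash
# The entrypoint eval-uates _AIRFLOW_WWW_USER_PASSWORD_CMD and uses its output
# as the admin password, so only the file path - not the secret - is in the env.
docker run --rm \
    --env "_AIRFLOW_DB_UPGRADE=true" \
    --env "_AIRFLOW_WWW_USER_CREATE=true" \
    --env "_AIRFLOW_WWW_USER_PASSWORD_CMD=cat /run/secrets/admin_password" \
    -v "$(pwd)/admin_password:/run/secrets/admin_password:ro" \
    apache/airflow:2.0.0 webserver
```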



[GitHub] [airflow] potiuk commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559379840



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,102 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform

Review comment:
       Added explanation now, but I am open to any suggestions :)







[GitHub] [airflow] mik-laj commented on a change in pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#discussion_r559456790



##########
File path: docs/apache-airflow/production-deployment.rst
##########
@@ -749,6 +749,105 @@ additional apt dev and runtime dependencies.
     --build-arg ADDITIONAL_RUNTIME_APT_DEPS="default-jre-headless"
 
 
+Actions executed at image start
+-------------------------------
+
+If you are using the default entrypoint of the production image,
+there are a few actions that are automatically performed when the container starts.
+In some cases, you can pass environment variables to the image to trigger some of that behaviour.
+
+The variables that control the "execution" behaviour start with ``_AIRFLOW`` to distinguish them
+from the variables used to build the image starting with ``AIRFLOW``.
+
+Creating system user
+....................
+
+Airflow image is Open-Shift compatible, which means that you can start it with random user ID and group id 0.
+Airflow will automatically create such a user and make it's home directory point to /home/airflow.
+You can read more about it in `Openshift best practices <https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html#images-create-guide-openshift_create-images>`_
+in the "Support arbitrary user ids" chapter.
+
+Verify Airflow DB connection
+............................
+
+In case Postgres or MySQL DB is used, the entrypoint will wait until the airflow DB connection becomes
+available. This happens always when you use the default entrypoint.
+
+Upgrading Airflow DB
+....................
+
+If you set ``_AIRFLOW_DB_UPGRADE`` variable to a non-empty value, the entrypoint will perform
+the ``airflow db upgrade`` command right after verifying the connection. You can also use this
+when you are running airflow with internal sqlite database (default) to upgrade the db and create
+admin users at entrypoint, so that you can start the webserver immediately. Note - using sqlite is
+intended only for testing purpose, never use sqlite in production as it has severe limitations when it
+comes to concurrency.
+
+
+Creating admin user
+...................
+
+The entrypoint can also create the www admin user automatically when you enter it. You need to set
+``_AIRFLOW_WWW_USER_CREATE`` to a non-empty value in order to do that. This is not intended for
+production; it is only useful if you run a quick test with the production image.
+You need to pass at least a password to create such a user, via ``_AIRFLOW_WWW_USER_PASSWORD`` or
+``_AIRFLOW_WWW_USER_PASSWORD_CMD``. As with the other ``*_CMD`` variables, the content of
+the ``*_CMD`` variable is evaluated as a shell command and its output is set as the password.
+
+User creation will fail if none of the ``PASSWORD`` variables are set - there is no default for
+password for security reasons.
+
++-----------+--------------------------+--------------------------------------------------------------+
+| Parameter | Default                  | Environment variable                                         |
++===========+==========================+==============================================================+
+| username  | admin                    | _AIRFLOW_WWW_USER_USERNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| password  |                          | _AIRFLOW_WWW_USER_PASSWORD_CMD or _AIRFLOW_WWW_USER_PASSWORD |
++-----------+--------------------------+--------------------------------------------------------------+
+| firstname | Airflow                  | _AIRFLOW_WWW_USER_FIRSTNAME                                  |
++-----------+--------------------------+--------------------------------------------------------------+
+| lastname  | Admin                    | _AIRFLOW_WWW_USER_LASTNAME                                   |
++-----------+--------------------------+--------------------------------------------------------------+
+| email     | airflowadmin@example.com | _AIRFLOW_WWW_USER_EMAIL                                      |
++-----------+--------------------------+--------------------------------------------------------------+
+| role      | Admin                    | _AIRFLOW_WWW_USER_ROLE                                       |
++-----------+--------------------------+--------------------------------------------------------------+
+
+If the password is specified, the entrypoint attempts to create the user, but it will
+not fail if the attempt fails (this accounts for the case where the user already exists).
+
+You can, for example start the webserver in the production image with initializing the internal sqlite

Review comment:
    ```suggestion
    You can, for example start the webserver in the production image with initializing the internal SQLite
    ```




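The quick test described in the quoted paragraph can look like the following - for testing only, since SQLite is never suitable for production (the image tag is an assumption):

```bash
# Initialize the internal SQLite DB, create the default admin user, and start
# the webserver in one shot; the admin password must be set explicitly.
docker run --rm -p 8080:8080 \
    --env "_AIRFLOW_DB_UPGRADE=true" \
    --env "_AIRFLOW_WWW_USER_CREATE=true" \
    --env "_AIRFLOW_WWW_USER_PASSWORD=admin" \
    apache/airflow:2.0.0 webserver
```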



[GitHub] [airflow] potiuk commented on pull request #13728: Adds automated user creation in production image

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13728:
URL: https://github.com/apache/airflow/pull/13728#issuecomment-761947640


   Temporary failures - it's good to go.

