Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/14 00:55:58 UTC

[GitHub] [airflow] mik-laj opened a new pull request #13660: Add quick start for Airflow on Docker

mik-laj opened a new pull request #13660:
URL: https://github.com/apache/airflow/pull/13660


   Hello,
   
   This adds a docker-compose file containing a simple sample configuration that lets you quickly run Airflow in a Docker environment. It is part of https://github.com/apache/airflow/issues/8605#issuecomment-759469443, but much simpler, because novice users don't care about the details of their environment at all; they just need to get all components up and running quickly. In later steps, we can work on adding wizards that generate a custom docker-compose file, but I think that will require some dependent documentation, and it is not needed in this section of the documentation yet.
   
   Part of: https://github.com/apache/airflow/issues/8605
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320076


   Looks good but I added a few fixups: 
   
   1) Missing .dockerignore entries for logs/plugins/env/dags
   2) The DOCKER_COMPOSE_ARGS are not needed IMHO. `docker-compose run` will allocate a terminal as needed, and you can disable it with the -T flag. Also, interactive mode works out-of-the-box with `docker-compose run`.
   3) I removed 'airflow' from `docker-compose run` in airflow.sh. It is not needed, it does not work with the Airflow 2.0 image, and it does not allow running the two useful `python` and `bash` commands.
   4) Changed ${*} into "${@}" (that's the proper way of passing parameters that may contain spaces).
   5) I added examples with "python/bash" commands.
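
The quoting fix in item 4 can be illustrated with a small bash sketch (function names are illustrative, not from airflow.sh):

```shell
#!/usr/bin/env bash
# Sketch: why "${@}" is preferred over ${*} when forwarding arguments.

forward_star() {
  # Unquoted ${*} re-splits arguments on whitespace, so an argument
  # containing spaces is broken into several words.
  printf '%s\n' ${*}
}

forward_at() {
  # "${@}" expands to one quoted word per original argument,
  # preserving embedded spaces.
  printf '%s\n' "${@}"
}

forward_star "hello world"   # the single argument gets split into two lines
forward_at "hello world"     # the single argument stays one line
```

This is why a wrapper script like airflow.sh should forward its arguments with `"${@}"`.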





[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563308334



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in container use the native Linux filesystem user/group permissions, so you have to make sure the container and host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0

Review comment:
       I added extra checks for docs build: https://github.com/apache/airflow/pull/13660/commits/341c7813610b0ae21b159fc86c0d410d6f923f77







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270909



##########
File path: docs/apache-airflow/concepts.rst
##########
@@ -23,6 +23,38 @@ Concepts
 The Airflow platform is a tool for describing, executing, and monitoring
 workflows.
 
+.. _architecture:
+
+Basic Airflow architecture
+''''''''''''''''''''''''''
+
+Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
+excellent starting point for understanding the architecture of Apache Airflow.
+
+.. image:: img/arch-diag-basic.png
+
+
+There are a few components to note:
+
+* **Metadata Database**: Airflow uses a SQL database to store metadata about the data pipelines being run. In the
+  diagram above, this is represented as Postgres which is extremely popular with Airflow.
+  Alternate databases supported with Airflow include MySQL.
+
+* **Web Server** and **Scheduler**: The Airflow web server and Scheduler are separate processes run (in this case)
+  on the local machine and interact with the database mentioned above.
+
+* The **Executor** is shown separately above, since it is commonly discussed within Airflow and in the documentation, but
+  in reality it is NOT a separate process, but run within the Scheduler.
+
+* The **Worker(s)** are separate processes which also interact with the other components of the Airflow architecture and
+  the metadata repository.

Review comment:
       Ahh. And that's not my text. I only moved this paragraph from another place, but I will check it more carefully.







[GitHub] [airflow] mik-laj edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-762193621


   > With this it is quite easy to change the access to the container and the airflow user.
   
   I would like this example to be simple and use only one file. This is a quick start for beginners, so we shouldn't use overly complex syntax or require many steps. I will try to think of something that introduces basic parameterization without adding extra steps to the instructions.
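
One way to get basic parameterization while keeping a single file (a sketch of the idea, not necessarily what was merged; the variable name here is an assumption) is Compose's built-in variable substitution with defaults:

```yaml
# Sketch: a docker-compose.yaml fragment using variable substitution with a
# default value, so the file works unchanged out of the box but an optional
# .env file can override it (AIRFLOW_IMAGE_NAME is a hypothetical name).
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.0}
```

With a default after `:-`, beginners need no extra step, while advanced users can set the variable in ``.env``.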





[GitHub] [airflow] turbaszek commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270018



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in container use the native Linux filesystem user/group permissions, so you have to make sure the container and host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up

Review comment:
       For one customer, we pause all DAGs before starting the scheduler when running locally. This is a good idea if you want to run an Airflow instance locally with production DAGs, which often have schedules and may not work locally due to external dependencies.
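
For this particular setup, that behavior is already wired in through the common environment block of the quoted ``docker-compose.yaml``:

```yaml
# Fragment of the quoted docker-compose.yaml: newly created DAGs start in the
# paused state, so production DAGs copied into ./dags will not begin running
# on their own schedules until explicitly unpaused.
x-airflow-common:
  &airflow-common
  environment:
    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
```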







[GitHub] [airflow] potiuk edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320076


   Looks good but I added a few fixups: 
   
   1) Missing .dockerignore entries for logs/plugins/env/dags
   2) The DOCKER_COMPOSE_ARGS are not needed IMHO. `docker compose run` will allocate a terminal as needed, and you can disable it with the -T flag. Also, interactive mode works out-of-the-box with `docker-compose run`, so I removed them.
   3) I removed 'airflow' from `docker-compose run` in airflow.sh. It is not needed, it does not work with the Airflow 2.0 image, and it does not allow running the two useful `python` and `bash` commands.
   4) Changed ${*} into "${@}" (that's the proper way of passing parameters that may contain spaces).
   5) I added examples with "python/bash" commands.





[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r557039807



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags

Review comment:
       It would be great to mention in the documentation that you can place DAGs and plugin Python code, and see logs, in those subdirectories of the folder where the docker-compose file is. Those subdirectories should also be listed in .dockerignore in case someone runs a docker build from this folder.
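
A minimal sketch of such an ignore file, assuming the entry names from the earlier comment (logs/plugins/env/dags):

```shell
# Sketch: write the suggested .dockerignore entries into a scratch directory
# (entry names assumed from the review comment, not taken from the PR diff).
cd "$(mktemp -d)"
cat > .dockerignore <<'EOF'
dags/
logs/
plugins/
.env
EOF
```

With these entries, a `docker build` run from the compose folder would not copy local DAGs, logs, plugins, or the env file into the image build context.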







[GitHub] [airflow] potiuk edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320076


   Looks good but I added a few fixups: 
   
   1) Missing .dockerignore entries for logs/plugins/env/dags
   2) The DOCKER_COMPOSE_ARGS are not needed IMHO. `docker compose run` will allocate a terminal as needed, and you can disable it with the -T flag. Also, interactive mode works out-of-the-box with `docker-compose run`.
   3) I removed 'airflow' from `docker-compose run` in airflow.sh. It is not needed, it does not work with the Airflow 2.0 image, and it does not allow running the two useful `python` and `bash` commands.
   4) Changed ${*} into "${@}" (that's the proper way of passing parameters that may contain spaces).
   5) I added examples with "python/bash" commands.





[GitHub] [airflow] kaxil commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r564018402



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in container use the native Linux filesystem user/group permissions, so you have to make sure the container and host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like below.
+
+.. code-block:: text
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0.dev0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up
+
+In a second terminal you can check the condition of the containers and make sure that none of them are in an unhealthy state:
+
+.. code-block:: bash
+
+    $ docker ps
+    CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS                    PORTS                              NAMES
+    247ebe6cf87a   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes              8080/tcp                           compose_airflow-worker_1
+    ed9b09fc84b1   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes              8080/tcp                           compose_airflow-scheduler_1
+    65ac1da2c219   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    0.0.0.0:5555->5555/tcp, 8080/tcp   compose_flower_1
+    7cb1fb603a98   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    0.0.0.0:8080->8080/tcp             compose_airflow-webserver_1
+    74f3bbe506eb   postgres:13                       "docker-entrypoint.s…"   18 minutes ago   Up 17 minutes (healthy)   5432/tcp                           compose_postgres_1
+    0bd6576d23cb   redis:latest                      "docker-entrypoint.s…"   10 hours ago     Up 17 minutes (healthy)   0.0.0.0:6379->6379/tcp             compose_redis_1
+
+Once the cluster has started up, you can log in to the web interface and try to run some tasks. The webserver is available at ``http://localhost:8080``. The default account has the login ``airflow`` and the password ``airflow``.
+
+.. image:: /img/dags.png
+
+Accessing Command Line Interface
+================================
+
+You can also run :doc:`CLI commands </usage-cli>`, but you have to do it in one of the defined ``airflow-*`` services. For example, to run ``airflow info``, run the following command:
+
+.. code-block:: bash
+
+    docker-compose run airflow-worker airflow info
+
+If you are on Linux or Mac OS, you can make your work easier by downloading an optional wrapper script that allows you to run commands with a simpler syntax.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}airflow.sh'
+        chmod +x airflow.sh
+
+Now you can run commands more easily:
+
+.. code-block:: bash
+
+    ./airflow.sh info
+
+You can also use ``bash`` as a parameter to enter an interactive bash shell in the container, or ``python`` to enter
+a Python shell.
+
+.. code-block:: bash
+
+    ./airflow.sh bash
+
+.. code-block:: bash
+
+    ./airflow.sh python
+
+Cleaning up
+===========
+
+To stop and delete containers, delete volumes with database data, and remove downloaded images, run:
+
+.. code-block:: bash
+
+    docker-compose down --volumes --rmi all
+
+Notes
+=====
+
+By default, the Docker Compose file uses the latest Airflow image (`apache/airflow< <https://hub.docker.com/r/apache/airflow>`__). If you need, you can :ref:`customize and extend it <docker_image>`.

Review comment:
       ```suggestion
   By default, the Docker Compose file uses the latest Airflow image (`apache/airflow <https://hub.docker.com/r/apache/airflow>`__). If you need, you can :ref:`customize and extend it <docker_image>`.
   ```







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270268



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up

Review comment:
       https://github.com/apache/airflow/pull/13660/files#diff-d6b3b45a036b2fb62762c6f8bc11f4b7832b3d012c81819dec51dc88ae304d54R50




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766312443


   > I am wondering if we should use a .env file to populate some docker-compose values (which is passed to docker-compose automatically if in same dir).
   
   I added a few parameters to the docker-compose file, but I was very careful not to complicate this file.
   I only added these parameters in a few cases:
   - if it was necessary to ensure compatibility with Linux (``AIRFLOW_UID``, ``AIRFLOW_GID``),
   - if I thought it could be a really frequently changed option (``AIRFLOW_IMAGE_NAME``),
   - if it is a good example of using parameterization (``_AIRFLOW_WWW_USER_USERNAME``, ``_AIRFLOW_WWW_USER_PASSWORD``).
   
   Each variable has a default value defined, so the ``.env`` file is optional. I also didn't define variables for internal details, e.g. database passwords, because Docker provides network isolation.
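
   To make the parameterization above concrete, here is a sketch of an optional ``.env`` file. The variable names come from the comment above; the values are illustrative only, and since every variable has a default in ``docker-compose.yaml``, any subset (or the whole file) may be omitted.

```shell
# Hypothetical .env illustrating the optional overrides discussed above.
# docker-compose reads .env automatically when it sits next to docker-compose.yaml.
cat > .env <<'EOF'
AIRFLOW_IMAGE_NAME=apache/airflow:2.0.0
AIRFLOW_UID=50000
AIRFLOW_GID=0
_AIRFLOW_WWW_USER_USERNAME=airflow
_AIRFLOW_WWW_USER_PASSWORD=airflow
EOF

# Each line defines exactly one variable.
grep -c '=' .env
```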





[GitHub] [airflow] github-actions[bot] commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320289


   The PR is likely OK to be merged with just a subset of tests for the default Python and database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270141



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,134 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
    Because this file is part of the documentation package, to make it easier to release with each new documentation release. We should not link to files that are on GitHub or in other places in the official documentation, if we describe the official installation.
   
   > PUBLICATION
   > Projects SHALL publish official releases and SHALL NOT publish unreleased materials outside the development community.
   
   >During the process of developing software and preparing a release, various packages are made available to the development community for testing purposes. Projects MUST direct outsiders towards official releases rather than raw source repositories, nightly builds, snapshots, release candidates, or any other similar packages. The only people who are supposed to know about such developer resources are individuals actively participating in development or following the dev list and thus aware of the conditions placed on unreleased materials.
   
    If we are going to publish this file elsewhere, we will have to have a separate publishing process for it. As for the directory, it is the same directory as ``docker.rst``, so as not to create additional confusion. Now the entire guide is in one directory.
   







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563308334



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows how to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double-check that your version meets the minimum version requirement.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0

Review comment:
       I added extra checks for docs build: https://github.com/apache/airflow/pull/13660/commits/341c7813610b0ae21b159fc86c0d410d6f923f77

##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows how to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double-check that your version meets the minimum version requirement.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash

Review comment:
       ```suggestion
   .. code-block:: text
   ```







[GitHub] [airflow] feluelle commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r559354820



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags
+    - ./logs:/opt/airflow/logs
+    - ./plugins:/opt/airflow/plugins
+  depends_on:
+    redis:
+      condition: service_healthy
+    postgres:
+      condition: service_healthy
+
+services:
+  postgres:
+    image: postgres:9.5

Review comment:
       ```suggestion
       image: postgres:13
   ```
    Why such an old version of Postgres? Choose one of the currently supported versions for Airflow: `9.6, 10, 11, 12, 13`. I run 13 since it is out and haven't experienced any issues.
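
    One hedged way to try the suggested bump without editing the downloaded file is a compose override: ``docker-compose`` merges ``docker-compose.override.yaml`` into ``docker-compose.yaml`` automatically when both sit in the working directory. The ``postgres`` service name comes from the quoted compose file; the override file itself is an assumption of this sketch, not part of the PR.

```shell
# Hypothetical docker-compose.override.yaml pinning the postgres image
# suggested in the review; docker-compose merges it automatically when it
# sits next to docker-compose.yaml in the working directory.
cat > docker-compose.override.yaml <<'EOF'
version: '3'
services:
  postgres:
    image: postgres:13
EOF

grep 'image:' docker-compose.override.yaml
```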

##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,103 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows how to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ on your workstation.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should download `docker-compose.yaml <../docker-compose.yaml>`__. This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Running Airflow
+===============
+
+Before starting Airflow for the first time, you need to initialize the database. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose run --rm airflow-webserver \
+        airflow db init
+
+Access to the web server requires a user account; to create it, run:
+
+.. code-block:: bash
+
+    docker-compose run --rm airflow-webserver \
+        airflow users create \
+            --role Admin \
+            --username airflow \
+            --password airflow \
+            --email airflow@airflow.com \
+            --firstname airflow \
+            --lastname airflow

Review comment:
       @potiuk also created a PR to do this in the image #13728. Do we need both possibilities?










[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r557039807



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags

Review comment:
    It would be great to mention in the documentation that you can place DAGs and plugin Python code, and see logs, in those subdirectories of the folder where the docker-compose file is. Those subdirectories should also be .dockerignored in case someone runs docker-compose from this folder (I assume they will be created automatically?)
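
    The host-side layout the comment describes can be sketched as follows. The directory names come from the quoted volume mounts; the DAG file name is a hypothetical placeholder.

```shell
# Create the host directories that the compose file mounts into the containers;
# DAG files, plugin code, and task/scheduler logs then live next to docker-compose.yaml.
mkdir -p ./dags ./logs ./plugins

# Hypothetical placeholder showing where a DAG file would go.
touch ./dags/my_first_dag.py

ls ./dags
```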







[GitHub] [airflow] turbaszek commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563269324



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,134 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       I'm quite confused - why do we keep docker-compose in docs directory?







[GitHub] [airflow] feluelle commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563282119



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows how to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double-check that your version meets the minimum version requirement.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do it, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0

Review comment:
       Is this the version? Why `2.1.0`?
   
    Further down we mention the version again, for example `apache/airflow:2.0.0`. These should be the same, shouldn't they?







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563294820



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. It is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum version requirement.
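To verify the requirement, you can compare the version reported by ``docker-compose --version`` against the minimum. A small, hypothetical helper sketching the comparison (it assumes a plain ``x.y.z`` version string without suffixes):

```python
def meets_minimum(version: str, minimum: tuple = (1, 27, 0)) -> bool:
    """Return True if a dotted version string is at least the minimum."""
    parts = tuple(int(p) for p in version.split("."))
    # Tuples compare element by element, which matches version ordering here.
    return parts >= minimum


print(meets_minimum("1.27.4"))  # True
print(meets_minimum("1.25.5"))  # False
```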
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the workers.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0

Review comment:
       It should be Airflow 2.0.1, but it is not released yet. At this point, I used this version because I was running the image from the master branch where we have version 2.1.0.dev, then deleted the suffix ".dev". 
   https://github.com/apache/airflow/blob/f473ca7130f844bc59477674e641b42b80698bb7/setup.py#L41
   I am concerned that none of the released images works, as we have not yet voted on any release that includes the latest changes.
   https://github.com/apache/airflow/pull/13728
   I will try to unify it so that there is one version used in both places.
   







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270268



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. It is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum version requirement.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the workers.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up

Review comment:
       >     AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
   
   https://github.com/apache/airflow/pull/13660/files#diff-d6b3b45a036b2fb62762c6f8bc11f4b7832b3d012c81819dec51dc88ae304d54R50







[GitHub] [airflow] potiuk commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-762210144


   > I would like this example to be simple and only use one file. This is a quick start for beginners, so we shouldn't use too complex syntax and require many steps. I will try to think of something to be able to introduce basic parameterization without adding extra steps to the instruction.
   
   Agree with @mik-laj -> if we have a single file to use, this is soooo much easier to share and use via gists and other similar mechanisms, or simple links to the file in the repo. I wish it was easy to do in our Dockerfiles, but this is rather complex there and you usually need all the "surrounding" stuff like scripts, empty dirs etc. 





[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270204



##########
File path: docs/apache-airflow/concepts.rst
##########
@@ -23,6 +23,38 @@ Concepts
 The Airflow platform is a tool for describing, executing, and monitoring
 workflows.
 
+.. _architecture:
+
+Basic Airflow architecture
+''''''''''''''''''''''''''
+
+Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
+excellent starting point for understanding the architecture of Apache Airflow.
+
+.. image:: img/arch-diag-basic.png
+
+
+There are a few components to note:
+
+* **Metadata Database**: Airflow uses a SQL database to store metadata about the data pipelines being run. In the
+  diagram above, this is represented as Postgres which is extremely popular with Airflow.
+  Alternate databases supported with Airflow include MySQL.
+
+* **Web Server** and **Scheduler**: The Airflow web server and Scheduler are separate processes run (in this case)
+  on the local machine and interact with the database mentioned above.
+
+* The **Executor** is shown separately above, since it is commonly discussed within Airflow and in the documentation, but
+  in reality it is NOT a separate process, but run within the Scheduler.
+
+* The **Worker(s)** are separate processes which also interact with the other components of the Airflow architecture and
+  the metadata repository.

Review comment:
       We support only celery executor in this guide.







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270614



##########
File path: docs/apache-airflow/concepts.rst
##########
@@ -23,6 +23,38 @@ Concepts
 The Airflow platform is a tool for describing, executing, and monitoring
 workflows.
 
+.. _architecture:
+
+Basic Airflow architecture
+''''''''''''''''''''''''''
+
+Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
+excellent starting point for understanding the architecture of Apache Airflow.
+
+.. image:: img/arch-diag-basic.png
+
+
+There are a few components to note:
+
+* **Metadata Database**: Airflow uses a SQL database to store metadata about the data pipelines being run. In the
+  diagram above, this is represented as Postgres which is extremely popular with Airflow.
+  Alternate databases supported with Airflow include MySQL.
+
+* **Web Server** and **Scheduler**: The Airflow web server and Scheduler are separate processes run (in this case)
+  on the local machine and interact with the database mentioned above.
+
+* The **Executor** is shown separately above, since it is commonly discussed within Airflow and in the documentation, but
+  in reality it is NOT a separate process, but run within the Scheduler.
+
+* The **Worker(s)** are separate processes which also interact with the other components of the Airflow architecture and
+  the metadata repository.

Review comment:
       ```suggestion
     the metadata database.
   ```







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563257927



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags

Review comment:
       I added description to ``docker-compose.yaml`` section.







[GitHub] [airflow] turbaszek commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270779



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. It is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum version requirement.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the workers.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up

Review comment:
       Isn't it overwritten by information from DAG?







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563329307



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. It is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 or newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum version requirement.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the workers.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: bash

Review comment:
       ```suggestion
   .. code-block:: text
   ```







[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563269804



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,134 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       Besause this is an example for end-users. It's not intended to be run as regular part of development process.

##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,134 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       Because this is an example for end-users. It's not intended to be run as regular part of development process.







[GitHub] [airflow] github-actions[bot] commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-759872139


   The PR is likely ready to be merged. No tests are needed as no important environment files, nor python files were modified by it. However, committers might decide that full test matrix is needed and add the 'full tests needed' label. Then you should rebase it to the latest master or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] github-actions[bot] commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-759878816


   [The Workflow run](https://github.com/apache/airflow/actions/runs/484240577) is cancelling this PR. Building images for the PR has failed. Follow the workflow link to check the reason.





[GitHub] [airflow] potiuk commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320182


   And it works now on linux nicely :)





[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r557046057



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags
+    - ./logs:/opt/airflow/logs

Review comment:
       Also, this makes it very hard to work on the docker-compose file from the sources, because once those folders are created as root, you cannot switch branches until you delete them.







[GitHub] [airflow] potiuk edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320076


   Looks good but I added a few fixups: 
   
   1) Missing .gitignore entries for logs/plugins/env/dags
   2) The DOCKER_COMPOSE_ARGS are not needed IMHO. `docker-compose run` will allocate a terminal as needed, and you can disable it with the -T flag. Interactive mode also works out of the box with `docker-compose run`, so I removed them.
   3) I removed `airflow` from the `docker-compose run` invocation in airflow.sh. It is not needed, it does not work with the Airflow 2.0 image, and it does not allow running the two useful `python` and `bash` commands.
   4) Changed ${*} into "${@}" (the proper way of passing parameters that may contain spaces).
   5) I added examples with "python/bash" commands.
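The `${*}` vs `"${@}"` distinction in point 4 can be sketched in a small standalone script (the function names here are illustrative only, not part of airflow.sh):

```shell
#!/usr/bin/env bash
# Unquoted ${*} re-joins the arguments and re-splits them on whitespace,
# so an argument containing a space is broken into several.
count_with_star() {
  # shellcheck disable=SC2048,SC2086
  set -- ${*}
  echo "$#"
}

# Quoted "${@}" forwards each argument exactly as it was received.
count_with_at() {
  set -- "${@}"
  echo "$#"
}

count_with_star "one" "two words"   # prints 3
count_with_at   "one" "two words"   # prints 2
```

This is why a wrapper script that forwards user-supplied CLI arguments should always use `"${@}"`.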





[GitHub] [airflow] kaxil commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r564018402



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide will get you up and running with Airflow using :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to start Airflow.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 and newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by the ``docker-compose.yaml`` file, so double-check that your installation meets the minimum version requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do that, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below.
+
+.. code-block:: text
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0.dev0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up
+
+In a second terminal you can check the state of the containers and make sure that none of them are unhealthy:
+
+.. code-block:: bash
+
+    $ docker ps
+    CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS                    PORTS                              NAMES
+    247ebe6cf87a   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes              8080/tcp                           compose_airflow-worker_1
+    ed9b09fc84b1   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes              8080/tcp                           compose_airflow-scheduler_1
+    65ac1da2c219   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    0.0.0.0:5555->5555/tcp, 8080/tcp   compose_flower_1
+    7cb1fb603a98   apache/airflow:master-python3.8   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    0.0.0.0:8080->8080/tcp             compose_airflow-webserver_1
+    74f3bbe506eb   postgres:13                       "docker-entrypoint.s…"   18 minutes ago   Up 17 minutes (healthy)   5432/tcp                           compose_postgres_1
+    0bd6576d23cb   redis:latest                      "docker-entrypoint.s…"   10 hours ago     Up 17 minutes (healthy)   0.0.0.0:6379->6379/tcp             compose_redis_1
+
+Once the cluster has started up, you can log in to the web interface and try to run some tasks. The webserver is available at ``http://localhost:8080``. The default account has the login ``airflow`` and the password ``airflow``.
+
+.. image:: /img/dags.png
+
+Accessing Command Line Interface
+================================
+
+You can also run :doc:`CLI commands </usage-cli>`, but you have to do it in one of the defined ``airflow-*`` services. For example, to run ``airflow info``, run the following command:
+
+.. code-block:: bash
+
+    docker-compose run airflow-worker airflow info
+
+If you have Linux or Mac OS, you can make your work easier by downloading an optional wrapper script that allows you to run commands with a shorter invocation.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}airflow.sh'
+        chmod +x airflow.sh
+
+Now you can run commands more easily.
+
+.. code-block:: bash
+
+    ./airflow.sh info
+
+You can also use ``bash`` as a parameter to enter an interactive bash shell in the container, or ``python`` to enter
+a Python shell.
+
+.. code-block:: bash
+
+    ./airflow.sh bash
+
+.. code-block:: bash
+
+    ./airflow.sh python
+
+Cleaning up
+===========
+
+To stop and delete containers, delete volumes with database data, and remove downloaded images, run:
+
+.. code-block:: bash
+
+    docker-compose down --volumes --rmi all
+
+Notes
+=====
+
+By default, the Docker Compose file uses the latest Airflow image (`apache/airflow< <https://hub.docker.com/r/apache/airflow>`__). If you need, you can :ref:`customize and extend it <docker_image>`.

Review comment:
       ```suggestion
   By default, the Docker Compose file uses the latest Airflow image (`apache/airflow <https://hub.docker.com/r/apache/airflow>`__). If you need, you can :ref:`customize and extend it <docker_image>`.
   ```







[GitHub] [airflow] mik-laj commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-762193621


   > With this it is quite easy to change the access to the container and the airflow user.
   
   I would like this example to be simple and use only one file. This is a quick start for beginners, so we shouldn't use overly complex syntax or require many steps.





[GitHub] [airflow] turbaszek commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270430



##########
File path: docs/apache-airflow/concepts.rst
##########
@@ -23,6 +23,38 @@ Concepts
 The Airflow platform is a tool for describing, executing, and monitoring
 workflows.
 
+.. _architecture:
+
+Basic Airflow architecture
+''''''''''''''''''''''''''
+
+Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
+excellent starting point for understanding the architecture of Apache Airflow.
+
+.. image:: img/arch-diag-basic.png
+
+
+There are a few components to note:
+
+* **Metadata Database**: Airflow uses a SQL database to store metadata about the data pipelines being run. In the
+  diagram above, this is represented as Postgres which is extremely popular with Airflow.
+  Alternate databases supported with Airflow include MySQL.
+
+* **Web Server** and **Scheduler**: The Airflow web server and Scheduler are separate processes run (in this case)
+  on the local machine and interact with the database mentioned above.
+
+* The **Executor** is shown separately above, since it is commonly discussed within Airflow and in the documentation, but
+  in reality it is NOT a separate process, but run within the Scheduler.
+
+* The **Worker(s)** are separate processes which also interact with the other components of the Airflow architecture and
+  the metadata repository.

Review comment:
       I may have missed it, but I could not find where we say it. In fact, we say something totally different from CeleryExecutor:
   
    > Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an excellent starting point for understanding the architecture of Apache Airflow.







[GitHub] [airflow] turbaszek commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563270116



##########
File path: docs/apache-airflow/concepts.rst
##########
@@ -23,6 +23,38 @@ Concepts
 The Airflow platform is a tool for describing, executing, and monitoring
 workflows.
 
+.. _architecture:
+
+Basic Airflow architecture
+''''''''''''''''''''''''''
+
+Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
+excellent starting point for understanding the architecture of Apache Airflow.
+
+.. image:: img/arch-diag-basic.png
+
+
+There are a few components to note:
+
+* **Metadata Database**: Airflow uses a SQL database to store metadata about the data pipelines being run. In the
+  diagram above, this is represented as Postgres which is extremely popular with Airflow.
+  Alternate databases supported with Airflow include MySQL.
+
+* **Web Server** and **Scheduler**: The Airflow web server and Scheduler are separate processes run (in this case)
+  on the local machine and interact with the database mentioned above.
+
+* The **Executor** is shown separately above, since it is commonly discussed within Airflow and in the documentation, but
+  in reality it is NOT a separate process, but run within the Scheduler.
+
+* The **Worker(s)** are separate processes which also interact with the other components of the Airflow architecture and
+  the metadata repository.

Review comment:
       Are workers required for every executor? 







[GitHub] [airflow] mik-laj merged pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj merged pull request #13660:
URL: https://github.com/apache/airflow/pull/13660


   





[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r557043752



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags
+    - ./logs:/opt/airflow/logs

Review comment:
       Unfortunately this does not work on linux.
   
   When you run the "db init' command you get this error:
   
   ```
   Unable to load the config, contains a configuration error.
   Traceback (most recent call last):
     File "/usr/local/lib/python3.6/pathlib.py", line 1248, in mkdir
       self._accessor.mkdir(self, mode)
     File "/usr/local/lib/python3.6/pathlib.py", line 387, in wrapped
       return strfunc(str(pathobj), *args)
   FileNotFoundError: [Errno 2] No such file or directory: '/opt/airflow/logs/scheduler/2021-01-14'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/usr/local/lib/python3.6/logging/config.py", line 565, in configure
       handler = self.configure_handler(handlers[name])
     File "/usr/local/lib/python3.6/logging/config.py", line 738, in configure_handler
       result = factory(**kwargs)
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/log/file_processor_handler.py", line 46, in __init__
       Path(self._get_log_directory()).mkdir(parents=True, exist_ok=True)
     File "/usr/local/lib/python3.6/pathlib.py", line 1252, in mkdir
       self.parent.mkdir(parents=True, exist_ok=True)
     File "/usr/local/lib/python3.6/pathlib.py", line 1248, in mkdir
       self._accessor.mkdir(self, mode)
     File "/usr/local/lib/python3.6/pathlib.py", line 387, in wrapped
       return strfunc(str(pathobj), *args)
   PermissionError: [Errno 13] Permission denied: '/opt/airflow/logs/scheduler'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/home/airflow/.local/bin/airflow", line 5, in <module>
       from airflow.__main__ import main
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__init__.py", line 46, in <module>
       settings.initialize()
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/settings.py", line 432, in initialize
       LOGGING_CLASS_PATH = configure_logging()
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/logging_config.py", line 62, in configure_logging
       raise e
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/logging_config.py", line 57, in configure_logging
       dictConfig(logging_config)
     File "/usr/local/lib/python3.6/logging/config.py", line 802, in dictConfig
       dictConfigClass(config).configure()
     File "/usr/local/lib/python3.6/logging/config.py", line 573, in configure
       '%r: %s' % (name, e))
   ValueError: Unable to configure handler 'processor': [Errno 13] Permission denied: '/opt/airflow/logs/scheduler'
   ```
   
    The reason is that the "dags", "logs" and "plugins" folders are created as "root"-owned automatically, and they seem to not be writable by the airflow process. 
   
    Also, after running it, the "dags", "logs" and "plugins" folders are owned by the "root" user, which makes it difficult for the user to manage them (they have to run `sudo` to delete the created folders and files). Ideally those folders should be created as the host user, or there should be an easy way to delete them. 
   
   ```
   quick-start-docker+ 2 ± ls -la
   total 80
   drwxrwxr-x  5 jarek jarek  4096 sty 14 05:39 .
   drwxr-xr-x 10 jarek jarek 12288 sty 14 05:30 ..
   drwxr-xr-x  2 root  root   4096 sty 14 05:39 dags
   -rw-rw-r--  1 jarek jarek  2767 sty 14 05:30 docker-compose.yaml
   -rw-rw-r--  1 jarek jarek  4889 sty 14 05:30 docker.rst
   -rw-rw-r--  1 jarek jarek   991 sty 14 05:30 index.rst
   -rw-rw-r--  1 jarek jarek  3809 sty 14 05:30 local.rst
   drwxr-xr-x  2 root  root   4096 sty 14 05:39 logs
   drwxr-xr-x  2 root  root   4096 sty 14 05:39 plugins
   ```
   
    In order to fix it, you can utilise the fact that Airflow is OpenShift-compatible. Airflow inside the production Docker image can be run as "any" user, provided that the user's group is "0". This should allow you to run all commands as the HOST user with group "0", but I think it also requires making sure that the folders are created with the appropriate HOST user automatically.
   
    See the "support arbitrary user ids" chapter of https://docs.openshift.com/container-platform/4.1/openshift_images/create-images.html. 
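The fix that eventually landed in the quick start follows exactly this idea: create the mounted folders up front as the host user, and pass the host UID with group ``0`` into the container via a ``.env`` file. A minimal sketch of that preparation step (``printf`` is used here instead of the guide's ``echo -e`` purely for shell portability):

```shell
# Create the mounted folders as the host user before Docker creates
# them as root, then record the host UID with group 0 (the
# OpenShift-style "arbitrary user id" convention).
mkdir -p ./dags ./logs ./plugins
printf 'AIRFLOW_UID=%s\nAIRFLOW_GID=0\n' "$(id -u)" > .env
cat .env
```

docker-compose picks up the ``.env`` file automatically from the directory it runs in, so no extra flags are needed.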
   







[GitHub] [airflow] potiuk commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320122


   Feel free to modify them if you feel it can be done better.





[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r557041375



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:

Review comment:
       This docker-compose file does not work with docker-compose version 1.25.3, which is the default version on Debian Buster (LTS). You need to manually install 1.27.4, which supports anchors.
   
    When you try to run it with 1.25.3, you get this message, which does not explain much:
   
   ```
   ERROR: The Compose file './docker-compose.yaml' is invalid because:
   Invalid top-level property "x-airflow-common". Valid top-level sections for this Compose file are: version, services, networks, volumes, and extensions starting with "x-".
   
   You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g "2.2" or "3.3") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
   For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/
   services.airflow-scheduler.depends_on contains an invalid type, it should be an array
   services.airflow-webserver.depends_on contains an invalid type, it should be an array
   services.airflow-worker.depends_on contains an invalid type, it should be an array
   services.flower.depends_on contains an invalid type, it should be an array
   ```
   
    It would be great if we found a way to provide a better message explaining that the minimum supported version is X (which one is it?) and directing people to installation/upgrade instructions. At the very least, if we cannot do that, we should clearly state in the docs that this is a prerequisite and explain that this error means they have to upgrade docker-compose.
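Pending a friendlier error from docker-compose itself, one low-tech option for the docs would be a pre-flight version check. A sketch, assuming 1.27.0 as the minimum (per the guide) and the docker-compose v1 `version --short` flag:

```shell
# Succeeds if version $2 >= version $1, comparing component-wise via sort -V.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

REQUIRED="1.27.0"
# docker-compose v1 prints a bare version number with `version --short`;
# fall back to "0" when docker-compose is not installed.
CURRENT="$(docker-compose version --short 2>/dev/null || echo "0")"

if version_ge "$REQUIRED" "$CURRENT"; then
  echo "docker-compose $CURRENT is new enough (>= $REQUIRED)"
else
  echo "docker-compose >= $REQUIRED is required, found $CURRENT" >&2
fi
```

This only warns rather than exits, so users can still see the hint even when the check runs as part of a larger script.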







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563258045



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:
+  &airflow-common
+  image: apache/airflow:2.0.0
+  environment:
+    - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
+    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
+    - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
+    - AIRFLOW__CORE__FERNET_KEY=
+    - AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
+    - AIRFLOW__CORE__LOAD_EXAMPLES=True
+  volumes:
+    - ./dags:/opt/airflow/dags
+    - ./logs:/opt/airflow/logs

Review comment:
       I improved compatibility with Linux. See: 
   https://github.com/apache/airflow/pull/13660/commits/d1fdb571fd679eea3154b352f88ecf773f54e814







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r559503476



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,103 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide will get you up and running with Airflow using :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to start Airflow.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ on your workstation.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should download `docker-compose.yaml <../docker-compose.yaml>`__. This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - The `redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Running Airflow
+===============
+
+Before starting Airflow for the first time, you need to initialize the database. To do that, run:
+
+.. code-block:: bash
+
+    docker-compose run --rm airflow-webserver \
+        airflow db init
+
+Access to the web server requires a user account. To create one, run:
+
+.. code-block:: bash
+
+    docker-compose run --rm airflow-webserver \
+        airflow users create \
+            --role Admin \
+            --username airflow \
+            --password airflow \
+            --email airflow@airflow.com \
+            --firstname airflow \
+            --lastname airflow

Review comment:
       Jarek's change has not been merged yet. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320076


   Looks good but I added a few fixups: 
   
   1) Missing .gitignore entries for logs/plugins/env/dags
   2) The DOCKER_COMPOSE_ARGS are not needed IMHO. The `docker compose run` will allocate a terminal as needed and you can disable it with the -T flag. Also, interactive mode works out-of-the-box with docker-compose run. Removed them.
   3) I removed 'airflow' from docker-compose run in airflow.sh. It is not needed, it does not work with the airflow 2.0 image, and it does not allow running the two useful `python` and `bash` commands
   4) changed ${*} into "${@}" (that's the proper way of passing parameters containing spaces potentially) 
   5) I added examples with "python/bash" commands.





[GitHub] [airflow] potiuk edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-766320076


   Looks good but I added a few fixups: 
   
   1) Missing .dockerignore entries for logs/plugins/env/dags
   2) The DOCKER_COMPOSE_ARGS are not needed IMHO. Docker compose run will allocate a terminal as needed and you can disable it with the -T flag. Also, interactive mode works out-of-the-box with docker-compose run
   3) I removed 'airflow' from docker-compose run in airflow.sh. It is not needed, it does not work with the airflow 2.0 image, and it does not allow running the two useful `python` and `bash` commands
   4) changed ${*} into "${@}" (that's the proper way of passing parameters containing spaces potentially) 
   5) I added examples with "python/bash" commands.





[GitHub] [airflow] mik-laj commented on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-765627074


   TODO: I would like to add how to use CLI with wrappers.
   Similar to: 
   https://github.com/KlubJagiellonski/pola-backend/blob/master/scripts/aws-cli.sh
   https://github.com/KlubJagiellonski/pola-backend/blob/master/scripts/aws-cli-dev.sh
   https://github.com/mik-laj/presto-hive-kerberos-docker/blob/master/psql.sh
   https://github.com/mik-laj/presto-hive-kerberos-docker/blob/master/mc.sh
   https://github.com/mik-laj/presto-hive-kerberos-docker/blob/master/hive.sh
   
   This in some cases will allow us to forget that everything is running in Docker.





[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563273654



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide shows you how to start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. It is the fastest way to get Airflow up and running.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 and newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all the features required by the ``docker-compose.yaml`` file, so double check that your version meets the minimum version requirement.
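For example, one way to check the installed version against the minimum is a small shell helper. This is a hypothetical sketch (the ``version_ge`` function is not part of Airflow) and it assumes a version-aware ``sort -V``, as provided by GNU coreutils:

```shell
# hypothetical helper: succeeds if version $1 >= version $2
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# compare the installed docker-compose version against the 1.27.0 minimum;
# fall back to "0" if docker-compose is not on PATH
compose_version="$(docker-compose version --short 2>/dev/null || echo "0")"
if version_ge "$compose_version" "1.27.0"; then
    echo "docker-compose is new enough ($compose_version)"
else
    echo "docker-compose is too old: $compose_version"
fi
```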
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ broker that forwards messages from the scheduler to the workers.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
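Once those directories exist, any file dropped into ``./dags`` is picked up by the scheduler. As an illustration, the snippet below writes a minimal, hypothetical example DAG (``example_hello`` is not shipped with Airflow; the imports follow the Airflow 2 module layout):

```shell
# create the mounted directories (this also happens in the next section)
# and write a minimal, hypothetical example DAG into ./dags
mkdir -p ./dags ./logs ./plugins
cat > ./dags/example_hello.py <<'EOF'
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_hello",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,  # no schedule; trigger manually from the web UI
) as dag:
    BashOperator(task_id="hello", bash_command="echo hello")
EOF
```

After the services are up, the ``example_hello`` DAG should appear in the web UI.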
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories and initialize the database.
+
+On **Linux**, the mounted volumes in the container use the native Linux filesystem user/group permissions, so you have to make sure that the container and the host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
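If your shell's ``echo`` does not support the ``-e`` flag, ``printf`` is a portable alternative; a quick sanity check of the resulting file (optional, just an illustration) might look like:

```shell
# portable alternative to `echo -e`, then verify the file contents
printf 'AIRFLOW_UID=%s\nAIRFLOW_GID=0\n' "$(id -u)" > .env
grep -q '^AIRFLOW_UID=' .env
grep -q '^AIRFLOW_GID=0$' .env && echo ".env looks good"
```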
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like the one below:
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0
+    start_airflow-init_1 exited with code 0
+
+The account created has the login ``airflow`` and the password ``airflow``.
+
+Running Airflow
+===============
+
+Now you can start all services:
+
+.. code-block:: bash
+
+    docker-compose up
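Since the services take a moment to come up, a small polling helper can be handy in scripts. This is a hypothetical sketch (``wait_for_url`` is not part of Airflow) that assumes ``curl`` is available; in Airflow 2 the webserver exposes a ``/health`` endpoint:

```shell
# hypothetical helper: poll a URL until it responds or the timeout expires
wait_for_url() {
    url="$1"; timeout="${2:-120}"; waited=0
    until curl -sSf "$url" >/dev/null 2>&1; do
        waited=$((waited + 2))
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $url" >&2
            return 1
        fi
        sleep 2
    done
}

# after `docker-compose up`, wait for the webserver to answer:
# wait_for_url "http://localhost:8080/health"
```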

Review comment:
       Yes. This can be overridden by a parameter in the DAG constructor. This sounds like expected behavior. The user has control over how the environment behaves.







[GitHub] [airflow] mik-laj edited a comment on pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#issuecomment-765627074


   TODO: I would like to add optional sections on how to use CLI with wrappers.
   Similar to: 
   https://github.com/KlubJagiellonski/pola-backend/blob/master/scripts/aws-cli.sh
   https://github.com/KlubJagiellonski/pola-backend/blob/master/scripts/aws-cli-dev.sh
   https://github.com/mik-laj/presto-hive-kerberos-docker/blob/master/psql.sh
   https://github.com/mik-laj/presto-hive-kerberos-docker/blob/master/mc.sh
   https://github.com/mik-laj/presto-hive-kerberos-docker/blob/master/hive.sh
   
   This in some cases will allow us to forget that everything is running in Docker.





[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563257870



##########
File path: docs/apache-airflow/start/docker-compose.yaml
##########
@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: '3'
+x-airflow-common:

Review comment:
       Thanks for pointing this out. I added a description to the "Before you begin" section.







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563259121



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,103 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide will allow you to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to start Airflow.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ on your workstation.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should download `docker-compose.yaml <../docker-compose.yaml>`__. This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ - broker that forwards messages from scheduler to worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Running Airflow
+===============
+
+Before starting Airflow for the first time, you need to initialize the database. To do it, run.
+
+.. code-block:: bash
+
+    docker-compose run --rm airflow-webserver \
+        airflow db init
+
+Access to the web server requires a user account, to create it, run:
+
+.. code-block:: bash
+
+    docker-compose run --rm airflow-webserver \
+        airflow users create \
+            --role Admin \
+            --username airflow \
+            --password airflow \
+            --email airflow@airflow.com \
+            --firstname airflow \
+            --lastname airflow

Review comment:
       I updated the PR.







[GitHub] [airflow] potiuk commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563295263



##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide will allow you to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to start Airflow.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 and newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by ``docker-compose.yaml`` file, so double check that it meets the minimum version requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ - broker that forwards messages from scheduler to worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, You need to prepare your environment, i.e. create the necessary files, directories and initialize the database.
+
+On **Linux**, the mounted volumes in container use the native Linux filesystem user/group permissions, so you have to make sure the container and host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0

Review comment:
       2.0.0 works with my fixup but without creating user

##########
File path: docs/apache-airflow/start/docker.rst
##########
@@ -0,0 +1,170 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Running Airflow in Docker
+#########################
+
+This quick-start guide will allow you to quickly start Airflow with :doc:`CeleryExecutor </executor/celery>` in Docker. This is the fastest way to start Airflow.
+
+Before you begin
+================
+
+Follow these steps to install the necessary tools.
+
+1. Install `Docker Community Edition (CE) <https://docs.docker.com/engine/installation/>`__ on your workstation.
+2. Install `Docker Compose <https://docs.docker.com/compose/install/>`__ v1.27.0 and newer on your workstation.
+
+Older versions of ``docker-compose`` do not support all features required by ``docker-compose.yaml`` file, so double check that it meets the minimum version requirements.
+
+``docker-compose.yaml``
+=======================
+
+To deploy Airflow on Docker Compose, you should fetch `docker-compose.yaml <../docker-compose.yaml>`__.
+
+.. jinja:: quick_start_ctx
+
+    .. code-block:: bash
+
+        curl -LfO '{{ doc_root_url }}docker-compose.yaml'
+
+This file contains several service definitions:
+
+- ``airflow-scheduler`` - The :doc:`scheduler </scheduler>` monitors all tasks and DAGs, then triggers the
+  task instances once their dependencies are complete.
+- ``airflow-webserver`` - The webserver available at ``http://localhost:8080``.
+- ``airflow-worker`` - The worker that executes the tasks given by the scheduler.
+- ``airflow-init`` - The initialization service.
+- ``flower`` - `The flower app <https://flower.readthedocs.io/en/latest/>`__ for monitoring the environment. It is available at ``http://localhost:5555``.
+- ``postgres`` - The database.
+- ``redis`` - `The redis <https://redis.io/>`__ - broker that forwards messages from scheduler to worker.
+
+All these services allow you to run Airflow with :doc:`CeleryExecutor </executor/celery>`. For more information, see :ref:`architecture`.
+
+Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
+
+- ``./dags`` - you can put your DAG files here.
+- ``./logs`` - contains logs from task execution and scheduler.
+- ``./plugins`` - you can put your :doc:`custom plugins </plugins>` here.
+
+Initializing Environment
+========================
+
+Before starting Airflow for the first time, You need to prepare your environment, i.e. create the necessary files, directories and initialize the database.
+
+On **Linux**, the mounted volumes in container use the native Linux filesystem user/group permissions, so you have to make sure the container and host computer have matching file permissions.
+
+.. code-block:: bash
+
+    mkdir ./dags ./logs ./plugins
+    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
+
+On **all operating systems**, you need to run database migrations and create the first user account. To do this, run:
+
+.. code-block:: bash
+
+    docker-compose up airflow-init
+
+After initialization is complete, you should see a message like below.
+
+.. code-block:: bash
+
+    airflow-init_1       | Upgrades done
+    airflow-init_1       | Admin user airflow created
+    airflow-init_1       | 2.1.0

Review comment:
       2.0.0 works with my fixup but without creating admin user







[GitHub] [airflow] mik-laj commented on a change in pull request #13660: Add quick start for Airflow on Docker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13660:
URL: https://github.com/apache/airflow/pull/13660#discussion_r563272457



##########
File path: docs/apache-airflow/concepts.rst
##########
@@ -23,6 +23,38 @@ Concepts
 The Airflow platform is a tool for describing, executing, and monitoring
 workflows.
 
+.. _architecture:
+
+Basic Airflow architecture
+''''''''''''''''''''''''''
+
+Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
+excellent starting point for understanding the architecture of Apache Airflow.
+
+.. image:: img/arch-diag-basic.png
+
+
+There are a few components to note:
+
+* **Metadata Database**: Airflow uses a SQL database to store metadata about the data pipelines being run. In the
+  diagram above, this is represented as Postgres which is extremely popular with Airflow.
+  Alternate databases supported with Airflow include MySQL.
+
+* **Web Server** and **Scheduler**: The Airflow web server and Scheduler are separate processes run (in this case)
+  on the local machine and interact with the database mentioned above.
+
+* The **Executor** is shown separately above, since it is commonly discussed within Airflow and in the documentation, but
+  in reality it is NOT a separate process, but run within the Scheduler.
+
+* The **Worker(s)** are separate processes which also interact with the other components of the Airflow architecture and
+  the metadata repository.

Review comment:
       At the very beginning of this section it says this is the architecture for LocalExecutor and SequentialExecutor, and both of these executors use a separate process to run the DAG code, so everything is correct.
   
   > Primarily intended for development use, the basic Airflow architecture with the Local and Sequential executors is an
   > excellent starting point for understanding the architecture of Apache Airflow.
   



