Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/05 18:12:10 UTC

[GitHub] [airflow] feluelle commented on a change in pull request #8877: Add S3ToRedshift example dag and system test

feluelle commented on a change in pull request #8877:
URL: https://github.com/apache/airflow/pull/8877#discussion_r436064245



##########
File path: airflow/providers/amazon/aws/example_dags/example_s3_to_redshift.py
##########
@@ -41,13 +42,26 @@
     schedule_interval=None,
     tags=['example']
 ) as dag:
+    preparation__task_load_sample_data_to_s3 = PythonOperator(
+        python_callable=lambda: S3Hook().load_string("0,Airflow", f'{S3_KEY}/{REDSHIFT_TABLE}', S3_BUCKET),
+        task_id='preparation__load_sample_data_to_s3'
+    )
+    preparation__task_create_table = PostgresOperator(
+        sql=f'CREATE TABLE {REDSHIFT_TABLE}(Id int, Name varchar)',
+        postgres_conn_id='redshift_default',
+        task_id='preparation__create_table'
+    )

Review comment:
       WDYT of having some `preparation` tasks? Not everything can be prepared with terraform.
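   For illustration only (not taken from this diff): the preparation tasks would then be chained ahead of the transfer task inside the same `with DAG(...) as dag:` block. `task_transfer_s3_to_redshift` is a placeholder name for the transfer task this PR adds elsewhere.

```python
    # Hypothetical dependency wiring, continuing the `with DAG(...) as dag:`
    # block above; `task_transfer_s3_to_redshift` is a placeholder for the
    # S3-to-Redshift transfer task, not a name taken from this diff.
    [preparation__task_load_sample_data_to_s3,
     preparation__task_create_table] >> task_transfer_s3_to_redshift
```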

##########
File path: scripts/ci/docker-compose/local.yml
##########
@@ -62,6 +62,7 @@ services:
       - HOST_USER_ID
       - HOST_GROUP_ID
       - HOST_HOME=${HOME}
+      - HOST_AIRFLOW_SOURCES

Review comment:
       and here.

##########
File path: Dockerfile.ci
##########
@@ -308,26 +308,32 @@ RUN if [[ -n "${ADDITIONAL_PYTHON_DEPS}" ]]; then \
         pip install ${ADDITIONAL_PYTHON_DEPS}; \
     fi
 
+# Additional binaries to load if needed
 RUN \
     AWSCLI_IMAGE="amazon/aws-cli:latest" && \
     AZURECLI_IMAGE="mcr.microsoft.com/azure-cli:latest" && \
     GCLOUD_IMAGE="gcr.io/google.com/cloudsdktool/cloud-sdk:latest" && \
-    echo "docker run --rm -it -v \${HOST_HOME}/.aws:/root/.aws ${AWSCLI_IMAGE} \"\$@\"" \
-        > /usr/bin/aws && \
+    TERRAFORM_IMAGE="hashicorp/terraform:latest" && \
+    echo "#!/bin/bash" | tee -a /usr/bin/aws /usr/bin/az /usr/bin/bq /usr/bin/gcloud /usr/bin/gsutil /usr/bin/terraform && \
+    echo "docker run --rm -it -v \${HOST_HOME}/.aws:/root/.aws -v \${HOST_AIRFLOW_SOURCES}:/opt/airflow ${AWSCLI_IMAGE} \"\$@\"" \
+        >> /usr/bin/aws && \
     echo "docker pull ${AWSCLI_IMAGE}" > /usr/bin/aws-update && \
-    echo "docker run --rm -it -v \${HOST_HOME}/.azure:/root/.azure ${AZURECLI_IMAGE} \"\$@\"" \
-        > /usr/bin/az && \
+    echo "docker run --rm -it -v \${HOST_HOME}/.azure:/root/.azure -v \${HOST_AIRFLOW_SOURCES}:/opt/airflow ${AZURECLI_IMAGE} \"\$@\"" \
+        >> /usr/bin/az && \
     echo "docker pull ${AZURECLI_IMAGE}" > /usr/bin/az-update && \
-    echo "docker run --rm -it -v \${HOST_HOME}/.config:/root/.config ${GCLOUD_IMAGE} bq \"\$@\"" \
-        > /usr/bin/bq && \
+    echo "docker run --rm -it -v \${HOST_HOME}/.config:/root/.config -v \${HOST_AIRFLOW_SOURCES}:/opt/airflow ${GCLOUD_IMAGE} bq \"\$@\"" \
+        >> /usr/bin/bq && \
     echo "docker pull ${GCLOUD_IMAGE}" > /usr/bin/bq-update && \
-    echo "docker run --rm -it -v \${HOST_HOME}/.config:/root/.config ${GCLOUD_IMAGE} gcloud \"\$@\"" \
-        > /usr/bin/gcloud && \
+    echo "docker run --rm -it -v \${HOST_HOME}/.config:/root/.config -v \${HOST_AIRFLOW_SOURCES}:/opt/airflow ${GCLOUD_IMAGE} gcloud \"\$@\"" \
+        >> /usr/bin/gcloud && \
     echo "docker pull ${GCLOUD_IMAGE}" > /usr/bin/gcloud-update && \
-    echo "docker run --rm -it -v \${HOST_HOME}/.config:/root/.config ${GCLOUD_IMAGE} gsutil \"\$@\"" \
-        > /usr/bin/gsutil && \
+    echo "docker run --rm -it -v \${HOST_HOME}/.config:/root/.config -v \${HOST_AIRFLOW_SOURCES}:/opt/airflow ${GCLOUD_IMAGE} gsutil \"\$@\"" \
+        >> /usr/bin/gsutil && \
     echo "docker pull ${GCLOUD_IMAGE}" > /usr/bin/gsutil-update && \
-    chmod a+x /usr/bin/aws /usr/bin/az /usr/bin/bq /usr/bin/gcloud /usr/bin/gsutil
+    echo "docker run --rm -it -v \${HOST_HOME}/.aws:/root/.aws -v \${HOST_HOME}/.azure:/root/.azure -v \${HOST_HOME}/.config:/root/.config -v \${HOST_AIRFLOW_SOURCES}:/opt/airflow -w /opt/airflow --env-file <(env | grep TF) ${TERRAFORM_IMAGE} \"\$@\"" \
+        >> /usr/bin/terraform && \

Review comment:
       Here I added the config dirs for AWS, Azure and Google so terraform can use them to authenticate.
   
   I also added `-v \${HOST_AIRFLOW_SOURCES}:/opt/airflow` to all of them so they can use the airflow sources. For example, terraform needs to access its `*.tf` and `*.tfvars` files.
   
   `--env-file <(env | grep TF)` <- this is an interesting one. I did this to share all `TF*` env variables, so you can define those in `variables.env` from breeze and we do not need another file. Unfortunately we can't just use `/files/airflow-breeze-config/variables.env` because this env file also contains bash syntax - not only env var declarations.
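   As a rough illustration (not part of this change), the process substitution forwards every `env` line that contains `TF` - for example the `TF_VAR_<name>` variables terraform reads as input variables. In Python terms it is roughly:

```python
# Rough Python equivalent (illustration only) of `--env-file <(env | grep TF)`:
# keep every "NAME=value" line containing "TF" - e.g. TF_VAR_* variables that
# terraform reads as input variables - in the docker --env-file format.
import os

env_file_lines = [
    f"{name}={value}"
    for name, value in os.environ.items()
    if "TF" in f"{name}={value}"
]
print("\n".join(env_file_lines))
```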

##########
File path: scripts/ci/docker-compose/local-prod.yml
##########
@@ -40,4 +40,5 @@ services:
       - HOST_USER_ID
       - HOST_GROUP_ID
       - HOST_HOME=${HOME}
+      - HOST_AIRFLOW_SOURCES

Review comment:
       @potiuk I added `HOST_AIRFLOW_SOURCES` here and set it in my `variables.env` file. WDYT?

##########
File path: Dockerfile.ci
##########
@@ -308,26 +308,32 @@ RUN if [[ -n "${ADDITIONAL_PYTHON_DEPS}" ]]; then \
         pip install ${ADDITIONAL_PYTHON_DEPS}; \
     fi
 
+# Additional binaries to load if needed
 RUN \
     AWSCLI_IMAGE="amazon/aws-cli:latest" && \
     AZURECLI_IMAGE="mcr.microsoft.com/azure-cli:latest" && \
     GCLOUD_IMAGE="gcr.io/google.com/cloudsdktool/cloud-sdk:latest" && \
-    echo "docker run --rm -it -v \${HOST_HOME}/.aws:/root/.aws ${AWSCLI_IMAGE} \"\$@\"" \
-        > /usr/bin/aws && \
+    TERRAFORM_IMAGE="hashicorp/terraform:latest" && \
+    echo "#!/bin/bash" | tee -a /usr/bin/aws /usr/bin/az /usr/bin/bq /usr/bin/gcloud /usr/bin/gsutil /usr/bin/terraform && \

Review comment:
       I added this line so Python can run them properly - all of these wrapper scripts need a `#!/bin/bash` shebang.
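   For context, a minimal sketch (not from the PR) of why the shebang matters: the system tests run these wrappers through `subprocess`, which uses `execve()`, so a script without a `#!/bin/bash` line fails instead of running under bash.

```python
# Minimal sketch: calling one of the wrapper scripts the way the system
# tests do. Without the `#!/bin/bash` shebang, execve() cannot determine an
# interpreter and this raises "OSError: [Errno 8] Exec format error".
import subprocess

subprocess.run(["terraform", "version"], check=True)
```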

##########
File path: tests/test_utils/terraform.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from tests.test_utils.system_tests_class import SystemTest
+
+
+class Terraform(SystemTest):
+    TERRAFORM_DIR: str
+
+    def setUp(self) -> None:
+        self.execute_cmd(["terraform", "init", "-input=false", self.TERRAFORM_DIR])
+        self.execute_cmd(["terraform", "plan", "-input=false", self.TERRAFORM_DIR])
+        self.execute_cmd(["terraform", "apply", "-input=false", "-auto-approve", self.TERRAFORM_DIR])
+
+    def get_tf_output(self, name):
+        output = self.check_output(["terraform", "output", name]).decode('utf-8').replace("\r\n", "")
+        self.log.info(output)
+        return output
+
+    def tearDown(self) -> None:
+        self.execute_cmd(["terraform", "plan", "-destroy", "-input=false", self.TERRAFORM_DIR])
+        self.execute_cmd(["terraform", "destroy", "-input=false", "-auto-approve", self.TERRAFORM_DIR])

Review comment:
       I am not quite happy with this - it can probably be structured better.
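   For reference, a hypothetical usage sketch of this base class; the test class name, `TERRAFORM_DIR` path and DAG details below are illustrative placeholders (assuming the `run_dag()` helper from the existing `SystemTest` base class), not code from this PR:

```python
# Hypothetical subclass of the Terraform base class above; the directory,
# DAG id and dag folder are illustrative placeholders, not from this PR.
from tests.test_utils.terraform import Terraform


class TestExampleS3ToRedshiftSystem(Terraform):
    TERRAFORM_DIR = "tests/providers/amazon/aws/infrastructure/example_s3_to_redshift"

    def test_run_example_dag(self):
        # setUp() has already run terraform init/plan/apply for TERRAFORM_DIR;
        # tearDown() will destroy the infrastructure afterwards.
        self.run_dag("example_s3_to_redshift", "airflow/providers/amazon/aws/example_dags")
```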




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org