Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/14 08:04:43 UTC

[GitHub] [airflow] Sanchit112 commented on a change in pull request #16084: Added new pipeline example for the tutorial docs (Issue #11208)

Sanchit112 commented on a change in pull request #16084:
URL: https://github.com/apache/airflow/pull/16084#discussion_r650641615



##########
File path: docs/apache-airflow/tutorial.rst
##########
@@ -376,3 +376,130 @@ Here's a few things you might want to do next:
       - Review the :ref:`List of operators <pythonapi:operators>`
       - Review the :ref:`Macros reference<macros>`
     - Write your first pipeline!
+
+
+Let's look at another example: we need to get some data from a file hosted online and insert it into our local database. We also need to remove duplicate rows while inserting.
+
+Initial setup
+'''''''''''''
+We need to have Docker and Postgres installed.
+We will be using this `Docker Compose file <https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#docker-compose-yaml>`_.
+Follow the instructions there to set up Airflow.
+
+Create the Employees tables in Postgres using the following:
+
+.. code-block:: sql
+
+  create table "Employees"
+  (
+      "Serial Number" numeric not null
+          constraint employees_pk
+              primary key,
+      "Company Name" text,
+      "Employee Markme" text,
+      "Description" text,
+      "Leave" integer
+  );
+
+  create table "Employees_temp"
+  (
+      "Serial Number" numeric not null
+          constraint employees_temp_pk
+              primary key,
+      "Company Name" text,
+      "Employee Markme" text,
+      "Description" text,
+      "Leave" integer
+  );
+
+Let's break this down into 3 steps: get data, insert data, merge data:
+
+.. code-block:: python
+
+  def get_data():
+      url = "https://docs.google.com/uc?export=download&id=1a0RGUW2oYxyhIQYuezG_u8cxgUaAQtZw"

Review comment:
       I moved the file to my drive



