Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/04 03:34:17 UTC

[GitHub] [airflow] josh-fell opened a new pull request #19400: Cleanup dynamic `start_date` use for miscellaneous Google example DAGs

josh-fell opened a new pull request #19400:
URL: https://github.com/apache/airflow/pull/19400


   There is an ongoing effort to enhance example DAGs by setting static values for `start_date`, ensuring `catchup=False` is set, cleaning up and/or implementing useful `default_args`, and removing/limiting the use of redundant, default connection ID values.
   
   This PR mainly addresses the static `start_date` and `catchup` cleanup in example DAGs across the non-Cloud Google providers (e.g. Ads, Firebase, LevelDB, Marketing Platform, and Suite).
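
To illustrate why a static `start_date` is preferred, here is a minimal sketch that does not use Airflow itself. `days_ago_like` is a hypothetical stand-in that assumes a dynamic helper returns midnight `n` days before the current time, and the two parse times are made-up values. It only shows that a dynamic value shifts each time the DAG file is parsed, while a fixed `datetime` does not.

```python
from datetime import datetime, timedelta


def days_ago_like(n, now):
    """Hypothetical stand-in for a dynamic start date: midnight, n days before `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)


# Two parses of the same DAG file on consecutive days (made-up timestamps).
parse_times = [datetime(2021, 11, 3, 9, 30), datetime(2021, 11, 4, 9, 30)]

dynamic_starts = [days_ago_like(1, now) for now in parse_times]
static_starts = [datetime(2021, 1, 1) for _ in parse_times]

# The dynamic value moves with the clock; the static one stays put.
print(dynamic_starts[0] != dynamic_starts[1])  # True
print(static_starts[0] == static_starts[1])    # True
```

With a fixed date well in the past, `catchup=False` is what keeps the scheduler from backfilling every interval between that date and today.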
   
   The following additional (small) updates are also included:
   - Updated the `example_campaign_manager` example DAG to take advantage of `XComArgs` rather than the classic Jinja templating to pull `XComs`.
   - Updated task-dependency notation in `example_firestore` to improve readability using `chain()`.
   - Removed unnecessary setting of `dag=dag` in `example_leveldb`.
   
   > Note: There was an opportunity to consolidate task args to `default_args` for most, if not all, DAGs. However, these updates were not implemented as they would have conflicted too much with the operator documentation that references the example DAGs.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] o-nikolas commented on a change in pull request #19400: Cleanup dynamic `start_date` use for miscellaneous Google example DAGs

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on a change in pull request #19400:
URL: https://github.com/apache/airflow/pull/19400#discussion_r743229736



##########
File path: airflow/providers/google/marketing_platform/example_dags/example_campaign_manager.py
##########
@@ -87,20 +87,21 @@
 with models.DAG(
     "example_campaign_manager",
     schedule_interval='@once',  # Override to match your needs,
-    start_date=dates.days_ago(1),
+    start_date=datetime(2021, 1, 1),
+    catchup=False,
 ) as dag:
     # [START howto_campaign_manager_insert_report_operator]
     create_report = GoogleCampaignManagerInsertReportOperator(
         profile_id=PROFILE_ID, report=REPORT, task_id="create_report"
     )
-    report_id = "{{ task_instance.xcom_pull('create_report')['id'] }}"
+    report_id = create_report.output["report_id"]
     # [END howto_campaign_manager_insert_report_operator]
 
     # [START howto_campaign_manager_run_report_operator]
     run_report = GoogleCampaignManagerRunReportOperator(
         profile_id=PROFILE_ID, report_id=report_id, task_id="run_report"
     )
-    file_id = "{{ task_instance.xcom_pull('run_report')['id'] }}"
+    file_id = run_report.output["file_id"]

Review comment:
       Caught up offline with Josh, and this now makes sense to me :) `GoogleCampaignManagerRunReportOperator` also sets a separate XCom key, `file_id`, that stores just the file ID (which is what Josh is using now), whereas the previous code ignored this key and fetched the `id` from the full response stored in the default `return_value` XCom.
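
The difference between the two access paths can be sketched with a plain dictionary standing in for XCom storage. Everything here is an assumption for illustration: `xcom_push` and `xcom_pull` are hypothetical helpers rather than Airflow's API, and the response payload is made up. The sketch only shows that indexing `['id']` out of the full response under the default key and reading a dedicated `file_id` key yield the same value.

```python
# Hypothetical in-memory stand-in for XCom storage: {task_id: {key: value}}.
xcoms = {}


def xcom_push(task_id, key, value):
    """Store a value under (task_id, key)."""
    xcoms.setdefault(task_id, {})[key] = value


def xcom_pull(task_id, key="return_value"):
    """Fetch a value; with no key given, read the default return-value slot."""
    return xcoms[task_id][key]


# Assumed shape of what the run-report task stores: the full API response
# under the default key, plus the file ID alone under a dedicated key.
response = {"id": "12345", "status": "PROCESSING"}  # made-up values
xcom_push("run_report", "return_value", response)
xcom_push("run_report", "file_id", response["id"])

old_style = xcom_pull("run_report")["id"]       # index into the full response
new_style = xcom_pull("run_report", "file_id")  # read the dedicated key
print(old_style == new_style)  # True
```

In the diff above, `run_report.output["file_id"]` plays the role of `new_style`: the subscript selects an XCom key, not a field inside the returned response.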







[GitHub] [airflow] github-actions[bot] commented on pull request #19400: Cleanup dynamic `start_date` use for miscellaneous Google example DAGs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19400:
URL: https://github.com/apache/airflow/pull/19400#issuecomment-969942825


   The PR is likely OK to be merged with just a subset of tests for the default Python and database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] o-nikolas commented on a change in pull request #19400: Cleanup dynamic `start_date` use for miscellaneous Google example DAGs

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on a change in pull request #19400:
URL: https://github.com/apache/airflow/pull/19400#discussion_r743193187



##########
File path: airflow/providers/google/marketing_platform/example_dags/example_campaign_manager.py
##########
@@ -87,20 +87,21 @@
 with models.DAG(
     "example_campaign_manager",
     schedule_interval='@once',  # Override to match your needs,
-    start_date=dates.days_ago(1),
+    start_date=datetime(2021, 1, 1),
+    catchup=False,
 ) as dag:
     # [START howto_campaign_manager_insert_report_operator]
     create_report = GoogleCampaignManagerInsertReportOperator(
         profile_id=PROFILE_ID, report=REPORT, task_id="create_report"
     )
-    report_id = "{{ task_instance.xcom_pull('create_report')['id'] }}"
+    report_id = create_report.output["report_id"]
     # [END howto_campaign_manager_insert_report_operator]
 
     # [START howto_campaign_manager_run_report_operator]
     run_report = GoogleCampaignManagerRunReportOperator(
         profile_id=PROFILE_ID, report_id=report_id, task_id="run_report"
     )
-    file_id = "{{ task_instance.xcom_pull('run_report')['id'] }}"
+    file_id = run_report.output["file_id"]

Review comment:
       Should the index into `['id']` here have been dropped?







[GitHub] [airflow] potiuk merged pull request #19400: Cleanup dynamic `start_date` use for miscellaneous Google example DAGs

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #19400:
URL: https://github.com/apache/airflow/pull/19400


   




