You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/17 08:00:47 UTC

[GitHub] [airflow] ashb opened a new pull request, #25758: Used many-to-many relation for finding attached dataset events.

ashb opened a new pull request, #25758:
URL: https://github.com/apache/airflow/pull/25758

   The previous PR added this join table -- by using it we make the
   association of which events created a DagRun fixed at DagRun creation
   time; previously if you deleted old DagRuns then the oldest remaining
   run would "collect" all the oldest events.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #25758: Used many-to-many relation for finding attached dataset events.

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #25758:
URL: https://github.com/apache/airflow/pull/25758#discussion_r948348140


##########
airflow/api_connexion/endpoints/dag_run_endpoint.py:
##########
@@ -116,43 +115,12 @@ def get_upstream_dataset_events(
             "DAGRun not found",
             detail=f"DAGRun with DAG ID: '{dag_id}' and DagRun ID: '{dag_run_id}' not found",
         )
-    events = _get_upstream_dataset_events(dag_run=dag_run, session=session)
+    events = dag_run.dataset_events

Review Comment:
   `consumed_dataset_events` it is.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb merged pull request #25758: Used many-to-many relation for finding attached dataset events.

Posted by GitBox <gi...@apache.org>.
ashb merged PR #25758:
URL: https://github.com/apache/airflow/pull/25758


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #25758: Used many-to-many relation for finding attached dataset events.

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #25758:
URL: https://github.com/apache/airflow/pull/25758#discussion_r947601322


##########
airflow/api_connexion/endpoints/dag_run_endpoint.py:
##########
@@ -116,43 +115,12 @@ def get_upstream_dataset_events(
             "DAGRun not found",
             detail=f"DAGRun with DAG ID: '{dag_id}' and DagRun ID: '{dag_run_id}' not found",
         )
-    events = _get_upstream_dataset_events(dag_run=dag_run, session=session)
+    events = dag_run.dataset_events

Review Comment:
   The only events on a DagRun are the events that that dag run "consumes" -- DagRuns don't produce datasets.
   
   Perhaps the names should change, but the tests still pass (though I did change them, but I didn't change what dag the event is generated against)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] norm commented on pull request #25758: Used many-to-many relation for finding attached dataset events.

Posted by GitBox <gi...@apache.org>.
norm commented on PR #25758:
URL: https://github.com/apache/airflow/pull/25758#issuecomment-1218202920

   Apart from the backref name, LGTM. Lots of deleted lines 👍.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #25758: Used many-to-many relation for finding attached dataset events.

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #25758:
URL: https://github.com/apache/airflow/pull/25758#discussion_r947593019


##########
airflow/api_connexion/endpoints/dag_run_endpoint.py:
##########
@@ -116,43 +115,12 @@ def get_upstream_dataset_events(
             "DAGRun not found",
             detail=f"DAGRun with DAG ID: '{dag_id}' and DagRun ID: '{dag_run_id}' not found",
         )
-    events = _get_upstream_dataset_events(dag_run=dag_run, session=session)
+    events = dag_run.dataset_events

Review Comment:
   This seems to be a different value. The previous function retrieves events that triggered `dag_run`, but `dag_run.dataset_events` is the events emitted by the run (if I understand the logic correctly)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] norm commented on a diff in pull request #25758: Used many-to-many relation for finding attached dataset events.

Posted by GitBox <gi...@apache.org>.
norm commented on code in PR #25758:
URL: https://github.com/apache/airflow/pull/25758#discussion_r948134956


##########
airflow/api_connexion/endpoints/dag_run_endpoint.py:
##########
@@ -116,43 +115,12 @@ def get_upstream_dataset_events(
             "DAGRun not found",
             detail=f"DAGRun with DAG ID: '{dag_id}' and DagRun ID: '{dag_run_id}' not found",
         )
-    events = _get_upstream_dataset_events(dag_run=dag_run, session=session)
+    events = dag_run.dataset_events

Review Comment:
   Without prior knowledge the name does imply events *from* the DagRun, not *causing* the DagRun, at least to my eyes. `consumed_dataset_events` perhaps?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org