You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "blag (via GitHub)" <gi...@apache.org> on 2023/02/09 23:32:32 UTC

[GitHub] [airflow] blag commented on a diff in pull request #29441: datasets, next_run_datasets, remove unnecessary timestamp filter

blag commented on code in PR #29441:
URL: https://github.com/apache/airflow/pull/29441#discussion_r1102128972


##########
airflow/www/views.py:
##########
@@ -3715,7 +3715,6 @@ def next_run_datasets(self, dag_id):
                     DatasetEvent,
                     and_(
                         DatasetEvent.dataset_id == DatasetModel.id,
-                        DatasetEvent.timestamp > DatasetDagRunQueue.created_at,

Review Comment:
   😬  It's been a long minute since I wrote this, but...
   
   I believe when I wrote this the intent with the `lastUpdate` field was to only show last updates _since the last time the dag was queued/run_. But yeah, the `lastUpdate` label isn't descriptive enough for that.
   
   Option 1: Personally, I would consider changing the name of what the `lastUpdate` field is rendered to, something like "Last update since last run" or something more wordy.
   
   Option 2: But if you don't want to do that, and you want to display the last update for every dataset regardless of whether has already been "consumed" by a DagRun (eg: in either the DatasetDagRunQueue or actually scheduled into a DagRun), then yeah it makes sense to remove this filter. However, I would also remove the `and_` around it since then there would only be one filter condition in that join:
   
   ```python
                   .join(
                       DatasetEvent,
                       DatasetEvent.dataset_id == DatasetModel.id,
                       isouter=True,
                   )
   ```
   
   If you go for option 2, I think you should be able to compare the existence and creation time of the DDRQ with the DatasetEvent timestamp to figure out whether or not the last update time has already triggered a DagRun or if it has partially satisfied the conditions of a future DagRun.
   
   Hope this makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org