Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/16 14:53:28 UTC
[GitHub] [airflow] AnithaG-Oak opened a new issue #19618: Execution_date not rendering after airflow upgrade
AnithaG-Oak opened a new issue #19618:
URL: https://github.com/apache/airflow/issues/19618
### Apache Airflow version
2.2.2 (latest released)
### Operating System
Debian GNU/Linux
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.3.0
apache-airflow-providers-apache-spark==2.0.1
apache-airflow-providers-cncf-kubernetes==2.1.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-sqlite==2.0.1
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### What happened
Hi,
We recently upgraded airflow from 2.1.0 to 2.2.2 (2.1.0 to 2.2.0 to 2.2.1 to 2.2.2) and DAGs aren't running as expected. All these DAGs were added before the upgrade itself and they were running fine.
We use the execution_date parameter in SparkSubmitOperator. It rendered fine before the upgrade, but now it fails, returning None:
"{{ (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat() }}"
DAG run fails with the error
jinja2.exceptions.UndefinedError: 'None' has no attribute 'isoformat'
We tried wiping the database and running it as a fresh DAG, but we still get the same error.
Any help would be appreciated
### What you expected to happen
_No response_
### How to reproduce
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970626298
> > Could you confirm if dag_run.logical_date would work for this ?
>
> Can you try it please? I cannot "confirm it", but if it works, that will give us a clue about where to look further.
dag_run.logical_date didn't work either
<img width="1780" alt="Screenshot 2021-11-17 at 1 20 45 AM" src="https://user-images.githubusercontent.com/78701588/142055827-b81115ff-0455-46cd-993d-df1ae30a0d68.png">
<img width="1780" alt="Screenshot 2021-11-17 at 1 21 11 AM" src="https://user-images.githubusercontent.com/78701588/142055844-94bb6d30-5a32-4d61-aead-15379d87c4cd.png">
[GitHub] [airflow] potiuk edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970551844
I believe I know what the problem is. Yes, `dag_run.logical_date` should work. @uranusjr
@uranusjr - this is the `coerce_datetime` call in `following_schedule`:
```
def following_schedule(self, dttm):
    """
    Calculates the following schedule for this dag in UTC.
    :param dttm: utc datetime
    :return: utc datetime
    """
    warnings.warn(
        "`DAG.following_schedule()` is deprecated. Use `DAG.next_dagrun_info(restricted=False)` instead.",
        category=DeprecationWarning,
        stacklevel=2,
    )
    data_interval = self.infer_automated_data_interval(timezone.coerce_datetime(dttm))
    next_info = self.next_dagrun_info(data_interval, restricted=False)
    if next_info is None:
        return None
    return next_info.data_interval.start
```
Here is `coerce_datetime`:
```
def coerce_datetime(v: Union[None, dt.datetime, DateTime]) -> Optional[DateTime]:
    """Convert whatever is passed in to an timezone-aware ``pendulum.DateTime``."""
    if v is None:
        return None
    if v.tzinfo is None:
        v = make_aware(v)
    if isinstance(v, DateTime):
        return v
    return pendulum.instance(v)
```
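For reference, the behaviour of the quoted helper can be approximated with the standard library alone. The sketch below is an illustration, not Airflow's code: `coerce_datetime_sketch` is a hypothetical name, it uses `datetime` instead of `pendulum`, and it assumes naive values are UTC (Airflow's `make_aware` uses the configured default timezone). It suggests the helper yields `None` only when it is given `None`.

```python
import datetime as dt
from typing import Optional


def coerce_datetime_sketch(v: Optional[dt.datetime]) -> Optional[dt.datetime]:
    """Stdlib approximation of the quoted helper (hypothetical name)."""
    if v is None:
        # None is passed straight through.
        return None
    if v.tzinfo is None:
        # Assumption: treat naive datetimes as UTC.
        v = v.replace(tzinfo=dt.timezone.utc)
    return v


naive = dt.datetime(2021, 11, 16, 14, 53, 28)
aware = coerce_datetime_sketch(naive)
```

Under these assumptions, a non-`None` input always comes back as a timezone-aware value, so a `None` result would have to originate elsewhere.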
[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971215617
> But the calendar widget introduced in #16141 does not employ sub-second accuracy, so the microseconds become zero, and all manually-triggered runs now evaluate the expression to `dag.following_schedule(execution_date).isoformat()` instead, which always fails because a manually triggered run never has a following schedule (both prior and after AIP-39).

Also, this reasoning doesn't hold for this issue, since the error is seen for scheduled sub DAGs.
[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970487106
> > It creates a SparkSubmitOperator. I have updated the example
>
> OK. Now we need to know what `spark_args()` does as well.
Ah, my bad. It creates `application_args` (`List[str]`) from the args and kwargs. Something like this:
```
from typing import List


def spark_args(*args, **kwargs) -> List[str]:
    # Positional arguments are passed through unchanged.
    arr = list(args)
    # Each keyword argument becomes a "--key value" pair.
    for key, value in kwargs.items():
        arr += [f"--{key}", str(value)]
    return arr
```
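A short usage sketch, assuming the helper behaves as written above. The Jinja string is passed as an ordinary keyword argument and ends up verbatim in the returned list; it would only be rendered later, assuming `application_args` is a templated field of the operator, which the rendering error in this issue suggests. The file name `job.py` and the shortened template are made up for the example.

```python
from typing import List


def spark_args(*args, **kwargs) -> List[str]:
    # Same logic as the helper described in the comment above.
    arr = list(args)
    for key, value in kwargs.items():
        arr += [f"--{key}", str(value)]
    return arr


# Hypothetical call: the template is still an unrendered string at this point.
template = "{{ execution_date.isoformat() }}"
result = spark_args("job.py", execution_date=template)
print(result)
```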
[GitHub] [airflow] potiuk edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970557564
Ah, I see the `@once`. Indeed, you are right.
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970476229
> It creates a SparkSubmitOperator. I have updated the example
OK. Now we need to know what `spark_args()` does as well.
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970471389
It creates a SparkSubmitOperator. I have updated the example
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970591367
When the DAGs are manually triggered (after a couple of runs), the date renders. But in scheduled runs, it always fails, returning None.
[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970555814
I can see why the error happens, but not why the error is wrong. A `@once` schedule does not have a following schedule (because it runs only once?), so `dag.following_schedule(execution_date)` is correct to return `None`. This also should be the case if you use `dag_run.logical_date`.
[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971201961
One correction. `(execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()` fails for scheduled runs as well. **`dag.following_schedule(execution_date)` returns null for scheduled sub dags**
Even if we use `dag_run.run_type` to figure out the run type, what is the right way to get execution_date for scheduled runs?
[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971202767
`execution_date` will still work until 3.0. `logical_date` is the new variable name (the value is the same). But given that you’re using `following_schedule`, I suspect `data_interval_end` actually provides more suitable semantics for your use case.
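As a rough illustration of why `data_interval_end` may fit, consider a fixed daily schedule. The sketch below uses plain `datetime` arithmetic rather than Airflow; the helper name `daily_data_interval` and the dates are assumptions made for the example. For a scheduled run, the logical date marks the start of the data interval, and the end of the interval is the time the old `dag.following_schedule(execution_date)` call would have produced.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def daily_data_interval(logical_date: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of a one-day data interval (illustrative only)."""
    start = logical_date
    end = start + timedelta(days=1)
    return start, end


logical_date = datetime(2021, 11, 16, tzinfo=timezone.utc)
data_interval_start, data_interval_end = daily_data_interval(logical_date)
print(data_interval_end.isoformat())
```

In a template, assuming Airflow 2.2 or later, the equivalent expression would be `{{ data_interval_end.isoformat() }}`.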
[GitHub] [airflow] boring-cyborg[bot] commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970350854
Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970408823
Please provide a DAG for reproduction.
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970462807
Any spark job would do to reproduce this issue since the DAG fails before triggering the job
[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970564124
Closing since this is not a bug; everything is working as expected. The template worked by accident previously and unfortunately breaks now. It’s still a bug in user code and can only be fixed by the user.
[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970562961
Before the timetable implementation, `schedule_interval="@once"` was normalised to `normalized_schedule_interval=None` internally. That matches neither branch, so `DAG.following_schedule()` returned None regardless of what you passed to it.
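The point about branches can be sketched with a simplified model of the pre-timetable method. This is not the Airflow source: `following_schedule_sketch` is a hypothetical function, and the cron-string branch is left out so the example needs only the standard library. With the interval normalised to `None`, neither branch applies and the result is `None`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Union


def following_schedule_sketch(
    normalized_schedule_interval: Union[str, timedelta, None],
    dttm: datetime,
) -> Optional[datetime]:
    if isinstance(normalized_schedule_interval, str):
        # Airflow handles cron expressions here; omitted in this sketch.
        raise NotImplementedError("cron branch omitted")
    elif normalized_schedule_interval is not None:
        # Fixed timedelta: add it to the given time.
        return dttm + normalized_schedule_interval
    # "@once" is normalised to None: no branch matches, so the result is None.
    return None


run_time = datetime(2021, 11, 16, tzinfo=timezone.utc)
once_result = following_schedule_sketch(None, run_time)
daily_result = following_schedule_sketch(timedelta(days=1), run_time)
```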
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970580063
If a DAG with a schedule_interval is a valid case, is it a coerce_datetime issue?
Could you confirm if dag_run.logical_date would work for this?
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971201961
One correction. `(execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()` fails for scheduled runs and not manual runs. **`dag.following_schedule(execution_date)` returns null for scheduled sub dags**
Even if we use `dag_run.run_type` to figure out the run type, what is the right way to get execution_date for scheduled runs?
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970572917
@potiuk @uranusjr the @once is just one of the example DAGs here. This error is being seen in DAGs with schedule intervals as well
for example:
```
with DAG(
    f"workflow_{type}",
    default_args=DEFAULT_ARGS,
    schedule_interval=f"0 {first},{second} * * *",
    description=f"{type} workflow",
    max_active_runs=1,
    catchup=False,
) as dag:
```
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970560487
or maybe this did not return None before for @once schedule ?
```
next_info = self.timetable.next_dagrun_info(
    last_automated_dagrun=pendulum.instance(dttm),
    restriction=TimeRestriction(earliest=None, latest=None, catchup=True),
)
```
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970618635
> Could you confirm if dag_run.logical_date would work for this ?
Can you try it please? I cannot "confirm it", but if it works, that will give us a clue about where to look further.
[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971198721
Oh I know what’s going on now. We’ve been looking in entirely the wrong places.
The change that actually broke the logic is #16141. Previously, a manually-triggered run always had its execution date set to `timezone.now()`, which has an approximately 99.9999% chance of returning a non-zero microsecond value, so
```python
(execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()
```
is (almost) always evaluated to
```python
execution_date.isoformat()
```
which works.
But the calendar widget introduced in #16141 does not employ sub-second accuracy, so the microseconds become zero, and all manually-triggered runs now evaluate the expression to
```python
dag.following_schedule(execution_date).isoformat()
```
instead, which always fails because a manually triggered run never has a following schedule (both prior and after AIP-39).
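The switch between the two branches can be reproduced without Airflow. In this stdlib-only sketch, `render` is a hypothetical stand-in for the template expression, `following` plays the role of the value returned by `dag.following_schedule(execution_date)`, and the timestamps are invented. In plain Python the failure shows up as an `AttributeError`, which corresponds to the `jinja2.exceptions.UndefinedError` reported in the issue.

```python
from datetime import datetime, timezone
from typing import Optional


def render(execution_date: datetime, following: Optional[datetime]) -> str:
    """Mirror of the template: pick a value, then call isoformat() on it."""
    chosen = execution_date if execution_date.microsecond > 0 else following
    return chosen.isoformat()


# A timestamp like timezone.now(): microseconds are almost never zero.
precise = datetime(2021, 11, 16, 14, 53, 28, 123456, tzinfo=timezone.utc)
# A second-precision timestamp: microseconds are zero.
truncated = precise.replace(microsecond=0)

# First branch: the following schedule is never consulted.
print(render(precise, None))

try:
    # Second branch: falls through to None.isoformat().
    render(truncated, None)
    failed = False
except AttributeError:
    failed = True
```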
Ultimately, using the microsecond field to distinguish between scheduled and manual runs is a poor implementation in the first place. There’s still a 0.0001% chance the detection would fail, even if Airflow never broke that. It is both more semantically correct and more reliable to use `dag_run.run_type` instead.
I think the best we can do here is to add an entry to UPDATING.md to describe this behavioural change.
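One way to act on the `dag_run.run_type` suggestion is sketched below. It is an illustration under stated assumptions, not a fix endorsed in the thread: `EXECUTION_DATE_TEMPLATE` and `select_date` are hypothetical names, the template presumes the Airflow 2.2 context variables `logical_date` and `data_interval_end`, and it presumes that scheduled runs report a run type of `"scheduled"`. The Python function mirrors the template's branch so the logic can be checked without Airflow.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical replacement template (an assumption, not taken from the thread).
EXECUTION_DATE_TEMPLATE = (
    "{{ (data_interval_end if dag_run.run_type == 'scheduled' "
    "else logical_date).isoformat() }}"
)


def select_date(
    run_type: str, logical_date: datetime, data_interval_end: datetime
) -> datetime:
    """Plain-Python mirror of the template's branch, for illustration only."""
    return data_interval_end if run_type == "scheduled" else logical_date


logical = datetime(2021, 11, 16, tzinfo=timezone.utc)
interval_end = logical + timedelta(days=1)

scheduled_value = select_date("scheduled", logical, interval_end)
manual_value = select_date("manual", logical, interval_end)
```

Unlike the microsecond check, this branch does not depend on the precision of the timestamp, so a manual run triggered with a whole-second date still takes the manual path.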
[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970461874
```
properties = SparkProperties.from_config("small")


def execution_date():
    return "{{ (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat() }}"


def spark_job(
    task_id: str,
    spark: SparkProperties,
    main_class: Optional[str],
    *args,
    py_files: Optional[str] = None,
    jars: Optional[str] = None,
    **kwargs,
) -> SparkSubmitOperator:
    return SparkSubmitOperator(
        task_id=task_id,
        application=path,
        conf=spark.conf(),
        py_files=py_files,
        jars=jars,
        conn_id=spark.conn_id,
        java_class=main_class or None,
        total_executor_cores=spark.total_executor_cores,
        executor_cores=spark.executor_cores,
        executor_memory=spark.executor_memory,
        driver_memory=spark.driver_memory,
        name=task_id,
        num_executors=spark.num_executors,
        application_args=spark_args(*args, **kwargs),
        env_vars=spark.env_vars(),
    )


def test1() -> SparkSubmitOperator:
    return spark_job(
        task_id=f"test_dag",
        spark=properties,
        main_class="TestSparkJob",
        execution_date=execution_date()
    )


def test_dag(this: DAG) -> DAG:
    with this:
        (label("start") >> test1() >> label("finish"))
    return this


with DAG(
    f"test_workflow", default_args=DEFAULT_ARGS, schedule_interval="@once", max_active_runs=1,
) as dag:
    (test_dag(dag))
```
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970561383
or even this which was before the timetables:
```
def following_schedule(self, dttm):
    """
    Calculates the following schedule for this dag in UTC.
    :param dttm: utc datetime
    :return: utc datetime
    """
    if isinstance(self.normalized_schedule_interval, str):
        # we don't want to rely on the transitions created by
        # croniter as they are not always correct
        dttm = pendulum.instance(dttm)
        naive = timezone.make_naive(dttm, self.timezone)
        cron = croniter(self.normalized_schedule_interval, naive)
        # We assume that DST transitions happen on the minute/hour
        if not self.is_fixed_time_schedule():
            # relative offset (eg. every 5 minutes)
            delta = cron.get_next(datetime) - naive
            following = dttm.in_timezone(self.timezone) + delta
        else:
            # absolute (e.g. 3 AM)
            naive = cron.get_next(datetime)
            tz = self.timezone
            following = timezone.make_aware(naive, tz)
        return timezone.convert_to_utc(following)
    elif self.normalized_schedule_interval is not None:
        return timezone.convert_to_utc(dttm + self.normalized_schedule_interval)
```
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970461874
```
properties = SparkProperties.from_config("small")

def execution_date():
    return "{{ (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat() }}"

def test1() -> SparkSubmitOperator:
    return spark_job(
        task_id=f"test_dag",
        spark=properties,
        main_class="TestSparkJob",
        execution_date=execution_date()
    )

def test_dag(this: DAG) -> DAG:
    with this:
        (label("start") >> test1() >> label("finish"))
    return this

with DAG(
    f"test_workflow", default_args=DEFAULT_ARGS, schedule_interval="@once", max_active_runs=1,
) as dag:
    (test_dag(dag))
```
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970549332
OK. Seems pretty legit then. Hmm. Interesting - we converted `execution_date` to be a lazy-proxy object, so essentially it should work in a backwards-compatible way (correct, @uranusjr?).
I wonder what happens if you replace `execution_date` with `dag_run.logical_date` - see the deprecated values here: https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html?highlight=macros#variables - but it should work regardless, so it would be great to track this one down.
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971197871
Narrowing down further, `dag.following_schedule(execution_date)` returns `None` for scheduled sub-DAGs only. Main DAGs return a value.
[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971215617
> But the calendar widget introduced in #16141 does not employ sub-second accuracy, so the milliseconds become zero, and all manually-triggered runs now evaluate the expression to
> `dag.following_schedule(execution_date).isoformat()`
> instead, which always fails because a manually triggered run never has a following schedule (both prior and after AIP-39).
Also, this reasoning doesn't hold for this issue, since the error is seen for scheduled sub-DAGs.
[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971202767
`execution_date` will still work until 3.0. `logical_date` is the new variable name (the value is the same). But given that you’re using `following_schedule`, I suspect `data_interval_end` actually provides more suitable semantics for your use case.
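A minimal sketch of what that swap could look like in the reporter's DAG file, assuming the end of the data interval really is the value the Spark job needs and that it is populated for the run in question. The constant names are hypothetical; only the template variables (`data_interval_end`, `dag_run.logical_date`) come from the templates reference linked earlier in the thread.

```python
# Template strings only; Airflow renders them with Jinja when the task runs.

# Expression from the report: it can end up calling .isoformat() on None.
LEGACY_TEMPLATE = (
    "{{ (execution_date if execution_date.microsecond > 0 "
    "else dag.following_schedule(execution_date)).isoformat() }}"
)

# Possible replacement (assumption: the job wants the end of the data interval).
DATA_INTERVAL_END_TEMPLATE = "{{ data_interval_end.isoformat() }}"

# Alternative that keeps the old value under its new name.
LOGICAL_DATE_TEMPLATE = "{{ dag_run.logical_date.isoformat() }}"


def execution_date() -> str:
    """Drop-in for the helper of the same name in the reporter's DAG file."""
    return DATA_INTERVAL_END_TEMPLATE
```

Neither replacement calls `dag.following_schedule()`, so neither depends on a following schedule existing for the run.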
[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970561881
The lazy object proxy handles `microsecond` correctly:
```pycon
>>> from lazy_object_proxy import Proxy
>>> from airflow.utils import timezone
>>> from datetime import datetime
>>> p = Proxy(lambda: datetime.now())
>>> p.microsecond
262739
```
But regardless of that, the template line doesn’t really make sense. Whether `execution_date.microsecond > 0` holds is entirely context-dependent: if you run the DAG enough times you’d eventually get a case where `execution_date.microsecond == 0`, and when that happens the line would fail unconditionally.
I’m inclined to categorise this as a user error that we can’t help with.
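The two branches of that template line can be reproduced in plain Python without Airflow. This is only an illustration under stated assumptions: `render_original`, `render_guarded` and `no_following_schedule` are hypothetical names, the callable stands in for `dag.following_schedule`, and falling back to the execution date is just one possible policy, not a recommendation from the maintainers.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

# Stand-in type for dag.following_schedule: may return None.
FollowingSchedule = Callable[[datetime], Optional[datetime]]


def render_original(execution_date: datetime, following_schedule: FollowingSchedule) -> str:
    """Plain-Python equivalent of the template expression from the report."""
    chosen = (
        execution_date
        if execution_date.microsecond > 0
        else following_schedule(execution_date)
    )
    # Raises AttributeError when chosen is None (Jinja reports UndefinedError).
    return chosen.isoformat()


def render_guarded(execution_date: datetime, following_schedule: FollowingSchedule) -> str:
    """Same selection, but keep execution_date when there is no following schedule."""
    chosen = execution_date
    if execution_date.microsecond == 0:
        following = following_schedule(execution_date)
        if following is not None:
            chosen = following
    return chosen.isoformat()


def no_following_schedule(_: datetime) -> Optional[datetime]:
    """Simulates a DAG for which following_schedule returns None."""
    return None


on_the_second = datetime(2021, 11, 16, 14, 0, 0, tzinfo=timezone.utc)
with_micros = on_the_second.replace(microsecond=262739)

print(render_original(with_micros, no_following_schedule))
print(render_guarded(on_the_second, no_following_schedule))
try:
    render_original(on_the_second, no_following_schedule)
except AttributeError as exc:
    print("original expression failed:", exc)
```

The first call succeeds only because the microsecond part happens to be non-zero; the same expression fails as soon as the timestamp lands exactly on a second and no following schedule exists.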
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970558320
Different behaviour of `if execution_date.microsecond > 0`?
[GitHub] [airflow] uranusjr closed issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr closed issue #19618:
URL: https://github.com/apache/airflow/issues/19618
[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970466931
> Any spark job would do to reproduce this issue since the DAG fails before triggering the job
What is `spark_job()` doing? Is it something internal to your organisation?
[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971198721
[GitHub] [airflow] potiuk edited a comment on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970551844
I believe I know what the problem is. Yes, `dag_run.logical_date` should work. @uranusjr - it is the `coerce_datetime` call in `following_schedule`:
```
def following_schedule(self, dttm):
    """
    Calculates the following schedule for this dag in UTC.

    :param dttm: utc datetime
    :return: utc datetime
    """
    warnings.warn(
        "`DAG.following_schedule()` is deprecated. Use `DAG.next_dagrun_info(restricted=False)` instead.",
        category=DeprecationWarning,
        stacklevel=2,
    )
    data_interval = self.infer_automated_data_interval(timezone.coerce_datetime(dttm))
    next_info = self.next_dagrun_info(data_interval, restricted=False)
    if next_info is None:
        return None
    return next_info.data_interval.start
```
Here is `coerce_datetime`:
```
def coerce_datetime(v: Union[None, dt.datetime, DateTime]) -> Optional[DateTime]:
    """Convert whatever is passed in to an timezone-aware ``pendulum.DateTime``."""
    if v is None:
        return None
    if v.tzinfo is None:
        v = make_aware(v)
    if isinstance(v, DateTime):
        return v
    return pendulum.instance(v)
```
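The shape of that coercion step can be sketched with the standard library alone. This is a simplified stand-in, not Airflow's code: `coerce_datetime_sketch` is a hypothetical name, it returns plain `datetime` objects rather than `pendulum.DateTime`, and it assumes naive values should be read as UTC (Airflow's `make_aware` uses the configured default timezone instead).

```python
from datetime import datetime, timezone
from typing import Optional


def coerce_datetime_sketch(v: Optional[datetime]) -> Optional[datetime]:
    """Stdlib-only stand-in for the coercion step (not Airflow's implementation)."""
    if v is None:
        # None is passed through untouched, as in the quoted function.
        return None
    if v.tzinfo is None:
        # Assumption: naive values are interpreted as UTC.
        return v.replace(tzinfo=timezone.utc)
    # Already timezone-aware: return unchanged.
    return v


naive = datetime(2021, 11, 16, 14, 53, 28)
aware = coerce_datetime_sketch(naive)
print(aware.isoformat())
print(coerce_datetime_sketch(None))
```

Note that the coercion itself only ever returns `None` for a `None` input; in the quoted `following_schedule`, the other `None` path is `next_dagrun_info` returning `None`.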
[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade
Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970548765
On a related note, I noticed in the Airflow docs that `{{ execution_date }}` is deprecated.
I see that `{{ ts }}` is equivalent to the ISO-formatted `execution_date`, but it is a string. Is there an equivalent of `execution_date` that is a datetime?