Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/16 14:53:28 UTC

[GitHub] [airflow] AnithaG-Oak opened a new issue #19618: Execution_date not rendering after airflow upgrade

AnithaG-Oak opened a new issue #19618:
URL: https://github.com/apache/airflow/issues/19618


   ### Apache Airflow version
   
   2.2.2 (latest released)
   
   ### Operating System
   
   Debian GNU/Linux
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==2.3.0
   apache-airflow-providers-apache-spark==2.0.1
   apache-airflow-providers-cncf-kubernetes==2.1.0
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-http==2.0.1
   apache-airflow-providers-imap==2.0.1
   apache-airflow-providers-sqlite==2.0.1
   
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   Hi,
   We recently upgraded Airflow from 2.1.0 to 2.2.2 (2.1.0 to 2.2.0 to 2.2.1 to 2.2.2) and DAGs aren't running as expected. All of these DAGs were added before the upgrade and were running fine.
   We pass a templated execution_date argument to SparkSubmitOperator. It rendered fine before the upgrade, but rendering now fails because the expression evaluates to None:
   
   "{{ (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat() }}"
   
   DAG run fails with the error
   
   jinja2.exceptions.UndefinedError: 'None' has no attribute 'isoformat'
   
   We tried wiping the database and running it as a fresh DAG, but we still get the same error.
   
   Any help would be appreciated
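
The failure mode reported above can be reproduced outside Airflow. The following is a minimal sketch in plain Python, not the Airflow or Jinja code path: `render_timestamp` and `no_following_schedule` are hypothetical names, and the callable stands in for `dag.following_schedule` under the assumption that it returns `None`. Plain Python raises `AttributeError` where Jinja reports `UndefinedError`.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def render_timestamp(
    execution_date: datetime,
    following_schedule: Callable[[datetime], Optional[datetime]],
) -> str:
    # Same shape as the template: keep the date when it carries microseconds,
    # otherwise ask for the following schedule.
    chosen = (
        execution_date
        if execution_date.microsecond > 0
        else following_schedule(execution_date)
    )
    return chosen.isoformat()


def no_following_schedule(_: datetime) -> Optional[datetime]:
    # Hypothetical stand-in for a DAG that reports no following schedule.
    return None


with_micros = datetime(2021, 11, 16, 14, 53, 28, 123456, tzinfo=timezone.utc)
on_the_second = datetime(2021, 11, 16, 14, 0, 0, tzinfo=timezone.utc)

# Non-zero microseconds: the first branch is taken and rendering succeeds.
ok = render_timestamp(with_micros, no_following_schedule)

# Zero microseconds: the second branch yields None and .isoformat() fails.
try:
    render_timestamp(on_the_second, no_following_schedule)
    failed = False
except AttributeError:
    failed = True
```

The expression only fails when both conditions hold: the date has zero microseconds and no following schedule is available.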
   
   ### What you expected to happen
   
   _No response_
   
   ### How to reproduce
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970626298


   > > Could you confirm if dag_run.logical_date would work for this ?
   > 
   > Can you try it please? I cannot "confirm it" but if it works that will give us a clue to look further
   
   dag_run.logical_date didn't work either
   <img width="1780" alt="Screenshot 2021-11-17 at 1 20 45 AM" src="https://user-images.githubusercontent.com/78701588/142055827-b81115ff-0455-46cd-993d-df1ae30a0d68.png">
   <img width="1780" alt="Screenshot 2021-11-17 at 1 21 11 AM" src="https://user-images.githubusercontent.com/78701588/142055844-94bb6d30-5a32-4d61-aead-15379d87c4cd.png">
   
   





[GitHub] [airflow] potiuk edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970551844


   I believe I know what the problem is, @uranusjr. And yes, `dag_run.logical_date` should work.
   
   @uranusjr - this is the `coerce_datetime` call in `following_schedule`:
   
   ```
       def following_schedule(self, dttm):
           """
           Calculates the following schedule for this dag in UTC.
   
           :param dttm: utc datetime
           :return: utc datetime
           """
           warnings.warn(
               "`DAG.following_schedule()` is deprecated. Use `DAG.next_dagrun_info(restricted=False)` instead.",
               category=DeprecationWarning,
               stacklevel=2,
           )
           data_interval = self.infer_automated_data_interval(timezone.coerce_datetime(dttm))
           next_info = self.next_dagrun_info(data_interval, restricted=False)
           if next_info is None:
               return None
           return next_info.data_interval.start
   ```
   
   Here is `coerce_datetime`:
   
   ```
   
   def coerce_datetime(v: Union[None, dt.datetime, DateTime]) -> Optional[DateTime]:
       """Convert whatever is passed in to an timezone-aware ``pendulum.DateTime``."""
       if v is None:
           return None
       if v.tzinfo is None:
           v = make_aware(v)
       if isinstance(v, DateTime):
           return v
       return pendulum.instance(v)
   ```
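
To make the `None` handling in the quoted helper concrete, here is a rough standard-library analogue. It is a sketch only: `coerce_aware` is a hypothetical name, it returns a plain `datetime` rather than a `pendulum.DateTime`, and it assumes UTC as the default timezone for naive values.

```python
from datetime import datetime, timezone
from typing import Optional


def coerce_aware(v: Optional[datetime]) -> Optional[datetime]:
    # None passes straight through, as in the quoted helper.
    if v is None:
        return None
    # Naive values get a timezone attached (UTC is an assumption here).
    if v.tzinfo is None:
        return v.replace(tzinfo=timezone.utc)
    # Aware values are returned unchanged.
    return v


naive = datetime(2021, 11, 16, 14, 0, 0)
aware = coerce_aware(naive)
nothing = coerce_aware(None)
```

A helper of this shape only returns `None` when it is given `None`, so on its own it would not turn a valid execution date into `None`.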





[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971215617


   > But the calendar widget introduced in #16141 does not employ sub-second accuracy, so the microseconds become zero, and all manually-triggered runs now evaluate the expression to dag.following_schedule(execution_date).isoformat()
   > instead, which always fails because a manually triggered run never has a following schedule (both before and after AIP-39).
   
   Also, this reasoning doesn't hold for this issue, since the error is seen for scheduled sub DAGs.





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971215617


   > #16141
   
   Also, this reasoning doesn't hold for this issue, since the error is seen for scheduled sub DAGs.





[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970487106


   > > It creates a SparkSubmitOperator. I have updated the example
   > 
   > OK. Now we need to know what `spark_args()` does as well.
   
   Ah, my bad. It creates application_args (List[str]) from the args and kwargs arguments. Something like this:
   
   ```
   def spark_args(*args, **kwargs) -> List[str]:
       arr = [a for a in args]
       for key, value in kwargs.items():
           arr += [f"--{key}", str(value)]
       return arr
   ```
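
As a usage illustration of the helper above, the sketch below passes a templated string through it. The positional value `"input.csv"` and the simplified template are made-up inputs. The helper does no rendering itself, so the template reaches the operator as plain text and is rendered later, when the task runs.

```python
from typing import List


def spark_args(*args, **kwargs) -> List[str]:
    # Positional arguments are passed through unchanged.
    arr = list(args)
    # Each keyword argument becomes a "--key value" pair.
    for key, value in kwargs.items():
        arr += [f"--{key}", str(value)]
    return arr


# Hypothetical inputs, for illustration only.
template = "{{ execution_date.isoformat() }}"
result = spark_args("input.csv", execution_date=template)
print(result)
```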





[GitHub] [airflow] potiuk edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970557564


   Ah, I see the `@once`. Indeed, you are right.
   





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970476229


   > It creates a SparkSubmitOperator. I have updated the example
   
   OK. Now we need to know what `spark_args()` does as well.





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970471389


   It creates a SparkSubmitOperator. I have updated the example





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970591367


   When the DAGs are manually triggered (after a couple of runs), the date renders. But in scheduled runs, it always fails, returning None.





[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970555814


   I can see why the error happens, but not why the error is wrong. A `@once` schedule does not have a following schedule (because it runs only once?), so `dag.following_schedule(execution_date)` is correct to return `None`. This also should be the case if you use `dag_run.logical_date`.





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970555814


   I can see why the error happens, but not _why_ the error is, well, an error. A `@once` schedule does not have a following schedule (because it runs only once?), so `dag.following_schedule(execution_date)` is correct to return `None`. This should be the case if you use `dag_run.logical_date`.





[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971201961


   One correction. `(execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()` fails for scheduled runs as well. **`dag.following_schedule(execution_date)` returns null for scheduled sub DAGs**
   
   Even if we use `dag_run.run_type` to figure out the run type, what is the right way to get execution_date for scheduled runs?





[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971202767


   `execution_date` will still work until 3.0. `logical_date` is the new variable name (the value is the same). But given that you’re using `following_schedule`, I suspect `data_interval_end` actually provides more suitable semantics for your use case.
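
To illustrate how `logical_date` and `data_interval_end` relate, here is a small sketch for one specific case: a scheduled run on a fixed daily schedule, where the logical date marks the start of the interval the run covers. `daily_data_interval` is a hypothetical helper, not an Airflow function, and the template string is shown as text only.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def daily_data_interval(logical_date: datetime) -> Tuple[datetime, datetime]:
    # Assumption: a fixed daily schedule, with the logical date at the
    # start of the interval and the end one day later.
    return logical_date, logical_date + timedelta(days=1)


logical_date = datetime(2021, 11, 16, tzinfo=timezone.utc)
data_interval_start, data_interval_end = daily_data_interval(logical_date)

# Template form of the suggestion (not rendered here).
TEMPLATE = "{{ data_interval_end.isoformat() }}"

print(data_interval_end.isoformat())
```

Under that assumption, the interval end is the same instant a "following schedule" lookup would have produced for a scheduled run, without calling the deprecated method.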





[GitHub] [airflow] boring-cyborg[bot] commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970350854


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970408823


   Please provide a DAG for reproduction.





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970462807


   Any spark job would do to reproduce this issue since the DAG fails before triggering the job 





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970487106


   > > It creates a SparkSubmitOperator. I have updated the example
   > 
   > OK. Now we need to know what `spark_args()` does as well.
   
   Ah, my bad. It creates application_args (List[str]) from the args and kwargs arguments. Something like this:
   
   ```
   def spark_args(*args, **kwargs) -> List[str]:
       arr = [a for a in args]
       for key, value in kwargs.items():
           arr += [f"--{key}", str(value)]
       return arr
   ```





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971198721


   Oh I know what’s going on now. We’ve been looking at entirely wrong places.
   
   The change that actually broke the logic is #16141. Previously, a manually-triggered run always had its execution date set to `timezone.now()`, which has an approximately 99.9999% chance of returning a non-zero microsecond, so
   
   ```python
   (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()
   ```
   
   is always evaluated to
   
   ```python
   execution_date.isoformat()
   ```
   
   which works.
   
   But the calendar widget introduced in the aforementioned PR does not employ sub-second accuracy, so the microseconds become zero, and all manually-triggered runs now evaluate the expression to
   
   ```python
   dag.following_schedule(execution_date).isoformat()
   ```
   
   instead, which always fails because a manually triggered run never has a following schedule (both before and after AIP-39).
   
   Ultimately, using the microsecond field to distinguish between scheduled and manual runs is a poor implementation in the first place. There’s still a 0.0001% chance the detection would fail, even if Airflow never broke that. It is both more semantically correct and more reliable to use `dag_run.run_type` instead.
   
   I think the best we can do here is to add an entry to UPDATING.md to describe this behavioural change.
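
One way to act on the `dag_run.run_type` advice is sketched below in plain Python. It is not taken from the thread: `FakeDagRun` and `pick_timestamp` are hypothetical stand-ins, the run-type strings `"manual"` and `"scheduled"` are assumed to match the values Airflow stores, and using `data_interval_end` for non-manual runs follows the suggestion made elsewhere in this thread. The template string shows the same decision in Jinja form but is not rendered here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class FakeDagRun:
    # Hypothetical stand-in exposing only the attribute the sketch needs.
    run_type: str


def pick_timestamp(dag_run, execution_date, data_interval_end) -> str:
    # Branch on how the run was created instead of inspecting microseconds.
    if dag_run.run_type == "manual":
        chosen = execution_date
    else:
        chosen = data_interval_end
    return chosen.isoformat()


# The same decision written as a template string (not rendered here).
TEMPLATE = (
    "{{ (execution_date if dag_run.run_type == 'manual' "
    "else data_interval_end).isoformat() }}"
)

start = datetime(2021, 11, 16, 0, 0, tzinfo=timezone.utc)
end = start + timedelta(days=1)

manual = pick_timestamp(FakeDagRun("manual"), start, end)
scheduled = pick_timestamp(FakeDagRun("scheduled"), start, end)
```

Note that the manual run here has zero microseconds and still resolves to its own execution date, which is the case the microsecond check gets wrong.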





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970564124


   Closing since this is not a bug; everything is working as expected. The template accidentally worked previously, and unfortunately breaks now. It’s still a bug in user code and can only be fixed by the user.





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970562961


   Before the timetable implementation, `schedule_interval=@once` was normalised to `normalized_schedule_interval=None` internally. That does not match either branch, so `DAG.following_schedule()` returned None regardless of what you passed to it.





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970580063


   If schedule_interval is a valid case, is it a `coerce_datetime` issue?
   Could you confirm whether dag_run.logical_date would work for this?





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971201961


   One correction. `(execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()` fails for scheduled runs and not manual runs. **`dag.following_schedule(execution_date)` returns null for scheduled sub DAGs**
   
   Even if we use `dag_run.run_type` to figure out the run type, what is the right way to get execution_date for scheduled runs?





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970551844


   Ah, I know what the problem is, @uranusjr. And yes, `dag_run.logical_date` should work.
   
   @uranusjr - this is the `coerce_datetime` call in `following_schedule`:
   
   ```
       def following_schedule(self, dttm):
           """
           Calculates the following schedule for this dag in UTC.
   
           :param dttm: utc datetime
           :return: utc datetime
           """
           warnings.warn(
               "`DAG.following_schedule()` is deprecated. Use `DAG.next_dagrun_info(restricted=False)` instead.",
               category=DeprecationWarning,
               stacklevel=2,
           )
           data_interval = self.infer_automated_data_interval(timezone.coerce_datetime(dttm))
           next_info = self.next_dagrun_info(data_interval, restricted=False)
           if next_info is None:
               return None
           return next_info.data_interval.start
   ```
   
   Here is `coerce_datetime`:
   
   ```
   
   def coerce_datetime(v: Union[None, dt.datetime, DateTime]) -> Optional[DateTime]:
       """Convert whatever is passed in to an timezone-aware ``pendulum.DateTime``."""
       if v is None:
           return None
       if v.tzinfo is None:
           v = make_aware(v)
       if isinstance(v, DateTime):
           return v
       return pendulum.instance(v)
   ```





[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970555814


   I can see why the error happens, but not why the error is wrong. A `@once` schedule does not have a following schedule (because it runs only once?), so `dag.following_schedule(execution_date)` is correct to return `None`. This should be the case if you use `dag_run.logical_date`.





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970572917


   @potiuk @uranusjr the `@once` DAG is just one of the examples here. This error is being seen in DAGs with schedule intervals as well,
   
   for example:
   
   ```
   with DAG(
           f"workflow_{type}",
           default_args=DEFAULT_ARGS,
           schedule_interval=f"0 {first},{second} * * *",
           description=f"{type} workflow",
           max_active_runs=1,
           catchup=False,
       ) as dag:
   ```





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970560487


   Or maybe this did not return None before for an `@once` schedule?
   
   ```
           next_info = self.timetable.next_dagrun_info(
               last_automated_dagrun=pendulum.instance(dttm),
               restriction=TimeRestriction(earliest=None, latest=None, catchup=True),
           )
   ```





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970618635


   > Could you confirm if dag_run.logical_date would work for this ?
   
   Can you try it please? I cannot "confirm it" but if it works that will give us a clue to look further





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970557564


   Ah, I see the "@once". Indeed, you are right.
   





[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971198721


   Oh I know what’s going on now. We’ve been looking at entirely wrong places.
   
   The change that actually broke the logic is #16141. Previously, a manually-triggered run always had its execution date set to `timezone.now()`, which has an approximately 99.9999% chance of returning a non-zero microsecond, so
   
   ```python
   (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat()
   ```
   
   is (almost) always evaluated to
   
   ```python
   execution_date.isoformat()
   ```
   
   which works.
   
   But the calendar widget introduced in #16141 does not employ sub-second accuracy, so the microseconds become zero, and all manually-triggered runs now evaluate the expression to
   
   ```python
   dag.following_schedule(execution_date).isoformat()
   ```
   
   instead, which always fails because a manually triggered run never has a following schedule (both before and after AIP-39).
   
   Ultimately, using the microsecond field to distinguish between scheduled and manual runs is a poor implementation in the first place. There’s still a 0.0001% chance the detection would fail, even if Airflow never broke that. It is both more semantically correct and more reliable to use `dag_run.run_type` instead.
   
   I think the best we can do here is to add an entry to UPDATING.md to describe this behavioural change.





[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970461874


   ```
   properties = SparkProperties.from_config("small")
   
   def execution_date():
       return "{{ (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat() }}"
   
   def spark_job(task_id: str,
       spark: SparkProperties,
       main_class: Optional[str],
       *args,
       py_files: Optional[str] = None,
       jars: Optional[str] = None,
       **kwargs,) -> SparkSubmitOperator:
      return SparkSubmitOperator(
           task_id=task_id,
           application=path,
           conf=spark.conf(),
           py_files=py_files,
           jars=jars,
           conn_id=spark.conn_id,
           java_class=main_class or None,
           total_executor_cores=spark.total_executor_cores,
           executor_cores=spark.executor_cores,
           executor_memory=spark.executor_memory,
           driver_memory=spark.driver_memory,
           name=task_id,
           num_executors=spark.num_executors,
           application_args=spark_args(*args, **kwargs),
           env_vars=spark.env_vars(),
       )
   
   def test1() -> SparkSubmitOperator:
       return spark_job(
           task_id=f"test_dag",
           spark=properties,
           main_class="TestSparkJob",
           execution_date=execution_date()
       )
   
   
   def test_dag(this: DAG) -> DAG:
       with this:
           (label("start") >> test1() >> label("finish"))
       return this
   
   
   with DAG(
       f"test_workflow", default_args=DEFAULT_ARGS, schedule_interval="@once", max_active_runs=1,
   ) as dag:
       (test_dag(dag))
   
   ```





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970561383


   Or even this, which predates the timetables:
   
   ```
       def following_schedule(self, dttm):
           """
           Calculates the following schedule for this dag in UTC.
   
           :param dttm: utc datetime
           :return: utc datetime
           """
           if isinstance(self.normalized_schedule_interval, str):
               # we don't want to rely on the transitions created by
               # croniter as they are not always correct
               dttm = pendulum.instance(dttm)
               naive = timezone.make_naive(dttm, self.timezone)
               cron = croniter(self.normalized_schedule_interval, naive)
   
               # We assume that DST transitions happen on the minute/hour
               if not self.is_fixed_time_schedule():
                   # relative offset (eg. every 5 minutes)
                   delta = cron.get_next(datetime) - naive
                   following = dttm.in_timezone(self.timezone) + delta
               else:
                   # absolute (e.g. 3 AM)
                   naive = cron.get_next(datetime)
                   tz = self.timezone
                   following = timezone.make_aware(naive, tz)
               return timezone.convert_to_utc(following)
           elif self.normalized_schedule_interval is not None:
               return timezone.convert_to_utc(dttm + self.normalized_schedule_interval)
   
   ```





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970461874


   ```
   properties = SparkProperties.from_config("small")
   
   def execution_date():
       return "{{ (execution_date if execution_date.microsecond > 0 else dag.following_schedule(execution_date)).isoformat() }}"
   
   def test1() -> SparkSubmitOperator:
       return spark_job(
           task_id=f"test_dag",
           spark=properties,
           main_class="TestSparkJob",
           execution_date=execution_date()
       )
   
   
   def test_dag(this: DAG) -> DAG:
       with this:
           (label("start") >> test1() >> label("finish"))
       return this
   
   
   with DAG(
       f"test_workflow", default_args=DEFAULT_ARGS, schedule_interval="@once", max_active_runs=1,
   ) as dag:
       (test_dag(dag))
   
   ```





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970549332


   OK. Seems pretty legit then. Hmm. Interesting: we converted `execution_date` to a lazy-proxy object, so essentially it should work in a backwards-compatible way (correct, @uranusjr?).
   
   I wonder what happens if you replace `execution_date` with `dag_run.logical_date` (see the deprecated values here: https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html?highlight=macros#variables). It should work regardless, though, so it would be great to track this one down.





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971197871


   Narrowing down further, `dag.following_schedule(execution_date)` returns `None` for scheduled subDAGs only. Main DAGs return a value.





[GitHub] [airflow] AnithaG-Oak edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971215617


   > But the calendar widget introduced in #16141 does not employ sub-second accuracy, so the microseconds become zero, and all manually-triggered runs now evaluate the expression to
   >
   > `dag.following_schedule(execution_date).isoformat()`
   >
   > instead, which always fails because a manually triggered run never has a following schedule (both before and after AIP-39).
   
   Also, this reasoning doesn't hold for this issue, since the error is seen for scheduled subDAGs.





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971202767


   `execution_date` will still work until 3.0. `logical_date` is the new variable name (the value is the same). But given that you’re using `following_schedule`, I suspect `data_interval_end` actually provides more suitable semantics for your use case.
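
To illustrate why `data_interval_end` may fit here: for a scheduled run on a plain fixed-interval schedule, the logical date is the start of the data interval, and the next schedule after it is the interval's end. A rough sketch in plain Python, assuming a fixed `timedelta` schedule (cron expressions and DST transitions are ignored); `data_interval_for` is a hypothetical helper, not an Airflow API:

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def data_interval_for(
    logical_date: datetime, interval: timedelta
) -> Tuple[datetime, datetime]:
    """Hypothetical model: the interval starts at the logical date."""
    return logical_date, logical_date + interval


logical_date = datetime(2021, 11, 16, tzinfo=timezone.utc)
hourly = timedelta(hours=1)

interval_start, interval_end = data_interval_for(logical_date, hourly)

# Under this assumption, the "following schedule" of the logical date
# is simply the logical date plus one interval.
following = logical_date + hourly

print(interval_end == following)
```

In a template this would reduce the original expression to something like `"{{ data_interval_end.isoformat() }}"` for scheduled runs.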





[GitHub] [airflow] uranusjr commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970561881


   The lazy object proxy handles `microsecond` correctly:
   
   ```pycon
   >>> from lazy_object_proxy import Proxy
   >>> from airflow.utils import timezone
   >>> from datetime import datetime
   >>> p = Proxy(lambda: datetime.now())
   >>> p.microsecond
   262739
   ```
   
   But regardless of that, the template line doesn’t really make sense. Whether `execution_date.microsecond > 0` holds is entirely context-dependent: if you run the DAG enough times, you’d eventually get a case where `execution_date.microsecond == 0`, and when that happens, the line would fail unconditionally.
   
   I’m inclined to categorise this as a user error that we can’t help with.
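
The failure mode described above can be reproduced without Airflow. This is a minimal sketch in plain Python: `render_like_template` is a hypothetical stand-in for the Jinja expression, and `no_following_schedule` models a run for which `dag.following_schedule()` returns `None`. Plain Python raises `AttributeError` where Jinja raises `UndefinedError`, but the cause is the same:

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def render_like_template(
    execution_date: datetime,
    following_schedule: Callable[[datetime], Optional[datetime]],
) -> str:
    """Hypothetical mirror of the template expression from the report."""
    chosen = (
        execution_date
        if execution_date.microsecond > 0
        else following_schedule(execution_date)
    )
    # Fails if following_schedule() returned None.
    return chosen.isoformat()


def no_following_schedule(_: datetime) -> Optional[datetime]:
    """Models a run that has no following schedule."""
    return None


with_micro = datetime(2021, 11, 16, 14, 53, 28, 262739, tzinfo=timezone.utc)
whole_second = with_micro.replace(microsecond=0)

# Non-zero microseconds: the first branch is taken and rendering works.
print(render_like_template(with_micro, no_following_schedule))

# Zero microseconds: the second branch is taken and None has no isoformat().
try:
    render_like_template(whole_second, no_following_schedule)
except AttributeError as exc:
    print(f"rendering failed: {exc}")
```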





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970558320


   Different behaviour of `if execution_date.microsecond > 0` ?
   





[GitHub] [airflow] uranusjr closed issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr closed issue #19618:
URL: https://github.com/apache/airflow/issues/19618


   





[GitHub] [airflow] potiuk commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970466931


   > Any spark job would do to reproduce this issue since the DAG fails before triggering the job
   
   What's `spark_job()` doing? This must be something internal to your organisation?





[GitHub] [airflow] uranusjr edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-971198721









[GitHub] [airflow] potiuk edited a comment on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970551844


   I believe I know what the problem is. Yes, `dag_run.logical_date` should work. @uranusjr, I think it is the `coerce_datetime` call in `following_schedule`:
   
   ```
       def following_schedule(self, dttm):
           """
           Calculates the following schedule for this dag in UTC.
   
           :param dttm: utc datetime
           :return: utc datetime
           """
           warnings.warn(
               "`DAG.following_schedule()` is deprecated. Use `DAG.next_dagrun_info(restricted=False)` instead.",
               category=DeprecationWarning,
               stacklevel=2,
           )
           data_interval = self.infer_automated_data_interval(timezone.coerce_datetime(dttm))
           next_info = self.next_dagrun_info(data_interval, restricted=False)
           if next_info is None:
               return None
           return next_info.data_interval.start
   ```
   
   Here is `coerce_datetime`:
   
   ```
   
   def coerce_datetime(v: Union[None, dt.datetime, DateTime]) -> Optional[DateTime]:
       """Convert whatever is passed in to an timezone-aware ``pendulum.DateTime``."""
       if v is None:
           return None
       if v.tzinfo is None:
           v = make_aware(v)
       if isinstance(v, DateTime):
           return v
       return pendulum.instance(v)
   ```





[GitHub] [airflow] AnithaG-Oak commented on issue #19618: Execution_date not rendering after airflow upgrade

Posted by GitBox <gi...@apache.org>.
AnithaG-Oak commented on issue #19618:
URL: https://github.com/apache/airflow/issues/19618#issuecomment-970548765


   On a related note, I noticed from the Airflow docs that `{{ execution_date }}` is deprecated.
   I see that `{{ ts }}` is equivalent to the ISO-formatted `execution_date`, but it is a string. Is there an equivalent of `execution_date` with a datetime type?

