You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/08 07:01:04 UTC

[GitHub] [airflow] dstandish opened a new pull request #12910: Fix return type in prev-date context variables

dstandish opened a new pull request #12910:
URL: https://github.com/apache/airflow/pull/12910


   Addresses question here https://github.com/apache/airflow/discussions/12503


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment (see heading _On the failing test_)for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a _new_ TI with the execution date of the prev schedule.  
   
   importantly, a brand new TI such as this has no start_date.  so here we must check not only for prev_ti but also that this ti has a start date.
   
   to remove this check, we could ensure that the TI created in the test has a dag run.
   
   but perhaps there is reason why we are testing with TIs that have no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-741587280


   [The Workflow run](https://github.com/apache/airflow/actions/runs/410053575) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538858979



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       yes this or
   ```
   if prev_ti and prev_ti.start_date:
       return pendulum.instance(prev_ti.start_date)
   ```
   
   or 
   ```
   if prev_ti and prev_ti.start_date is not None:
       return pendulum.instance(prev_ti.start_date)
   ```
   All fine with me.  Will wait for guidance on changing test vs changing code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-741533306


   [The Workflow run](https://github.com/apache/airflow/actions/runs/409867843) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538851306



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       I would rather have:
   
   ```python
   return prev_ti and prev_ti.start_date is not None and pendulum.instance(prev_ti.start_date)
   ```
   
   I am not sure if we need to fix test or the code, my brain is not working right now. Looping in @ashb @turbaszek @XD-DENG for reviews




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-741076236


   update... 
   
   **On impact to users**
   
   so the only time that `str(dt)` differs from `str(pendulum.instance(dt))` is when dt has not tz info.
   
   in our case, these attributes already always had tzinfo.
   
   so there is no difference in string representation as a result of this change.
   
   so i don't think there is any significant change in behavior that merits calling out in updating, apart from the mere fact that it's now pendulum instead of datetime
   
   
   **On the failing test**
   
   This was failing in `tests/operators/test_python.py::TestPythonVirtualenvOperator` because the `start_date` on `previous_ti` is not populated in the test code.
   
   See here https://github.com/apache/airflow/blob/ff25bd6ffed53de651254f7c3e6d254e4191bfc7/airflow/models/taskinstance.py#L691-L696.
   
   Since we're running without a dag run, a new task instance is manufactured.  But since this task instance has never been run, it has no start_date.  So I added a check for presence of start_date.  
   
   Instead of adding this check, I could have modified the test so that we test within a dag_run.   In this case, the code would have tried to retrieve a previous dag run, and not finding one, would have returned None for `previous_ti`.  Which is how it would generally behave in real life.  
   
   I am not sure which approach would be more desirable, but in case there are reasons for supporting the no-dagrun case so I have left the test logic as is.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538844288



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       Shouldn't this be:
   ```suggestion
           return prev_ti and pendulum.instance(prev_ti.start_date)
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasining on this change.   the issue is in test cases there is no dag run, there for no _actual_ previous ti is retrieved;  instead a new one is created _which has no start date_
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests or something like that




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a _new_ TI with the execution date of the prev schedule.  
   
   importantly, a brand new TI such as this has no start_date.  so we must check not only for prev_ti but also that this ti has a start date
   
   to remove this check, we could ensure that the TI created in the test has a dag run.
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r539002022



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil i updated the python operator tests to create a dag run, so the operative code can assume presence of a `start_date`.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-740793192


   ok i will check the tests and add to updating.md


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538858979



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       yes this or
   ```
   if prev_ti and prev_ti.start_date:  # my preference, i guess
       return pendulum.instance(prev_ti.start_date)
   ```
   
   or 
   ```
   if prev_ti and prev_ti.start_date is not None:
       return pendulum.instance(prev_ti.start_date)
   ```
   All fine with me.  Will wait for guidance on changing test vs changing code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-741521411


   [The Workflow run](https://github.com/apache/airflow/actions/runs/409818707) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-740781663


   Tests are failing, yeah will need a note in Updating.md too


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasining on this change.   the issue is in test cases there is no dag run --> no _actual_ ti so a new one is created _which has no start date_
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests or something like that




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasining on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, there for no _actual_ previous ti is retrieved;  instead a new one is created _which has no start date_
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment (heading _On the failing test_)for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a _new_ TI with the execution date of the prev schedule.  
   
   importantly, a brand new TI such as this has no start_date.  so here we must check not only for prev_ti but also that this ti has a start date.
   
   to remove this check, we could ensure that the TI created in the test has a dag run.
   
   but perhaps there is reason why we are testing with TIs that have no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a _new_ TI with the execution date of the prev schedule.  
   
   importantly, a brand new TI such as this has no start_date.  so here we must check not only for prev_ti but also that this ti has a start date.
   
   to remove this check, we could ensure that the TI created in the test has a dag run.
   
   but perhaps there is reason why we are testing with TIs that have no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation, airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a new one -- importantly, without a start_date.
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-740792150


   since pendulum inherits from datetime, i think the only thing that really breaks is string representation, though i could be wrong.
   
   apologies for providing us with a conundrum here.  it was one of my first contributions if not the first.  
   
   i took a look and i remember what was going on.  at the time, prev_execution_date and next_execution_date returned native datetime despite docs saying pendulum and execution_date being pendulum.  So when I added prev_execution_date_success etc, I kept the behavior and documentation consistent with that of prev_execution_date.  Why I didn't just wrap it in pendulum.instance, no clue.  Maybe I thought that under the hood execution_date was supposed to be always pendulum and this was merely a bug that would be fixed.
   
   In PR https://github.com/apache/airflow/pull/5654, this was partially fixed.  But the _success attributes were not updated.  So while now prev_execution_date returns pendulum, the _success versions do not.
   
   If this can be fixed for prev_execution_date, as it was in https://github.com/apache/airflow/pull/5654, I suppose it can be fixed for prev_execution_date_success.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil merged pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #12910:
URL: https://github.com/apache/airflow/pull/12910


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasining on this change.   the issue is in test cases there is no dag run, there for no _actual_ previous ti is retrieved;  instead a new one is created _which has no start date_
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests or something like that




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, therefore no _actual_ previous ti is retrieved;  instead a new one is created _which has no start date_
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538858979



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       yes this or
   ```
   if prev_ti and prev_ti.start_date:  # option 2 -- my preference, i guess
       return pendulum.instance(prev_ti.start_date)
   ```
   
   or 
   ```
   if prev_ti and prev_ti.start_date is not None:  # option 3
       return pendulum.instance(prev_ti.start_date)
   ```
   All fine with me.  Will wait for guidance on changing test vs changing code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-740792150


   since pendulum inherits from datetime, i think the only thing that really breaks is string representation, though i could be wrong.
   
   apologies for providing us with a conundrum here.  it was one of my first contributions if not the first.  i took a look and i remember what was going on.  at the time, prev_execution_date and next_execution_date returned native datetime despite docs saying pendulum and execution_date being pendulum.  So when I added prev_execution_date_success etc, I kept the behavior and documentation consistent with that of prev_execution_date.  Why I didn't just wrap it in pendulum.instance, no clue.  Maybe I thought that under the hood execution_date was supposed to be always pendulum and this was merely a bug that would be fixed.
   
   In PR https://github.com/apache/airflow/pull/5654, this was partially fixed.  But the _success attributes were not updated.  So while now prev_execution_date returns pendulum, the _success versions do not.
   
   If this can be fixed for prev_execution_date, as it was in https://github.com/apache/airflow/pull/5654, I suppose it can be fixed for prev_execution_date_success.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasining on this change.   the issue is in test cases there is no dag run, there for no _actual_ previous ti is retrieved;  instead a new one is created _which has no start date_
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] RosterIn commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
RosterIn commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-740497837


   I think that it's good to fix to be compatible with the docs (return it as pendulum object) but this is now also a breaking change as it's very likely that people are using it as datetime (even though the indention was never to exposed it as datetime)
   
   Since 2.0 is soon maybe all needed is just note in updating.md and this can get in? @kaxil


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538851306



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       I would rather have:
   
   ```
   return prev_ti and prev_ti.start_date is not None and pendulum.instance(prev_ti.start_date)
   ```
   Although my brain is not working right now. Looping in @ashb @turbaszek @XD-DENG for reviews




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#issuecomment-741076236


   update... 
   
   **On impact to users**
   
   so the only time that `str(dt)` differs from `str(pendulum.instance(dt))` is when dt has not tz info.
   
   in our case, these attributes already always had tzinfo.
   
   so there is no difference in string representation as a result of this change.
   
   so i don't think there is any significant change in behavior that merits calling out in updating, apart from the mere fact that it's now pendulum instead of datetime
   
   
   **On the failing test**
   
   The cause of this failure is the way `get_previous_ti` works on `TaskInstance`.
   See here https://github.com/apache/airflow/blob/ff25bd6ffed53de651254f7c3e6d254e4191bfc7/airflow/models/taskinstance.py#L691-L696.
   
   Since we're running without a dag run, a new task instance is manufactured.  But since this task instance has never been run, it has no start_date.  So I added a check for presence of start_date.  
   
   Instead of adding this check, I could have modified the test so that we test within a dag_run.   In this case, the code would have tried to retrieve a previous dag run, and not finding one, would have returned None for `previous_ti`.  Which is how it would generally behave in real life.  But maybe there are reasons for supporting the no-dagrun case so I leave as is.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a new one -- importantly, without a start_date.
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, there for no _actual_ previous ti is retrieved;  instead a new one is created _which has no start date_
   
   so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a _new_ TI with the execution date of the prev schedule.  
   
   importantly, a brand new TI such as this has no start_date.  so we must check not only for prev_ti but also that this ti has a start date
   
   to remove this check, we could ensure that the TI created in the test has a dag run.
   
   but perhaps there is reason why we are testing with TIs that have no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538858979



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       yes this or...
   
   option 2:  (my preference, i guess)
   ```
   if prev_ti and prev_ti.start_date:
       return pendulum.instance(prev_ti.start_date)
   ```
   
   or option 3: 
   ```
   if prev_ti and prev_ti.start_date is not None:
       return pendulum.instance(prev_ti.start_date)
   ```
   
   Any way is fine with me.  Will wait for guidance on changing test vs changing code.

##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       yes this or...
   
   option 2:  (my preference, i guess)
   ```
   if prev_ti and prev_ti.start_date:
       return pendulum.instance(prev_ti.start_date)
   ```
   
   or option 3: 
   ```
   if prev_ti and prev_ti.start_date is not None:
       return pendulum.instance(prev_ti.start_date)
   ```
   
   Any way is fine with me.  Will wait for guidance on changing test vs changing code bbefore updating




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #12910: Fix return type in prev-date context variables

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #12910:
URL: https://github.com/apache/airflow/pull/12910#discussion_r538846319



##########
File path: airflow/models/taskinstance.py
##########
@@ -773,7 +773,7 @@ def get_previous_start_date(
         """
         self.log.debug("previous_start_date was called")
         prev_ti = self.get_previous_ti(state=state, session=session)
-        return prev_ti and prev_ti.start_date
+        return prev_ti and prev_ti.start_date and pendulum.instance(prev_ti.start_date)

Review comment:
       @kaxil please see previous comment for reasoning on this change.   the issue is in test cases (for python virtualenv operator) there is no dag run, due to test design.  in this situation (when previous_ti is called on a TI with no dag run), airflow's behavior is not to attempt to retrieve a "previous" task instance but instead to create a _new_ TI with the execution date of the prev schedule.  
   
   importantly, a brand new TI such as this has no start_date.  so we must check not only for prev_ti but also that this ti has a start date
   
   to fix this, we'd have to add a dag run in these two virtualenv operator tests  -- or something like that
   
   but perhaps there is reason why we are testing with no dag run -- i don't know




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org