Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/09/26 15:10:00 UTC
[GitHub] [airflow] potiuk opened a new pull request #18531: Workaround intermittently failing scheduler test
potiuk opened a new pull request #18531:
URL: https://github.com/apache/airflow/pull/18531
Some executions of this test return the dagrun in the Queued
rather than the Running state. This PR attempts to work around it
by re-running scheduling in that case (up to several times).
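A minimal, self-contained sketch of the bounded-retry idea described above. The names `retry_until` and `fake_scheduling_pass` are hypothetical and are not Airflow APIs; the fake pass stands in for the `self.scheduler_job._do_scheduling(session)` call plus the state refresh in the real test.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_until(
    step: Callable[[], T],
    done: Callable[[T], bool],
    attempts: int = 9,
    delay: float = 0.1,
) -> T:
    """Call step() at most `attempts` times, pausing `delay` seconds between
    calls, and return the first result for which done(result) is true.
    If no result satisfies done(), the last result is returned."""
    result = step()
    for _ in range(attempts - 1):
        if done(result):
            break
        time.sleep(delay)
        result = step()
    return result


# Hypothetical stand-in for one scheduling pass followed by a state refresh:
# the dagrun stays "queued" for two passes, then becomes "running".
states = iter(["queued", "queued", "running"])
calls = []


def fake_scheduling_pass() -> str:
    state = next(states)
    calls.append(state)
    return state


final = retry_until(fake_scheduling_pass, lambda s: s != "queued", delay=0)
```

The caller still has to assert on the returned state: if the dagrun never leaves Queued, the helper simply returns the last observed value, so the retry masks only a short delay and not a genuine failure.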
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #18531: Workaround intermittently failing scheduler test
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18531:
URL: https://github.com/apache/airflow/pull/18531#issuecomment-937604663
Take a look @ephraimbuddy please - I think I got it, but would love confirmation :)
[GitHub] [airflow] ephraimbuddy commented on a change in pull request #18531: Workaround intermittently failing scheduler test
Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #18531:
URL: https://github.com/apache/airflow/pull/18531#discussion_r716273133
##########
File path: tests/jobs/test_scheduler_job.py
##########
@@ -2658,9 +2659,14 @@ def test_do_schedule_max_active_runs_dag_timed_out(self, dag_maker):
         assert run1_ti.state == State.SKIPPED
         # Run scheduling again to assert run2 has started
-        self.scheduler_job._do_scheduling(session)
-        run2 = session.merge(run2)
-        session.refresh(run2)
+        for i in range(1, 10):
+            self.scheduler_job._do_scheduling(session)
+            run2 = session.merge(run2)
+            session.refresh(run2)
+            if run2.state == State.QUEUED:
+                sleep(0.1)
+                continue
+            break
Review comment:
For the code starting from 2661, I would suggest this:
```python
# Run scheduling again to assert run2 has started
self.scheduler_job._start_queued_dagruns(session)
session.flush()
run2 = session.merge(run2)
session.refresh(run2)
assert run2.state == State.RUNNING
```
Since this is testing max_active_runs and dag_timeout, I think we don't need to schedule the task instances.
We can also run `_schedule_dag_run` so that it puts the task instance into the scheduled state:
```python
# Run scheduling again to assert run2 has started
self.scheduler_job._start_queued_dagruns(session)
session.flush()
self.scheduler_job._schedule_dag_run(run2, session)
run2 = session.merge(run2)
session.refresh(run2)
assert run2.state == State.RUNNING
run2_ti = run2.get_task_instance(task1.task_id, session)
assert run2_ti.state == State.SCHEDULED
```
What I have observed is that using `_do_scheduling` in tests usually doesn't do what we want. I prefer using `_start_queued_dagruns` to start dagruns instead of `_do_scheduling`. Maybe we should use it here too.
https://github.com/apache/airflow/blob/2643345e4b72064c605e42901a3dc531e6aa2f4e/tests/jobs/test_scheduler_job.py#L2755
[GitHub] [airflow] potiuk merged pull request #18531: Stabilize flaky test_do_schedule_max_active_runs_dag_timed_out
Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #18531:
URL: https://github.com/apache/airflow/pull/18531
[GitHub] [airflow] potiuk commented on a change in pull request #18531: Workaround intermittently failing scheduler test
Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #18531:
URL: https://github.com/apache/airflow/pull/18531#discussion_r716678182
##########
File path: tests/jobs/test_scheduler_job.py
##########
@@ -2658,9 +2659,14 @@ def test_do_schedule_max_active_runs_dag_timed_out(self, dag_maker):
         assert run1_ti.state == State.SKIPPED
         # Run scheduling again to assert run2 has started
-        self.scheduler_job._do_scheduling(session)
-        run2 = session.merge(run2)
-        session.refresh(run2)
+        for i in range(1, 10):
+            self.scheduler_job._do_scheduling(session)
+            run2 = session.merge(run2)
+            session.refresh(run2)
+            if run2.state == State.QUEUED:
+                sleep(0.1)
+                continue
+            break
Review comment:
Ah cool. Good points. I will take a closer look soon. I would love to learn a bit more about how those tests work :D
[GitHub] [airflow] potiuk commented on pull request #18531: Workaround intermittently failing scheduler test
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18531:
URL: https://github.com/apache/airflow/pull/18531#issuecomment-927289950
Not sure if this is a good solution. Maybe the state SHOULD be RUNNING immediately and we have an actual problem? But it is worth trying. @ashb @ephraimbuddy, I would love it if you took a look to see whether it is legitimate for the dagrun to be in the QUEUED state there for a short while (and whether my workaround approach is correct).
Example failure that made me create this PR: https://github.com/apache/airflow/runs/3712148691?check_suite_focus=true#step:6:9688
[GitHub] [airflow] potiuk commented on pull request #18531: Workaround intermittently failing scheduler test
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18531:
URL: https://github.com/apache/airflow/pull/18531#issuecomment-927321490
Trying out on full tests
[GitHub] [airflow] potiuk closed pull request #18531: Workaround intermittently failing scheduler test
Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #18531:
URL: https://github.com/apache/airflow/pull/18531