Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/07 20:25:30 UTC
[GitHub] [airflow] jedcunningham opened a new pull request, #24906: Fix zombie task handling with multiple schedulers
jedcunningham opened a new pull request, #24906:
URL: https://github.com/apache/airflow/pull/24906
Each scheduler was looking at all running tasks for zombies, so multiple
schedulers could end up handling the same zombie. This causes problems with
retries (e.g. the task being marked as FAILED instead of UP_FOR_RETRY) and
callbacks (e.g. `on_failure_callback` being called multiple times).
When the second scheduler tries to determine whether the task can be retried,
and the task is already in UP_FOR_RETRY (because the first scheduler has already finished),
it sees the "next" try_number (as the task is no longer running),
which then leads to the task being marked FAILED instead.
The easy fix is to simply restrict each scheduler to its own TIs, as
orphaned running TIs will be adopted anyway.
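
To illustrate the race described above, here is a minimal toy sketch in plain Python. It is not Airflow code: `ToyTI`, `zombies_seen_by`, `simulate`, and the `own_only` flag are hypothetical names, the `queued_by_job_id` field is assumed here to be what ties a TI to its scheduler, and the retry check is a simplification of the real logic.

```python
from dataclasses import dataclass

RUNNING, UP_FOR_RETRY, FAILED = "running", "up_for_retry", "failed"


@dataclass
class ToyTI:
    """Hypothetical stand-in for a task instance; not the Airflow model."""
    queued_by_job_id: int      # assumed: id of the scheduler that queued this TI
    max_tries: int
    raw_try_number: int = 1
    state: str = RUNNING
    times_handled: int = 0     # how many schedulers processed this zombie

    @property
    def try_number(self) -> int:
        # While running, report the current try; otherwise report the "next" try.
        if self.state == RUNNING:
            return self.raw_try_number
        return self.raw_try_number + 1

    def handle_failure(self) -> None:
        # Simplified retry check: retry only while try_number is within max_tries.
        eligible = self.try_number <= self.max_tries
        self.state = UP_FOR_RETRY if eligible else FAILED
        self.times_handled += 1


def zombies_seen_by(scheduler_id: int, tis: list, own_only: bool) -> list:
    """Return the running TIs this scheduler would treat as zombies."""
    return [
        ti for ti in tis
        if ti.state == RUNNING
        and (not own_only or ti.queued_by_job_id == scheduler_id)
    ]


def simulate(own_only: bool) -> ToyTI:
    ti = ToyTI(queued_by_job_id=1, max_tries=1)
    # Both schedulers take their snapshot before either one handles the zombie.
    snapshots = {sid: zombies_seen_by(sid, [ti], own_only) for sid in (1, 2)}
    for sid in (1, 2):
        for zombie in snapshots[sid]:
            zombie.handle_failure()
    return ti


if __name__ == "__main__":
    for own_only in (False, True):
        ti = simulate(own_only)
        print(f"own_only={own_only}: state={ti.state}, handled={ti.times_handled}")
```

In this toy model, the unrestricted run has both schedulers handle the same TI, and the second one sees the "next" try number and marks it failed. With `own_only=True`, only the scheduler that queued the TI handles it, and it ends up in the retry state after being handled once.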
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1179214836
Haha no worries, it happens. Were there changes you wanted?
[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178255693
I don't think we need a comment in that section; frankly, I'm not sure it would have helped me. I was just moving too quickly and didn't look closely enough.
[GitHub] [airflow] jedcunningham merged pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
jedcunningham merged PR #24906:
URL: https://github.com/apache/airflow/pull/24906
[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178220111
@collinmcnulty, how's that?
[GitHub] [airflow] collinmcnulty commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178188908
I know we want atomic changes but do you also want to get rid of line 1373 which does nothing?
[GitHub] [airflow] collinmcnulty commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178199831
Maybe that deserves a comment then?
[GitHub] [airflow] collinmcnulty commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178233265
I meant that it might be good to comment that the line that seemingly does nothing actually helps distinguish between localtaskjob and taskinstance. I know that's a bit off topic from the thrust of this PR so feel free to ignore, but it's something we discussed in troubleshooting so maybe we can save the next person from re-discovering that the line does have a purpose after all.
[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178197604
Turns out it _does_ do something, `TaskInstance` vs `LocalTaskJob`. I just completely overlooked it yesterday, and there is test coverage 😉.
[GitHub] [airflow] dstandish commented on pull request #24906: Fix zombie task handling with multiple schedulers
Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1179212848
sorry @jedcunningham had a pending review that i forgot to finish