Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/07 20:25:30 UTC

[GitHub] [airflow] jedcunningham opened a new pull request, #24906: Fix zombie task handling with multiple schedulers

jedcunningham opened a new pull request, #24906:
URL: https://github.com/apache/airflow/pull/24906

   Each scheduler was looking at all running tasks for zombies, leading to
   multiple schedulers handling the zombies. This causes problems with
   retries (e.g. being marked as FAILED instead of UP_FOR_RETRY) and
   callbacks (e.g. `on_failure_callback` being called multiple times).
   
   When the second scheduler tries to determine whether the task can be retried,
   and the task is already in UP_FOR_RETRY (the first scheduler already finished),
   it sees the "next" try_number (as the task is no longer running),
   which then leads to the task being marked FAILED instead.
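
The double-handling described above can be illustrated with a toy model. This is a simplified sketch, not Airflow's actual code: `ToyTaskInstance`, `raw_try_number`, and `handle_failure` are hypothetical names, and the only behavior assumed is the one stated above, namely that a task that is no longer running reports its "next" try number.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskInstance:
    """Hypothetical stand-in for a task instance; not the Airflow class."""

    raw_try_number: int
    max_tries: int
    state: str = "running"

    @property
    def try_number(self) -> int:
        # While running, report the current try; otherwise report the next one.
        if self.state == "running":
            return self.raw_try_number
        return self.raw_try_number + 1

    def handle_failure(self) -> str:
        # Retry only if the reported try number is still within the limit.
        if self.try_number <= self.max_tries:
            self.state = "up_for_retry"
        else:
            self.state = "failed"
        return self.state


ti = ToyTaskInstance(raw_try_number=1, max_tries=1)
first = ti.handle_failure()   # first scheduler handles the zombie
second = ti.handle_failure()  # second scheduler handles the same zombie again
print(first, second)
```

In this model the first call leaves the task in `up_for_retry`, and the second call, seeing the "next" try number, marks it `failed`.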
   
   The easy fix is to restrict each scheduler to its own task instances (TIs), as
   orphaned running TIs will be adopted anyway.
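
A minimal sketch of what "restrict each scheduler to its own TIs" could look like, using plain Python objects rather than the real database query. `ToyTI`, `heartbeat_ok`, and `find_zombies` are hypothetical names for illustration, and the field name `queued_by_job_id` is an assumption about how ownership is recorded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyTI:
    """Hypothetical stand-in for a task instance row."""

    task_id: str
    state: str
    queued_by_job_id: int  # assumed: id of the scheduler job that queued this TI
    heartbeat_ok: bool     # False means the TI looks like a zombie


def find_zombies(tis: List[ToyTI], scheduler_job_id: int) -> List[ToyTI]:
    """Return zombie TIs that belong to the given scheduler only."""
    return [
        ti
        for ti in tis
        if ti.state == "running"
        and not ti.heartbeat_ok
        and ti.queued_by_job_id == scheduler_job_id
    ]


tis = [
    ToyTI("a", "running", queued_by_job_id=1, heartbeat_ok=False),
    ToyTI("b", "running", queued_by_job_id=2, heartbeat_ok=False),
    ToyTI("c", "running", queued_by_job_id=1, heartbeat_ok=True),
]
zombies_1 = [ti.task_id for ti in find_zombies(tis, scheduler_job_id=1)]
zombies_2 = [ti.task_id for ti in find_zombies(tis, scheduler_job_id=2)]
print(zombies_1, zombies_2)
```

With the ownership filter in place, no zombie is returned to more than one scheduler, so failure handling and callbacks run once per zombie.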
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1179214836

   Haha no worries, it happens. Were there changes you wanted?


[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178255693

   I don't think we need a comment in that section; frankly, I'm not sure it would have helped me. I was just moving too quickly and didn't look closely enough.


[GitHub] [airflow] jedcunningham merged pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
jedcunningham merged PR #24906:
URL: https://github.com/apache/airflow/pull/24906


[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178220111

   @collinmcnulty, how's that?


[GitHub] [airflow] collinmcnulty commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178188908

   I know we want atomic changes, but do you also want to get rid of line 1373, which does nothing?


[GitHub] [airflow] collinmcnulty commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178199831

   Maybe that deserves a comment then?


[GitHub] [airflow] collinmcnulty commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178233265

   I meant that it might be good to add a comment noting that the line that seemingly does nothing actually helps distinguish between `LocalTaskJob` and `TaskInstance`. I know that's a bit off topic from the thrust of this PR, so feel free to ignore it, but it's something we discussed while troubleshooting, so maybe we can save the next person from rediscovering that the line does have a purpose after all.


[GitHub] [airflow] jedcunningham commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1178197604

   Turns out it _does_ do something, `TaskInstance` vs `LocalTaskJob`. I just completely overlooked it yesterday, and there is test coverage 😉.


[GitHub] [airflow] dstandish commented on pull request #24906: Fix zombie task handling with multiple schedulers

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #24906:
URL: https://github.com/apache/airflow/pull/24906#issuecomment-1179212848

   Sorry @jedcunningham, I had a pending review that I forgot to finish.

