Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/30 12:45:35 UTC

[GitHub] [airflow] collinmcnulty opened a new pull request #17911: Warn that job_heartbeat_sec must be less than scheduler_health_check_threshold

collinmcnulty opened a new pull request #17911:
URL: https://github.com/apache/airflow/pull/17911


   Adds a note to the configuration reference warning that setting `job_heartbeat_sec` to a value greater than `scheduler_health_check_threshold` will lead to tasks being marked as zombies.
   
   related: #16573


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] collinmcnulty commented on pull request #17911: Warn that job_heartbeat_sec must be less than scheduler_health_check_threshold

Posted by GitBox <gi...@apache.org>.
collinmcnulty commented on pull request #17911:
URL: https://github.com/apache/airflow/pull/17911#issuecomment-908334532


   I've rebased on main, and I did not edit the file that's causing the static tests to fail. The failure might be related to [this recent merge](https://github.com/apache/airflow/pull/17787).





[GitHub] [airflow] github-actions[bot] closed pull request #17911: Warn that job_heartbeat_sec must be less than scheduler_health_check_threshold

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #17911:
URL: https://github.com/apache/airflow/pull/17911


   





[GitHub] [airflow] potiuk commented on pull request #17911: Warn that job_heartbeat_sec must be less than scheduler_health_check_threshold

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17911:
URL: https://github.com/apache/airflow/pull/17911#issuecomment-909741345


   I believe that if those observations (about the relation between different timeouts/thresholds) are correct, then we should check those values when starting Airflow components and raise warnings instead of (or on top of) the documentation update.
   
   People usually will not look in the documentation, even when they have an issue, but they **might** monitor logs for warnings, and they **will** find such warnings if we ask them to provide "suspicious logs" when they see a problem.
   
   Also - we have PLENTY of parameters. Maybe there are other relations between different values that we should also check and warn users about, @ashb @kaxil @ephraimbuddy @jedcunningham?
   
   I think you've recently been chasing quite a number of similar reports/jobs, and I believe there are some inter-parameter relations that immediately come to your mind as invalid and as ones that we should flag? I saw at least a few pieces of advice in the issues of the form "Hey, your configuration is really wrong - you should not use X parallelism when you have just Y cores". Maybe we should figure out and codify some of those in the form of warning messages? This eventually means less work for those who answer the common issues of various people and far less frustration on the users' side.
   
   I think that might significantly bring down the number of issues in the future if we have such checks in place and raise them as warnings with an explanation of why the warnings are raised.
   
   WDYT? Any other candidates for such "sanity checks on the parameter configuration"?
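
A minimal sketch of the kind of startup check proposed above, assuming the two settings have already been read into a plain mapping of integers. The function names `check_heartbeat_settings` and `warn_on_bad_settings` are hypothetical and not part of Airflow, and the `>=` comparison follows the PR title ("must be less than"); how Airflow would actually load and validate these values is not shown in this thread.

```python
import warnings
from typing import List, Mapping


def check_heartbeat_settings(settings: Mapping[str, int]) -> List[str]:
    """Return a list of human-readable problems found in the settings.

    Hypothetical helper: `settings` is assumed to hold the two option
    values as integers (seconds), keyed by their option names.
    """
    problems: List[str] = []
    heartbeat = settings["job_heartbeat_sec"]
    threshold = settings["scheduler_health_check_threshold"]

    # Per the PR title, the heartbeat interval must be strictly less
    # than the health-check threshold.
    if heartbeat >= threshold:
        problems.append(
            f"job_heartbeat_sec ({heartbeat}) must be less than "
            f"scheduler_health_check_threshold ({threshold}); "
            "otherwise tasks may be marked as zombies."
        )
    return problems


def warn_on_bad_settings(settings: Mapping[str, int]) -> None:
    """Emit one warning per problem so it shows up in component logs."""
    for problem in check_heartbeat_settings(settings):
        warnings.warn(problem, UserWarning)
```

Returning the messages as a list, rather than warning directly, keeps the check easy to test and leaves room for adding other inter-parameter checks alongside it.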





[GitHub] [airflow] github-actions[bot] commented on pull request #17911: Warn that job_heartbeat_sec must be less than scheduler_health_check_threshold

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17911:
URL: https://github.com/apache/airflow/pull/17911#issuecomment-944818076


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.




