Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/04 13:35:58 UTC

[GitHub] [airflow] Wilderone opened a new issue #21994: scheduler: livenessProbe redundant exec leads to false-positive result

Wilderone opened a new issue #21994:
URL: https://github.com/apache/airflow/issues/21994


   ### Official Helm Chart version
   
   1.4.0 (latest released)
   
   ### Apache Airflow version
   
   2.2.4 (latest released)
   
   ### Kubernetes Version
   
   v1.21.5-eks-bc4871b
   
   ### Helm Chart configuration
   
   _No response_
   
   ### Docker Image customisations
   
   _No response_
   
   ### What happened
   
   The scheduler's current livenessProbe returns exit code 0 even when the check fails, so the container is not restarted when it needs to be.
   I see two ways to fix this:
   1. Simply remove ```- exec``` from the livenessProbe
   2. Move the livenessProbe to values so that users can customise it
   
   ### What you expected to happen
   
   Restart the container if the livenessProbe fails
   
   ### How to reproduce
   
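   The fastest way is to change the livenessProbe command to:
   ```
   - sh
   - -c
   - exec
   - cat /123
   ```
   The probe should fail because `/123` does not exist, yet you will see no probe errors in the output of `kubectl describe pod`.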
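   
   A minimal sketch of why this happens, assuming a POSIX `sh` and that `/123` does not exist on the machine where it runs. With `sh -c`, only the first argument after `-c` is the command string; the next argument becomes `$0`. So with the extra `- exec` entry, the shell runs a bare `exec` (which does nothing and returns 0) and the real check is never executed. The variable names below are only for illustration:
   
   ```shell
   # With the stray "exec": the command string is just `exec`,
   # and 'cat /123' is assigned to $0 instead of being run.
   sh -c exec 'cat /123'
   broken_status=$?
   
   # Without it: 'cat /123' is the command string, so its failure
   # becomes the exit status of the shell.
   sh -c 'cat /123' 2>/dev/null
   fixed_status=$?
   
   echo "with exec: $broken_status, without exec: $fixed_status"
   ```
   
   The first status is 0 and the second is nonzero, which matches the false-positive probe result described above.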
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jedcunningham closed issue #21994: scheduler: livenessProbe redundant exec leads to false-positive result

Posted by GitBox <gi...@apache.org>.
jedcunningham closed issue #21994:
URL: https://github.com/apache/airflow/issues/21994


   





[GitHub] [airflow] jedcunningham commented on issue #21994: scheduler: livenessProbe redundant exec leads to false-positive result

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on issue #21994:
URL: https://github.com/apache/airflow/issues/21994#issuecomment-1062154933


   This is fixed by #22041.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #21994: scheduler: livenessProbe redundant exec leads to false-positive result

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #21994:
URL: https://github.com/apache/airflow/issues/21994#issuecomment-1059167977


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] potiuk commented on issue #21994: scheduler: livenessProbe redundant exec leads to false-positive result

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #21994:
URL: https://github.com/apache/airflow/issues/21994#issuecomment-1060100995


   Hey @ephraimbuddy @jedcunningham - I think this one might be important enough to cancel 1.5.0 of the chart and make a release candidate 2. I just checked it, and I think what we have is indeed wrong.
   
   The current liveness probe is:
   
   ```
               exec:
                 command:
                   - sh
                   - -c
                   - exec
                   - |
                     CONNECTION_CHECK_MAX_COUNT=0 /entrypoint python -Wignore -c "
                     import os
                     os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
                     os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
   
                     from airflow.jobs.scheduler_job import SchedulerJob
                     from airflow.utils.db import create_session
                     from airflow.utils.net import get_hostname
                     import sys
   
                     with create_session() as session:
                         job = session.query(SchedulerJob).filter_by(hostname=get_hostname()).order_by(
                             SchedulerJob.latest_heartbeat.desc()).limit(1).first()
   
                     sys.exit(0 if job.is_alive() else 1)
                     "
   ```
   
   But IMHO it should be:
   
   ```
               exec:
                 command:
                   - sh
                   - -c
                   - | 
                     CONNECTION_CHECK_MAX_COUNT=0 /entrypoint python -Wignore -c "
                     import os
                     os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
                     os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
   
                     from airflow.jobs.scheduler_job import SchedulerJob
                     from airflow.utils.db import create_session
                     from airflow.utils.net import get_hostname
                     import sys
   
                     with create_session() as session:
                         job = session.query(SchedulerJob).filter_by(hostname=get_hostname()).order_by(
                             SchedulerJob.latest_heartbeat.desc()).limit(1).first()
   
                     sys.exit(0 if job.is_alive() else 1)
                     "
   ```
   





[GitHub] [airflow] ephraimbuddy commented on issue #21994: scheduler: livenessProbe redundant exec leads to false-positive result

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #21994:
URL: https://github.com/apache/airflow/issues/21994#issuecomment-1060622975


   > Hey @ephraimbuddy @jedcunningham - I think that one might be important enough to cancel 1.5.0 of chart and make a release candidate 2. I just checked it and I think this is is indeed wrong what we have.
   
   Yeah. I support having an RC 2. 

