Posted to commits@airflow.apache.org by "chuxiangfeng (via GitHub)" <gi...@apache.org> on 2023/02/02 08:18:49 UTC

[GitHub] [airflow] chuxiangfeng opened a new issue, #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

chuxiangfeng opened a new issue, #29303:
URL: https://github.com/apache/airflow/issues/29303

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   Airflow version: 2.4.1
   Python: 3.7.13
   After the KubernetesPodOperator task is created, the `await_pod_start` method polls once per second to check whether the Pod has started:
   ```python
   def await_pod_start(self, pod: V1Pod, startup_timeout: int = 120) -> None:
       curr_time = datetime.now()
       while True:
           remote_pod = self.read_pod(pod)
           if remote_pod.status.phase != PodPhase.PENDING:
               break
           self.log.warning("Pod not yet started: %s", pod.metadata.name)
           delta = datetime.now() - curr_time
           if delta.total_seconds() >= startup_timeout:
               msg = (
                   f"Pod took longer than {startup_timeout} seconds to start. "
                   "Check the pod events in kubernetes to determine why."
               )
               raise PodLaunchFailedException(msg)
           time.sleep(1)
   ```
   However, under some conditions my Pod finishes almost immediately. When that happens, `await_pod_start` does not seem to detect that the Pod status is Completed: it keeps logging "Pod not yet started", then blocks indefinitely, and the task status stays Running, even after many days.
   ```
   [2023-02-02, 08:00:02 CST] {kubernetes_pod.py:381} INFO - `try_number` of task_instance: 1
   [2023-02-02, 08:00:02 CST] {kubernetes_pod.py:382} INFO - `try_number` of pod: 1
   [2023-02-02, 08:00:02 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:03 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:04 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:06 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:07 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:08 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:09 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:10 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:11 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:12 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:13 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   [2023-02-02, 08:00:14 CST] {pod_manager.py:180} WARNING - Pod not yet started: t-crawler-gem-ipo-request-b2eb9a08d4474705afd43bff54323785
   ```
   After the logs above, nothing more is printed and the task stays in the Running state.
   
   ### What you think should happen instead
   
   After the Pod completes, the DAG task status should be set to success.
   
   ### How to reproduce
   
   Run a Pod that does nothing and exits immediately.
   
   ### Operating System
   
   PRETTY_NAME="Debian GNU/Linux 11 (bullseye)" NAME="Debian GNU/Linux" VERSION_ID="11" VERSION="11 (bullseye)" VERSION_CODENAME=bullseye ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/"
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow==2.4.2
   apache-airflow-providers-cncf-kubernetes==4.4.0
   apache-airflow-providers-common-sql==1.2.0
   apache-airflow-providers-dingding==3.1.0
   apache-airflow-providers-elasticsearch==3.0.3
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-mongo==2.3.3
   apache-airflow-providers-mysql==2.2.3
   apache-airflow-providers-postgres==4.1.0
   apache-airflow-providers-redis==2.0.4
   apache-airflow-providers-sqlite==3.2.1
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] chuxiangfeng commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "chuxiangfeng (via GitHub)" <gi...@apache.org>.
chuxiangfeng commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1457589040

   > > kubectl get pod t-crawler-gem-ipo-request-39a8a54e91394e8198711966fb7ec66d -o jsonpath='{.status.phase}' -n crawler
   > > Succeeded
   > 
   > @chuxiangfeng Since kubectl returns Succeeded as the status, this may be a bug in the Kubernetes client. Can you check which version of the kubernetes library is installed on the Airflow server?
   > 
   > ```shell
   > pip freeze | grep kubernetes
   > ```
   apache-airflow-providers-cncf-kubernetes==4.4.0
   kubernetes==23.6.0




[GitHub] [airflow] github-actions[bot] commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1696588758

   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.




[GitHub] [airflow] hussein-awala commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1445116060

   > kubectl get pod t-crawler-gem-ipo-request-39a8a54e91394e8198711966fb7ec66d -o jsonpath='{.status.phase}' -n crawler
   Succeeded
   
   @chuxiangfeng Since kubectl returns Succeeded as the status, this may be a bug in the Kubernetes client. Can you check which version of the kubernetes library is installed on the Airflow server?
   ```bash
   pip freeze | grep kubernetes
   ```
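
To compare what `kubectl` reports with what the Python client reports, the phase can also be read through the `kubernetes` library directly. The following is a minimal sketch, not part of the original thread: the helper names `get_pod_phase` and `main` are hypothetical, and it assumes a kubeconfig is available on the machine where it runs (inside a pod, `config.load_incluster_config()` would be the usual alternative).

```python
def get_pod_phase(api, name: str, namespace: str) -> str:
    """Return status.phase of a pod as seen through the Python client.

    `api` is expected to be a kubernetes.client.CoreV1Api instance.
    """
    pod = api.read_namespaced_pod(name=name, namespace=namespace)
    return pod.status.phase


def main(pod_name: str, namespace: str) -> None:
    # Imported here so the helper above can be used without the
    # kubernetes package installed (for example, in a unit test).
    from kubernetes import client, config

    config.load_kube_config()  # assumption: a local kubeconfig exists
    print(get_pod_phase(client.CoreV1Api(), pod_name, namespace))
```

Calling `main("<POD_NAME>", "crawler")` from a Python shell on the Airflow worker should print the same phase that `kubectl` shows. If the two disagree, that would point toward the client library rather than the operator.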




[GitHub] [airflow] chuxiangfeng commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "chuxiangfeng (via GitHub)" <gi...@apache.org>.
chuxiangfeng commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1418389427

   Can anyone help? This is urgent for us.




[GitHub] [airflow] github-actions[bot] closed issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short
URL: https://github.com/apache/airflow/issues/29303




[GitHub] [airflow] chuxiangfeng commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "chuxiangfeng (via GitHub)" <gi...@apache.org>.
chuxiangfeng commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1420469000

   I mean that it runs and terminates in less than a second. For example, here the whole Pod life cycle is under 1 second:
   ```
   Initialized | True | 2023-02-02 16:23:43 | PodCompleted
   Ready | False | 2023-02-02 16:23:43 | PodCompleted
   ContainersReady | False | 2023-02-02 16:23:43 | PodCompleted
   PodScheduled | True | 2023-02-02 16:23:43
   ```
   The DAG task remains in the Running state. After a few days, many tasks are stuck in the Running state without finishing or failing, even though the underlying Pods ended long ago.




[GitHub] [airflow] nathadfield commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1621329032

   @chuxiangfeng Picking up this thread.  It would be good to see if this is still a problem on more recent Kubernetes client versions.




[GitHub] [airflow] hussein-awala commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1420638969

   What do you get when you run this command:
   ```bash
   kubectl get pod <POD_NAME> -o jsonpath='{.status.phase}'
   ```




[GitHub] [airflow] github-actions[bot] commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1712667784

   This issue has been closed because it has not received response from the issue author.




[GitHub] [airflow] hussein-awala commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1419769934

   This method breaks out of the loop when the pod status phase is not `PENDING`. How can a short execution time block this method, since the phase will change to `SUCCEEDED` or `FAILED`, both of which are different from `PENDING`?
   >  my Pod will end immediately without running under some conditions.

   What do you mean by "without running"? Do you mean that it runs and terminates in less than a second, or that it doesn't run at all?
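
The point about the loop condition can be checked in isolation. The sketch below is a standalone re-creation of the polling logic quoted in the issue, not Airflow's actual code: the `read_phase` callable and the `poll_interval` parameter are hypothetical stand-ins for `self.read_pod(pod)` and the fixed one-second sleep, and the exception class is a local placeholder.

```python
import time
from datetime import datetime
from typing import Callable


class PodLaunchFailedException(Exception):
    """Local placeholder for the exception raised in the quoted method."""


def await_pod_start(
    read_phase: Callable[[], str],
    startup_timeout: float = 120,
    poll_interval: float = 1.0,
) -> str:
    """Poll until the phase is no longer "Pending"; return the phase seen."""
    start = datetime.now()
    while True:
        phase = read_phase()
        if phase != "Pending":
            # Any other phase ends the wait, including "Succeeded".
            return phase
        if (datetime.now() - start).total_seconds() >= startup_timeout:
            raise PodLaunchFailedException(
                f"Pod took longer than {startup_timeout} seconds to start."
            )
        time.sleep(poll_interval)


# Simulate a pod that is observed as Pending twice and then jumps straight
# to Succeeded, without ever being observed in the Running phase.
phases = iter(["Pending", "Pending", "Succeeded"])
final_phase = await_pod_start(
    lambda: next(phases), startup_timeout=5, poll_interval=0.01
)
print(final_phase)  # prints: Succeeded
```

Under these assumptions the loop exits as soon as any non-Pending phase is observed, and a pod that never leaves Pending raises after `startup_timeout` seconds. As long as each read returns, the wait cannot last indefinitely, which suggests (though nothing in this thread confirms it) that the hang occurs either inside the read call itself or in a later step.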




[GitHub] [airflow] chuxiangfeng commented on issue #29303: KubernetesPodOperator not make task status to success due the task execution time is very short

Posted by "chuxiangfeng (via GitHub)" <gi...@apache.org>.
chuxiangfeng commented on issue #29303:
URL: https://github.com/apache/airflow/issues/29303#issuecomment-1421882060

   > What do you get when you run this command:
   > 
   > ```shell
   > kubectl get pod <POD_NAME> -o jsonpath='{.status.phase}'
   > ```
   
   kubectl get pod t-crawler-gem-ipo-request-39a8a54e91394e8198711966fb7ec66d -o jsonpath='{.status.phase}' -n crawler
   Succeeded%

