Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/29 02:54:41 UTC

[GitHub] [airflow] luozhaoyu edited a comment on issue #10541: KubernetesPodOperator stuck in `up_for_retry` state after scheduler restart.

luozhaoyu edited a comment on issue #10541:
URL: https://github.com/apache/airflow/issues/10541#issuecomment-683225572


   I also encountered the same issue using:
   1. a manifest generated from the helm chart master branch
   2. KubernetesPodOperator
   3. both minikube and a real k8s cluster
   4. the docker image 1.10.12-python3.8
   
   
   ```
   airflow@airflow-scheduler-54797f7ddb-5bsb7:/opt/airflow$ airflow run my_example start1 2020-08-24T09:00:00+00:00 -sd /tmp/my_example.py
   [2020-08-29 02:51:24,996] {settings.py:233} DEBUG - Setting up DB connection pool (PID 22402)
   [2020-08-29 02:51:24,996] {settings.py:273} DEBUG - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=22402
   [2020-08-29 02:51:25,162] {sentry.py:179} DEBUG - Could not configure Sentry: No module named 'blinker', using DummySentry instead.
   [2020-08-29 02:51:25,228] {__init__.py:45} DEBUG - Cannot import  due to  doesn't look like a module path
   [2020-08-29 02:51:25,467] {cli_action_loggers.py:42} DEBUG - Adding <function default_action_log at 0x7f112d7b3430> to pre execution callback
   [2020-08-29 02:51:25,861] {cli_action_loggers.py:68} DEBUG - Calling callbacks: [<function default_action_log at 0x7f112d7b3430>]
   [2020-08-29 02:51:25,887] {settings.py:233} DEBUG - Setting up DB connection pool (PID 22402)
   [2020-08-29 02:51:25,887] {settings.py:241} DEBUG - settings.configure_orm(): Using NullPool
   /home/airflow/.local/lib/python3.8/site-packages/airflow/kubernetes/pod_generator.py:39: DeprecationWarning: This module is deprecated. Please use `airflow.kubernetes.pod`.
     from airflow.contrib.kubernetes.pod import _extract_volume_mounts
   [2020-08-29 02:51:26,196] {__init__.py:50} INFO - Using executor KubernetesExecutor
   [2020-08-29 02:51:26,200] {dagbag.py:417} INFO - Filling up the DagBag from /tmp/my_example.py
   [2020-08-29 02:51:26,201] {dagbag.py:245} DEBUG - Importing /tmp/my_example.py
   [2020-08-29 02:51:26,210] {dagbag.py:384} DEBUG - Loaded DAG <DAG: my_example>
   Running %s on host %s <TaskInstance: my_example.start1 2020-08-24T09:00:00+00:00 [None]> airflow-scheduler-54797f7ddb-5bsb7
   Traceback (most recent call last):
     File "/home/airflow/.local/bin/airflow", line 37, in <module>
       args.func(args)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 76, in wrapper
       return f(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/bin/cli.py", line 579, in run
       _run(args, dag, ti)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/bin/cli.py", line 500, in _run
       executor.start()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py", line 786, in start
       self.clear_not_launched_queued_tasks()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/db.py", line 74, in wrapper
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py", line 719, in clear_not_launched_queued_tasks
       pod_list = self.kube_client.list_namespaced_pod(
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py", line 12803, in list_namespaced_pod
       (data) = self.list_namespaced_pod_with_http_info(namespace, **kwargs)  # noqa: E501
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py", line 12891, in list_namespaced_pod_with_http_info
       return self.api_client.call_api(
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 340, in call_api
       return self.__call_api(resource_path, method,
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 172, in __call_api
       response_data = self.request(
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 362, in request
       return self.rest_client.GET(url,
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 237, in GET
       return self.request("GET", url,
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 231, in request
       raise ApiException(http_resp=r)
   kubernetes.client.rest.ApiException: (403)
   Reason: Forbidden
   HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Sat, 29 Aug 2020 02:51:26 GMT', 'Content-Length': '282'})
   HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"system:serviceaccount:airflow:airflow\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\"","reason":"Forbidden","details":{"kind":"pods"},"code":403}
   ``` 
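   
   The 403 body points at the root cause: the scheduler's service account (`system:serviceaccount:airflow:airflow`) is not allowed to list pods in the `default` namespace, which is what `clear_not_launched_queued_tasks` queries when the executor's namespace is left at its default. Two things are worth checking (both are assumptions read off the error message, not confirmed against this deployment): point the KubernetesExecutor at the namespace the pods actually run in (e.g. `AIRFLOW__KUBERNETES__NAMESPACE=airflow`), and make sure that service account has a Role/RoleBinding covering pods there. A minimal RBAC sketch:
   
   ```yaml
   # Sketch only: assumes pods run in the "airflow" namespace and the
   # scheduler uses the "airflow" service account, as the error suggests.
   apiVersion: rbac.authorization.k8s.io/v1
   kind: Role
   metadata:
     name: airflow-pod-manager
     namespace: airflow
   rules:
     - apiGroups: [""]
       resources: ["pods"]
       verbs: ["get", "list", "watch", "create", "delete"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: RoleBinding
   metadata:
     name: airflow-pod-manager
     namespace: airflow
   subjects:
     - kind: ServiceAccount
       name: airflow
       namespace: airflow
   roleRef:
     kind: Role
     name: airflow-pod-manager
     apiGroup: rbac.authorization.k8s.io
   ```
   
   Once applied, `kubectl auth can-i list pods --namespace airflow --as=system:serviceaccount:airflow:airflow` should answer `yes` before the task is retried.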
   
   This is my DAG:
   ```python
   from airflow import DAG
   from datetime import datetime, timedelta
   from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
   from airflow.operators.dummy_operator import DummyOperator
   
   
   default_args = {
       'owner': 'airflow',
       'depends_on_past': False,
       'start_date': datetime.now() - timedelta(days=1),
       'email': ['airflow@example.com'],
       'email_on_failure': False,
       'email_on_retry': False,
       'retries': 1,
       'retry_delay': timedelta(minutes=5)
   }
   
   dag = DAG(
       'my_example', default_args=default_args)
   
   
   start1 = KubernetesPodOperator(
       namespace='airflow',
       image="python:3.6",
       image_pull_policy="Always",
       cmds=["python", "-c"],
       arguments=["print('hello world')"],
       labels={"foo": "bar"},
       name="start1",
       resources={"request_cpu": "256m", "limit_cpu": "1",
                  "request_memory": "256Mi", "limit_memory": "1Gi"},
       task_id="start1",
       get_logs=True,
       dag=dag,
   )
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org