You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/26 16:43:49 UTC

[GitHub] [airflow] michalmisiewicz opened a new issue #10586: KubernetesPodOperator truncates logs

michalmisiewicz opened a new issue #10586:
URL: https://github.com/apache/airflow/issues/10586


   <!--
   
   Welcome to Apache Airflow!  For a smooth issue process, try to answer the following questions.
   Don't worry if they're not all applicable; just try to include what you can :-)
   
   If you need to include code snippets or logs, please put them in fenced code
   blocks.  If they're super-long, please use the details tag like
   <details><summary>super-long log</summary> lots of stuff </details>
   
   Please delete these comment blocks before submitting the issue.
   
   -->
   
   <!--
   
   IMPORTANT!!!
   
   PLEASE CHECK "SIMILAR TO X EXISTING ISSUES" OPTION IF VISIBLE
   NEXT TO "SUBMIT NEW ISSUE" BUTTON!!!
   
   PLEASE CHECK IF THIS ISSUE HAS BEEN REPORTED PREVIOUSLY USING SEARCH!!!
   
   Please complete the next sections or the issue will be closed.
   This questions are the first thing we need to know to understand the context.
   
   -->
   
   **Apache Airflow version**:  1.10.10
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): v1.15.11
   
   KubernetesPodOperator truncates logs when container produced more than 10 lines of logs before execution of `read_pod_logs` function. Is there any reason to make 10 as default value for `tail_lines` argument ?
   ```python
       def read_pod_logs(self, pod: V1Pod, tail_lines: int = 10):
           """Reads log from the POD"""
           try:
               return self._client.read_namespaced_pod_log(
                   name=pod.metadata.name,
                   namespace=pod.metadata.namespace,
                   container='base',
                   follow=True,
                   tail_lines=tail_lines,
                   _preload_content=False
               )
           except BaseHTTPError as e:
               raise AirflowException(
                   'There was an error reading the kubernetes API: {}'.format(e)
               )
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] michalmisiewicz commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
michalmisiewicz commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683947340


   > I'd be interested in making a PR to expose the tail_lines property to the user in the k8sPodOperator since this would allow us to get back onto the official version of the operator in future releases.
   
   The fix should be applied also for KubernetesExecutor 🙂
   I'm now testing stability of KubernetesExecutor with tail_lines equal to 0.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dobromirmontauk commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
dobromirmontauk commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-689961846


   Would love to see a fix to this!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
SamWheating commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683850951


   Hey, I've been running into this issue and we've created a modified copy of k8spodoperator as a workaround. I'd be interested in making a PR to expose the tail_lines property to the user in the k8sPodOperator.
   
   If its alright with you, you can assign this issue to me and I can get a PR up hopefully sometime this week?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] michalmisiewicz closed issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
michalmisiewicz closed issue #10586:
URL: https://github.com/apache/airflow/issues/10586


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating edited a comment on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
SamWheating edited a comment on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683850951


   👋 
   
   I've been running into this issue and we've created a modified copy of k8spodoperator as a workaround. I'd be interested in making a PR to expose the tail_lines property to the user in the k8sPodOperator since this would allow us to get back onto the official version of the operator in future releases. 
   
   If its alright with you, you can assign this issue to me and I can get a PR up hopefully sometime this week?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating edited a comment on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
SamWheating edited a comment on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683850951


   Hey, I've been running into this issue and we've created a modified copy of k8spodoperator as a workaround. I'd be interested in making a PR to expose the tail_lines property to the user in the k8sPodOperator since this would allow us to get back onto the official version of the operator in future releases. 
   
   If its alright with you, you can assign this issue to me and I can get a PR up hopefully sometime this week?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dimberman commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
dimberman commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683959236


   @SamWheating I'd say retrieve all logs in all cases. The fact that these are truncated is a bug. Glad to help whenever you have a PR ready!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
SamWheating commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683955857


   Sounds good. From the kubernetes python client source:
   ```
   :param int tail_lines: If set, the number of lines from the end of the logs to show. If not specified, logs are shown from the creation of the container or sinceSeconds or sinceTime
   ```
   Is there a reason users would ever want truncate lines? Maybe its just easier to not set this value and retrieve all logs in all cases. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] michalmisiewicz commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
michalmisiewicz commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-681003042


   Looking on previous K8S related PRs I would like to ask @kaxil for help 🙂 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
SamWheating commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683926160


   Yup, I'm just updating our clusters to 1.10.12 and then I'll try to replicate the error case (a pod producing a lot of logs quickly and then shutting down).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dimberman commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
dimberman commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683864815


   @SamWheating assigned :). Could you please make sure that this issue is persistent in 1.10.12?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
SamWheating commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-683942214


   Replicated the issue in Airflow 1.10.12 like so:
   
   Created a DAG to really quickly log a bunch of stuff in a k8s pod:
   ```python
   from datetime import timedelta
   
   from airflow import models
   from airflow import utils
   from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
   
   dag = models.DAG(
       'data-infrastructure-examples.sam-k8s-fast-logging',
       start_date=utils.dates.days_ago(1),
       max_active_runs=1,
       dagrun_timeout=timedelta(minutes=1),
       schedule_interval=None
   )
   
   hello_world_pod = KubernetesPodOperator(namespace='data-infrastructure-examples',
                                           image="busybox",
                                           name="count-to-10000",
                                           in_cluster=True,
                                           task_id="count-to-10000",
                                           get_logs=True,
                                           dag=dag,
                                           is_delete_operator_pod=False,
                                           cmds=['seq', '10000']
                                           )
   
   ```
   
   Running the DAG then only provides the last 10 lines of the log in the airflow UI:
   ```
   [2020-08-31 18:08:08,673] {taskinstance.py:901} INFO - Executing <Task(KubernetesPodOperator): count-to-10000> on 2020-08-31T18:08:01.278616+00:00
   [2020-08-31 18:08:08,678] {standard_task_runner.py:54} INFO - Started process 34196 to run task
   [2020-08-31 18:08:08,848] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'data-infrastructure-examples.sam-k8s-fast-logging', 'count-to-10000', '2020-08-31T18:08:01.278616+00:00', '--job_id', '57658', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/data_infrastructure_examples/kubernetes_fast_logs.py', '--cfg_path', '/tmp/tmp1jvekhel']
   [2020-08-31 18:08:08,849] {standard_task_runner.py:78} INFO - Job 57658: Subtask count-to-10000
   [2020-08-31 18:08:08,955] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: data-infrastructure-examples.sam-k8s-fast-logging.count-to-10000 2020-08-31T18:08:01.278616+00:00 [running]> airflow-worker-7dcb77d8dc-bfxnw
   [2020-08-31 18:08:09,027] {logging_mixin.py:112} WARNING - /usr/local/lib/python3.7/site-packages/airflow/kubernetes/pod_launcher.py:330: DeprecationWarning: Using `airflow.contrib.kubernetes.pod.Pod` is deprecated. Please use `k8s.V1Pod`.
     security_context=_extract_security_context(pod.spec.security_context)
   [2020-08-31 18:08:09,027] {logging_mixin.py:112} WARNING - /usr/local/lib/python3.7/site-packages/airflow/kubernetes/pod_launcher.py:77: DeprecationWarning: Using `airflow.contrib.kubernetes.pod.Pod` is deprecated. Please use `k8s.V1Pod` instead.
     pod = self._mutate_pod_backcompat(pod)
   [2020-08-31 18:08:09,699] {pod_launcher.py:173} INFO - Event: count-to-10000-f146290239a24243b48f8dbf0a26c8e5 had an event of type Pending
   [2020-08-31 18:08:09,699] {pod_launcher.py:139} WARNING - Pod not yet started: count-to-10000-f146290239a24243b48f8dbf0a26c8e5
   [2020-08-31 18:08:10,709] {pod_launcher.py:173} INFO - Event: count-to-10000-f146290239a24243b48f8dbf0a26c8e5 had an event of type Pending
   [2020-08-31 18:08:10,709] {pod_launcher.py:139} WARNING - Pod not yet started: count-to-10000-f146290239a24243b48f8dbf0a26c8e5
   [2020-08-31 18:08:11,716] {pod_launcher.py:173} INFO - Event: count-to-10000-f146290239a24243b48f8dbf0a26c8e5 had an event of type Pending
   [2020-08-31 18:08:11,716] {pod_launcher.py:139} WARNING - Pod not yet started: count-to-10000-f146290239a24243b48f8dbf0a26c8e5
   [2020-08-31 18:08:12,725] {pod_launcher.py:173} INFO - Event: count-to-10000-f146290239a24243b48f8dbf0a26c8e5 had an event of type Succeeded
   [2020-08-31 18:08:12,726] {pod_launcher.py:287} INFO - Event with job id count-to-10000-f146290239a24243b48f8dbf0a26c8e5 Succeeded
   [2020-08-31 18:08:12,751] {pod_launcher.py:156} INFO - b'9991\n'
   [2020-08-31 18:08:12,751] {pod_launcher.py:156} INFO - b'9992\n'
   [2020-08-31 18:08:12,751] {pod_launcher.py:156} INFO - b'9993\n'
   [2020-08-31 18:08:12,751] {pod_launcher.py:156} INFO - b'9994\n'
   [2020-08-31 18:08:12,752] {pod_launcher.py:156} INFO - b'9995\n'
   [2020-08-31 18:08:12,752] {pod_launcher.py:156} INFO - b'9996\n'
   [2020-08-31 18:08:12,752] {pod_launcher.py:156} INFO - b'9997\n'
   [2020-08-31 18:08:12,752] {pod_launcher.py:156} INFO - b'9998\n'
   [2020-08-31 18:08:12,752] {pod_launcher.py:156} INFO - b'9999\n'
   [2020-08-31 18:08:12,752] {pod_launcher.py:156} INFO - b'10000\n'
   [2020-08-31 18:08:12,766] {pod_launcher.py:173} INFO - Event: count-to-10000-f146290239a24243b48f8dbf0a26c8e5 had an event of type Succeeded
   [2020-08-31 18:08:12,766] {pod_launcher.py:287} INFO - Event with job id count-to-10000-f146290239a24243b48f8dbf0a26c8e5 Succeeded
   [2020-08-31 18:08:12,772] {pod_launcher.py:173} INFO - Event: count-to-10000-f146290239a24243b48f8dbf0a26c8e5 had an event of type Succeeded
   [2020-08-31 18:08:12,772] {pod_launcher.py:287} INFO - Event with job id count-to-10000-f146290239a24243b48f8dbf0a26c8e5 Succeeded
   [2020-08-31 18:08:12,793] {taskinstance.py:1070} INFO - Marking task as SUCCESS.dag_id=data-infrastructure-examples.sam-k8s-fast-logging, task_id=count-to-10000, execution_date=20200831T180801, start_date=20200831T180808, end_date=20200831T180812
   [2020-08-31 18:08:13,568] {local_task_job.py:102} INFO - Task exited with return code 0
   ```
   
   While checking the logs on the completed pod show the entire count:
   ```
   > kubectl logs -f -n data-infrastructure-examples count-to-10000-f146290239a24243b48f8dbf0a26c8e5
   1
   2
   3
   4
   5
   6
   (all the other numbers in the sequence)
   9999
   10000
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] michalmisiewicz commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
michalmisiewicz commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-710426149


   Fixed in #11325 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #10586: KubernetesPodOperator truncates logs

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #10586:
URL: https://github.com/apache/airflow/issues/10586#issuecomment-681076412


   tagging in @dimberman if you have any idea about this


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org