Posted to commits@airflow.apache.org by "Jon Davies (JIRA)" <ji...@apache.org> on 2018/08/28 13:14:00 UTC

[jira] [Created] (AIRFLOW-2970) Kubernetes logging is broken

Jon Davies created AIRFLOW-2970:
-----------------------------------

             Summary: Kubernetes logging is broken
                 Key: AIRFLOW-2970
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2970
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Jon Davies
            Assignee: Daniel Imberman


I'm using Airflow with the Kubernetes executor and the KubernetesPodOperator. My DAGs are configured with get_logs=True and everything they run logs to stdout, so I can see all of the output in kubectl logs.
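
Roughly, the DAGs look like this (a minimal sketch; the image, namespace and command below are placeholders, not my real DAG):

{code:python}
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

dag = DAG(
    dag_id="dag",
    start_date=datetime(2018, 8, 1),
    schedule_interval=None,
)

downloader = KubernetesPodOperator(
    task_id="dag-downloader",
    name="dag-downloader",
    namespace="default",                    # placeholder namespace
    image="example/downloader:latest",      # placeholder image
    cmds=["python", "-u", "download.py"],   # -u so output goes straight to stdout
    get_logs=True,                          # stream the pod's stdout into the task log
    dag=dag,
)
{code}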

I can see that the scheduler logs things to: $AIRFLOW_HOME/logs/scheduler/2018-08-28/*

However, those files consist only of entries like:

{code:java}
[2018-08-28 13:03:27,695] {jobs.py:385} INFO - Started process (PID=16994) to work on /home/airflow/dags/dag.py
[2018-08-28 13:03:27,697] {jobs.py:1782} INFO - Processing file /home/airflow/dags/dag.py for tasks to queue
[2018-08-28 13:03:27,697] {logging_mixin.py:95} INFO - [2018-08-28 13:03:27,697] {models.py:258} INFO - Filling up the DagBag from /home/airflow/dags/dag.py
{code}

If I quickly exec into the executor pod that the scheduler spins up, I can see that things are properly logged to:

{code:java}
/home/airflow/logs/dag$ tail -f dag-downloader/2018-08-28T13\:05\:07.704072+00\:00/1.log
[2018-08-28 13:05:24,399] {logging_mixin.py:95} INFO - [2018-08-28 13:05:24,399] {pod_launcher.py:112} INFO - Event: dag-downloader-015ca48c had an event of type Pending
...
[2018-08-28 13:05:37,193] {logging_mixin.py:95} INFO - [2018-08-28 13:05:37,193] {pod_launcher.py:95} INFO - b'INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS connection (7): blah-blah.s3.eu-west-1.amazonaws.com\n'
...
...all other log lines from pod...
{code}

However, that executor pod only exists for the lifetime of the task pod, so these logs are lost pretty much immediately after the task runs.
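
Presumably the per-task log files would need to be shipped off the worker pod before it is deleted, e.g. via Airflow's remote logging (a minimal airflow.cfg sketch; the bucket and connection id are hypothetical and assume an S3 backend is available):

{code}
[core]
# Ship task logs to remote storage so they survive pod deletion
remote_logging = True
remote_base_log_folder = s3://example-bucket/airflow/logs
remote_log_conn_id = aws_default
{code}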



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)