Posted to commits@airflow.apache.org by "ut0mt8 (via GitHub)" <gi...@apache.org> on 2023/03/01 10:47:46 UTC

[GitHub] [airflow] ut0mt8 opened a new issue, #29834: Airflow Stdout or live Logging with Kubernetes Executor

ut0mt8 opened a new issue, #29834:
URL: https://github.com/apache/airflow/issues/29834

   ### Apache Airflow version
   
   2.5.1
   
   ### What happened
   
   I use the Kubernetes executor. With some rough patches and manual tweaking it works, more or less.
   The one very annoying issue is that, despite all my attempts, I cannot get "live" logging of my running DAGs.
   S3 logging works fine, but the logs are displayed in the UI only after the DAG run has finished, which is very frustrating.
   
   What is also broken, in my opinion, is that the only way to see what is happening live is to go into the pod and read the local log file (which is not correctly reported in the UI). If logs at least went to stdout, they would be viewable through the Kubernetes logs.
   
   ### What you think should happen instead
   
   Logs should be viewable live on the UI, or at least on stdout.
   
   ### How to reproduce
   
   The setup is Airflow 2.5.1 with the Kubernetes executor and a custom logging config.
   
   ````
   # config/log_config.py
   from copy import deepcopy
   from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG
   
   LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)
   LOGGING_CONFIG["handlers"]["stdout"] = {
       "class": "logging.StreamHandler",
       "formatter": "airflow",
       "stream": "ext://sys.stdout",
       "filters": ["mask_secrets"],
   }
   
   LOGGING_CONFIG['loggers']['airflow.task']['handlers'] = ['stdout', 'task']
   ````
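
The pattern above (copy a base config, register a stdout handler, attach it to one logger) can be tried without Airflow. The following is only a minimal sketch using the standard library: `BASE_CONFIG`, the `demo.task` logger, the `demo` formatter and the `with_stdout_handler` helper are hypothetical stand-ins for `DEFAULT_LOGGING_CONFIG` and `airflow.task`, and the `mask_secrets` filter is left out because it is Airflow-specific.

```python
import logging
import logging.config
from copy import deepcopy

# Hypothetical stand-in for Airflow's DEFAULT_LOGGING_CONFIG; it contains
# only the pieces needed for this demonstration.
BASE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "demo": {"format": "%(name)s %(levelname)s - %(message)s"},
    },
    "handlers": {
        # Plays the role of the existing "task" handler; discards records.
        "task": {"class": "logging.NullHandler"},
    },
    "loggers": {
        "demo.task": {"handlers": ["task"], "level": "INFO", "propagate": False},
    },
}


def with_stdout_handler(base, logger_name):
    """Return a copy of `base` with a stdout handler added to `logger_name`."""
    config = deepcopy(base)  # leave the base config untouched
    config["handlers"]["stdout"] = {
        "class": "logging.StreamHandler",
        "formatter": "demo",
        "stream": "ext://sys.stdout",  # resolved when dictConfig() runs
    }
    config["loggers"][logger_name]["handlers"] = ["stdout", "task"]
    return config


if __name__ == "__main__":
    logging.config.dictConfig(with_stdout_handler(BASE_CONFIG, "demo.task"))
    logging.getLogger("demo.task").error("this record also goes to stdout")
```

Note that `ext://sys.stdout` is resolved once, when `dictConfig()` is called, so the handler keeps writing to whatever `sys.stdout` was at that moment.
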
   
   ````
   # airflow.cfg
   ## Logs ##
       [logging]
       base_log_folder = /opt/airflow/logs
       logging_level = WARNING
       celery_logging_level =
       logging_config_class = log_config.LOGGING_CONFIG
       task_log_reader = task
       extra_logger_names = connexion,sqlalchemy
       worker_log_server_port = 8793
   
       colored_console_log = False
       colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
       colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
   
       log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
       simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
   
       task_log_prefix_template = {ti.dag_id}-{ti.task_id}-{execution_date}-{try_number}
       log_filename_template = {{ $.Values.config.log_filename_template }}
       log_processor_filename_template = {{ $.Values.config.log_processor_filename_template }}
       dag_processor_manager_log_location = /opt/airflow/logs/dag_processor_manager/dag_processor_manager.log
   
       # Remote logs
       remote_logging = True
       remote_log_conn_id = aws_default
       remote_base_log_folder = {{ $.Values.config.log_folder }}
       encrypt_s3_logs = False
   ````
   
   And the example DAG:
   
   ````
   import datetime
   import logging
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   from airflow.utils.dates import days_ago
   from kubernetes.client import models as k8s
   
   log: logging.Logger = logging.getLogger("airflow")
   log.setLevel(logging.INFO)
   
   LOGGER = logging.getLogger("airflow.task")
   LOGGER2 = logging.getLogger(__name__)
   
   with DAG(
       dag_id='example_kubernetes',
       description="example_kubernetes",
       schedule_interval=None,
       start_date=days_ago(2)
   ) as dag:
   
       def stuff(ds, **kwargs):
           import time
   
           run_date = str(ds)
           log.info("run date is : " + str(run_date))
           LOGGER.error("airflow.task >>> plifff ")
           LOGGER2.error("__name__ >>> plofff ")
   
           with open("/tmp/myfile.txt", "w") as f:
               # The context manager closes the file on exit.
               f.write("Now the file has more content!")
   
           time.sleep(200)
           log.info("finished")
   
       task = PythonOperator(
           task_id="task",
           python_callable=stuff,
           executor_config={
               "pod_override": k8s.V1Pod(
                   spec=k8s.V1PodSpec(
                       containers=[
                           k8s.V1Container(
                               name="base",
                               resources=k8s.V1ResourceRequirements(
                                   requests={
                                       "cpu": "100m",
                                       "memory": "256Mi"
                                   },
                                   limits={
                                       "cpu": "1000m",
                                       "memory": "420Mi"
                                   }
                               )
                           )
                       ]
                   )
               )
           }
       )
   
   task
   ````
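
Since the configuration above sets `logging_level = WARNING` while the DAG calls both `.info()` and `.error()`, it may help to recall how the standard library decides which records are emitted. The sketch below is not a diagnosis of this issue and does not reproduce Airflow's logger setup; the `demo_parent` logger names, `ListHandler` and `emitted_messages` are hypothetical and exist only to illustrate level inheritance.

```python
import logging


class ListHandler(logging.Handler):
    """Collect records in a list instead of writing them anywhere."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink

    def emit(self, record):
        self.sink.append(f"{record.name}:{record.levelname}:{record.getMessage()}")


def emitted_messages(parent_level):
    """Log one INFO and one ERROR record through a child logger that
    inherits its effective level from a parent set to `parent_level`."""
    sink = []
    parent = logging.getLogger("demo_parent")
    child = logging.getLogger("demo_parent.task")
    parent.handlers = [ListHandler(sink)]
    parent.propagate = False          # keep the demo away from the root logger
    parent.setLevel(parent_level)
    child.setLevel(logging.NOTSET)    # NOTSET means "use the parent's level"
    child.info("info from child")
    child.error("error from child")
    return sink


if __name__ == "__main__":
    print(emitted_messages(logging.WARNING))
    print(emitted_messages(logging.INFO))
```

With the parent at `WARNING`, only the error record reaches the handler; with the parent at `INFO`, both do. Whether this matters here depends on how the actual loggers are configured.
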
   
   ### Operating System
   
   Docker/Kubernetes
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   Slightly modified Airflow Helm charts
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk closed issue #29834: Airflow Stdout or live Logging with Kubernetes Executor

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #29834: Airflow Stdout or live Logging with Kubernetes Executor
URL: https://github.com/apache/airflow/issues/29834

