Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/21 11:49:14 UTC

[GitHub] [airflow] SamJia opened a new issue #21013: logging in Operator not work

SamJia opened a new issue #21013:
URL: https://github.com/apache/airflow/issues/21013


   ### Apache Airflow version
   
   2.2.3 (latest released)
   
   ### What happened
   
   Operator logging does not work.
   For example, for a task with logging:
   ![image](https://user-images.githubusercontent.com/12780802/150499992-d939cdfc-0e3f-471e-b3ad-7232b8f313d9.png)
   the log in the webserver is:
   ![image](https://user-images.githubusercontent.com/12780802/150500091-e357c1b8-8cab-424f-aed1-21c957c6d6ed.png)
   No log information is recorded at all.
   
   In my view, the reason is that when a TaskInstance is loaded through the session by DagRun.run_id in the function `_get_ti` in airflow/cli/commands/task_command.py, the ti._log object is not properly initialized.
   ![image](https://user-images.githubusercontent.com/12780802/150521672-0a9fd2d6-c231-4b9a-812e-78910c067bd4.png)
   
   Then, when `ti.init_run_context` is called, ti._log is set to logging.getLogger(self.__class__.__module__ + '.' + self.__class__.__name__) by the `log` property of the LoggingMixin class in airflow/utils/log/logging_mixin.py, and the context of logging.getLogger('airflow.task') is not properly set.
   ![image](https://user-images.githubusercontent.com/12780802/150521756-1816cde0-9616-48d6-aca9-e18203e21f1b.png)
   
   Subsequently, in the function `_capture_task_logs` in airflow/cli/commands/task_command.py, the root logger configuration is replaced with the airflow.task configuration, whose context has not been set.
   ![image](https://user-images.githubusercontent.com/12780802/150521822-4c6a8840-2a60-459b-a063-2b22a0706295.png)
   
   This means that none of the logging that happens after the replacement can be recorded.
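   
   For reference, a simplified sketch of the lazy initialization described above (paraphrased for illustration; not the exact Airflow source):
   
   ```python
   import logging
   
   
   class LoggingMixin:
       """Simplified sketch of the behaviour described above, not the real class:
       if _log was never set (e.g. because the TaskInstance was loaded straight
       from the session in `_get_ti`), the property falls back to a logger named
       after the class's module and name (e.g. "airflow.models.taskinstance.TaskInstance")
       rather than the configured "airflow.task" logger, so the task log context
       is never attached."""
   
       _log = None
   
       @property
       def log(self):
           if self._log is None:
               self._log = logging.getLogger(
                   self.__class__.__module__ + "." + self.__class__.__name__
               )
           return self._log
   ```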
   
   A possible way to fix this:
   
   In the function `refresh_from_task`, initialize self._log as follows:
   ![image](https://user-images.githubusercontent.com/12780802/150521531-3b53c984-1a81-40da-8b04-d99babe1de54.png)
   
   Moreover, the same initialization could also be added in the function `refresh_from_db`.
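   
   Since the screenshot above cannot be reproduced in text, here is a rough, hypothetical sketch of the kind of initialization meant (the image shows the actual proposal; `_init_task_logger` is only an illustrative helper, not an Airflow function):
   
   ```python
   import logging
   
   
   def _init_task_logger(ti):
       """Hypothetical illustration of the fix described above: called from
       refresh_from_task (and possibly refresh_from_db) so that ti.log resolves
       to the configured "airflow.task" logger instead of LoggingMixin's
       class-name fallback."""
       ti._log = logging.getLogger("airflow.task")
   ```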
   
   ### What you expected to happen
   
   I expect the log to look like this:
   ![image](https://user-images.githubusercontent.com/12780802/150502087-9b8dd557-f60a-45fe-a179-9223b32dfbba.png)
   
   
   ### How to reproduce
   
   My logging-related config in airflow.cfg:
   ```
   [logging]
   base_log_folder = /home/airflow/airflow/logs
   remote_logging = False
   remote_log_conn_id =
   google_key_path =
   remote_base_log_folder =
   encrypt_s3_logs = False
   logging_level = INFO
   fab_logging_level = INFO
   logging_config_class =
   colored_console_log = True
   colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
   colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
   log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
   simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
   task_log_prefix_template =
   log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
   log_processor_filename_template = {{ filename }}.log
   dag_processor_manager_log_location = /home/airflow/airflow/logs/dag_processor_manager/dag_processor_manager.log
   task_log_reader = task
   extra_logger_names =
   worker_log_server_port = 8793
   ```
   
   My DAG and task:
   
   ```python
   import datetime
   
   from airflow import DAG
   from airflow.decorators import task
   from airflow.operators.python import PythonOperator
   
   
   def test_log():
       import logging
   
       logging.info("airflow.task >>> INFO logger test")
       logging.warn("airflow.task >>> WARN logger test")
       logging.error("airflow.task >>> ERROR logger test")
       logging.critical("airflow.task >>> CRITICAL logger test")
   
   with DAG(
       'test_log',
       description='test log',
       schedule_interval=None,
       start_date=datetime.datetime(2022, 1, 19),
       catchup=False,
       tags=['log'],
   ) as dag:
       task = PythonOperator(
           task_id='test_log_task',
           python_callable=test_log,
           dag=dag
       )
   ```
   
   ### Operating System
   
   NAME="Ubuntu" VERSION="18.04.4 LTS (Bionic Beaver)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 18.04.4 LTS" VERSION_ID="18.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic
   
   ### Versions of Apache Airflow Providers
   
   None
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   





[GitHub] [airflow] SamJia commented on issue #21013: logging in Operator not work

SamJia commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624809


   > Actually, I have tried using the `airflow.task` logger, but it also failed.
   
   ![image](https://user-images.githubusercontent.com/12780802/150556056-90dfae46-1352-434a-a455-8668289d58c9.png)
   ![image](https://user-images.githubusercontent.com/12780802/150556096-640b9487-50c4-4691-82fd-0d18606d9ca3.png)
   





[GitHub] [airflow] SamJia removed a comment on issue #21013: logging in Operator not work

SamJia removed a comment on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624182


   Actually, I have tried using the `airflow.task` logger, but it also failed.





[GitHub] [airflow] potiuk closed issue #21013: logging in Operator not work

potiuk closed issue #21013:
URL: https://github.com/apache/airflow/issues/21013


   





[GitHub] [airflow] potiuk commented on issue #21013: logging in Operator not work

potiuk commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018493602


   You are using the logging module improperly. With `logging.info` (and similar) you are logging to the "root" logger:
   
   https://docs.python.org/3/library/logging.html#logging.info
   
   > logging.info(msg, *args, **kwargs)
   > Logs a message with level INFO on the root logger. The arguments are interpreted as for debug().
   
   Instead, you should log messages to the "airflow.task" logger if you want them to show up in the task log:
   
   ```python
   logger = logging.getLogger("airflow.task")
   logger.info(...)
   ```
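   
   To illustrate the point outside of Airflow: handlers attached to a named logger only see records emitted on that logger (or its descendants); records sent to the root logger never travel "down" to it. A minimal standalone sketch:
   
   ```python
   import logging
   
   # A handler on the root logger and a handler on the "airflow.task" logger.
   root = logging.getLogger()
   root.setLevel(logging.INFO)
   root_handler = logging.StreamHandler()
   root_handler.setFormatter(logging.Formatter("[root handler] %(message)s"))
   root.addHandler(root_handler)
   
   task_logger = logging.getLogger("airflow.task")
   task_handler = logging.StreamHandler()
   task_handler.setFormatter(logging.Formatter("[airflow.task handler] %(message)s"))
   task_logger.addHandler(task_handler)
   
   # Only the root handler sees this record; propagation goes upwards, never down.
   logging.info("sent to the root logger")
   
   # Both handlers see this record: it hits the airflow.task handler first and
   # then propagates up to the root handler.
   task_logger.info("sent to the airflow.task logger")
   ```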





[GitHub] [airflow] boring-cyborg[bot] commented on issue #21013: logging in Operator not work

boring-cyborg[bot] commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018436785


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] SamJia edited a comment on issue #21013: logging in Operator not work

SamJia edited a comment on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624809


   > You are using the logging module improperly. With `logging.info` (and similar) you are logging to the "root" logger:
   > 
   > https://docs.python.org/3/library/logging.html#logging.info
   > 
   > > logging.info(msg, *args, **kwargs)
   > > Logs a message with level INFO on the root logger. The arguments are interpreted as for debug().
   > 
   > Instead, you should log messages to the "airflow.task" logger if you want them to show up in the task log:
   > 
   > ```python
   > logger = logging.getLogger("airflow.task")
   > logger.info(...)
   > ```
   
   
   Actually, I have tried using the `airflow.task` logger, but it also failed.
   
   ![image](https://user-images.githubusercontent.com/12780802/150556056-90dfae46-1352-434a-a455-8668289d58c9.png)
   ![image](https://user-images.githubusercontent.com/12780802/150556096-640b9487-50c4-4691-82fd-0d18606d9ca3.png)
   
   Moreover, considering that the level and handlers of the `root` logger are replaced with those of the `airflow.task` logger, I think the two should behave the same inside task operators.
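   
   To make the reasoning concrete, a standalone sketch (outside Airflow), assuming the root logger's level and handlers really are swapped for those of `airflow.task` as `_capture_task_logs` is described to do:
   
   ```python
   import logging
   
   # A stand-in for the configured "airflow.task" logger and its task handler
   # (a StreamHandler takes the place of Airflow's real file handler here).
   task_logger = logging.getLogger("airflow.task")
   task_logger.setLevel(logging.INFO)
   task_handler = logging.StreamHandler()
   task_handler.setFormatter(logging.Formatter("[task handler] %(message)s"))
   task_logger.addHandler(task_handler)
   
   # The swap described above: give the root logger the same level and handlers.
   root = logging.getLogger()
   root.setLevel(task_logger.level)
   root.handlers[:] = task_logger.handlers
   
   # After the swap, a plain logging.info(...) on the root logger also goes
   # through the task handler, which is why the two are expected to behave the same.
   logging.info("root logger record, captured by the task handler after the swap")
   ```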





[GitHub] [airflow] SamJia commented on issue #21013: logging in Operator not work

SamJia commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624182


   Actually, I have tried using the `airflow.task` logger, but it also failed.





[GitHub] [airflow] potiuk commented on issue #21013: logging in Operator not work

potiuk commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018656967


   I cannot reproduce it with the "standard" Airflow configuration. You must have some custom logging configuration in local settings that causes it:
   
   
   ```python
   import datetime
   
   from airflow import DAG
   from airflow.decorators import task
   from airflow.operators.python import PythonOperator
   
   
   def test_log():
       import logging
   
       logger = logging.getLogger("airflow.task")
       logger.info("airflow.task >>> INFO logger test")
       logger.warn("airflow.task >>> WARN logger test")
       logger.error("airflow.task >>> ERROR logger test")
       logger.critical("airflow.task >>> CRITICAL logger test")
   
   with DAG(
       'test_log',
       description='test log',
       schedule_interval=None,
       start_date=datetime.datetime(2022, 1, 19),
       catchup=False,
       tags=['log'],
   ) as dag:
       task = PythonOperator(
           task_id='test_log_task',
           python_callable=test_log,
           dag=dag
       )
   
   ```
   
   Resulting log:
   ```
   
   *** Reading local file: /root/airflow/logs/test_log/test_log_task/2022-01-21T16:17:12.235643+00:00/1.log
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: test_log.test_log_task manual__2022-01-21T16:17:12.235643+00:00 [queued]>
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: test_log.test_log_task manual__2022-01-21T16:17:12.235643+00:00 [queued]>
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1238} INFO - 
   --------------------------------------------------------------------------------
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1239} INFO - Starting attempt 1 of 1
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1240} INFO - 
   --------------------------------------------------------------------------------
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1259} INFO - Executing <Task(PythonOperator): test_log_task> on 2022-01-21 16:17:12.235643+00:00
   [2022-01-21, 16:17:12 UTC] {standard_task_runner.py:52} INFO - Started process 465 to run task
   [2022-01-21, 16:17:12 UTC] {standard_task_runner.py:76} INFO - Running: ['***', 'tasks', 'run', 'test_log', 'test_log_task', 'manual__2022-01-21T16:17:12.235643+00:00', '--job-id', '3', '--raw', '--subdir', 'DAGS_FOLDER/test.py', '--cfg-path', '/tmp/tmpjrm_lflu', '--error-file', '/tmp/tmpowwm0hd3']
   [2022-01-21, 16:17:12 UTC] {standard_task_runner.py:77} INFO - Job 3: Subtask test_log_task
   [2022-01-21, 16:17:12 UTC] {logging_mixin.py:109} INFO - Running <TaskInstance: test_log.test_log_task manual__2022-01-21T16:17:12.235643+00:00 [running]> on host afc5ced1b873
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1426} INFO - Exporting the following env vars:
   AIRFLOW_CTX_DAG_OWNER=***
   AIRFLOW_CTX_DAG_ID=test_log
   AIRFLOW_CTX_TASK_ID=test_log_task
   AIRFLOW_CTX_EXECUTION_DATE=2022-01-21T16:17:12.235643+00:00
   AIRFLOW_CTX_DAG_RUN_ID=manual__2022-01-21T16:17:12.235643+00:00
   [2022-01-21, 16:17:12 UTC] {test.py:12} INFO - ***.task >>> INFO logger test
   [2022-01-21, 16:17:12 UTC] {test.py:13} WARNING - ***.task >>> WARN logger test
   [2022-01-21, 16:17:12 UTC] {test.py:14} ERROR - ***.task >>> ERROR logger test
   [2022-01-21, 16:17:12 UTC] {test.py:15} CRITICAL - ***.task >>> CRITICAL logger test
   [2022-01-21, 16:17:12 UTC] {python.py:175} INFO - Done. Returned value was: None
   [2022-01-21, 16:17:12 UTC] {taskinstance.py:1277} INFO - Marking task as SUCCESS. dag_id=test_log, task_id=test_log_task, execution_date=20220121T161712, start_date=20220121T161712, end_date=20220121T161712
   [2022-01-21, 16:17:12 UTC] {local_task_job.py:154} INFO - Task exited with return code 0
   [2022-01-21, 16:17:12 UTC] {local_task_job.py:264} INFO - 0 downstream tasks scheduled from follow-on schedule check
   
   ```

