You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/21 11:49:14 UTC
[GitHub] [airflow] SamJia opened a new issue #21013: logging in Operator not work
SamJia opened a new issue #21013:
URL: https://github.com/apache/airflow/issues/21013
### Apache Airflow version
2.2.3 (latest released)
### What happened
Operator logging not work.
For example, for a task with logging:
![image](https://user-images.githubusercontent.com/12780802/150499992-d939cdfc-0e3f-471e-b3ad-7232b8f313d9.png)
the log in webserver is:
![image](https://user-images.githubusercontent.com/12780802/150500091-e357c1b8-8cab-424f-aed1-21c957c6d6ed.png)
no any log information are recorded.
In my view, the reason is that when a TaskInstance is loaded through session with DagRun.run_id in function `_get_ti` in airflow/cli/commands/task_command.py, ti._log object is not properly initialzed.
![image](https://user-images.githubusercontent.com/12780802/150521672-0a9fd2d6-c231-4b9a-812e-78910c067bd4.png)
Then, when call 'ti.init_run_context', ti._log will be set as logging.getLogger(self.__class__.__module__ + '.' + self.__class__.__name__) according to function log of class LoggingMixin in airflow/utils/log/logging_mixin.py. And the context of logging.getLogger('airflow.task') is not properly set.
![image](https://user-images.githubusercontent.com/12780802/150521756-1816cde0-9616-48d6-aca9-e18203e21f1b.png)
Following, in function '_capture_task_logs' in airflow/cli/commands/task_command.py, the root logger configuration is replaced with the airflow.task configuration, which context is not set.
![image](https://user-images.githubusercontent.com/12780802/150521822-4c6a8840-2a60-459b-a063-2b22a0706295.png)
It means all the loging happen after the replacement cannot be recorded.
Possible way to fix:
In function `refresh_from_task`, initialize self._log like follows:
![image](https://user-images.githubusercontent.com/12780802/150521531-3b53c984-1a81-40da-8b04-d99babe1de54.png)
Moreover, it may be also addeed in function `refresh_from_db`
### What you expected to happen
I wish the log looks like this:
![image](https://user-images.githubusercontent.com/12780802/150502087-9b8dd557-f60a-45fe-a179-9223b32dfbba.png)
### How to reproduce
My config related to logging in airlfow.cfg:
```
[logging]
base_log_folder = /home/airflow/airflow/logs
remote_logging = False
remote_log_conn_id =
google_key_path =
remote_base_log_folder =
encrypt_s3_logs = False
logging_level = INFO
fab_logging_level = INFO
logging_config_class =
colored_console_log = True
colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
task_log_prefix_template =
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = /home/airflow/airflow/logs/dag_processor_manager/dag_processor_manager.log
task_log_reader = task
extra_logger_names =
worker_log_server_port = 8793
```
My DAG and task:
```
import datetime
from airflow import DAG
from airflow.decorators import task
from airflow.operators.python import PythonOperator
def test_log():
import logging
logging.info("airflow.task >>> INFO logger test")
logging.warn("airflow.task >>> WARN logger test")
logging.error("airflow.task >>> ERROR logger test")
logging.critical("airflow.task >>> CRITICAL logger test")
with DAG(
'test_log',
description='test log',
schedule_interval=None,
start_date=datetime.datetime(2022, 1, 19),
catchup=False,
tags=['log'],
) as dag:
task = PythonOperator(
task_id='test_log_task',
python_callable=test_log,
dag=dag
)
```
### Operating System
NAME="Ubuntu" VERSION="18.04.4 LTS (Bionic Beaver)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 18.04.4 LTS" VERSION_ID="18.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic
### Versions of Apache Airflow Providers
None
### Deployment
Virtualenv installation
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] SamJia commented on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
SamJia commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624809
> Actually I have tried to use logger `airflow.task`, but also failed
![image](https://user-images.githubusercontent.com/12780802/150556056-90dfae46-1352-434a-a455-8668289d58c9.png)
![image](https://user-images.githubusercontent.com/12780802/150556096-640b9487-50c4-4691-82fd-0d18606d9ca3.png)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] SamJia removed a comment on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
SamJia removed a comment on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624182
Actually I have tried to use logger `airflow.task`, but also failed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk closed issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
potiuk closed issue #21013:
URL: https://github.com/apache/airflow/issues/21013
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018493602
You are using logging module improperly. By `logging,info` (and similar) you are logging stuff to "root" logger:
https://docs.python.org/3/library/logging.html#logging.info
> logging.info(msg, *args, **kwargs)¶
Logs a message with level INFO on the root logger. The arguments are interpreted as for debug().
Instead you should log message to "airflow.task" logger if you want messages to show up in task log:
```
logger = logging.getLogger("airflow.task")
logger.info(...)
``
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018436785
Thanks for opening your first issue here! Be sure to follow the issue template!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] SamJia edited a comment on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
SamJia edited a comment on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624809
> You are using logging module improperly. By `logging,info` (and similar) you are logging stuff to "root" logger:
>
> https://docs.python.org/3/library/logging.html#logging.info
>
> > logging.info(msg, *args, **kwargs)¶
> > Logs a message with level INFO on the root logger. The arguments are interpreted as for debug().
>
> Instead you should log message to "airflow.task" logger if you want messages to show up in task log:
>
> ```
> logger = logging.getLogger("airflow.task")
> logger.info(...)
> ``
> ```
Actually I have tried to use logger `airflow.task`, but also failed
![image](https://user-images.githubusercontent.com/12780802/150556056-90dfae46-1352-434a-a455-8668289d58c9.png)
![image](https://user-images.githubusercontent.com/12780802/150556096-640b9487-50c4-4691-82fd-0d18606d9ca3.png)
Moreover, considering the level and handler of `root` logger are replaced with the level and handler of `airflow.task` logger, I think they should work same in task operators.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] SamJia commented on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
SamJia commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018624182
Actually I have tried to use logger `airflow.task`, but also failed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #21013: logging in Operator not work
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #21013:
URL: https://github.com/apache/airflow/issues/21013#issuecomment-1018656967
Cannot reproduce it with "standard" airflow configuration. You must have some custom logging configuration in local settings that causes it:
```python
import datetime
from airflow import DAG
from airflow.decorators import task
from airflow.operators.python import PythonOperator
def test_log():
import logging
logger = logging.getLogger("airflow.task")
logger.info("airflow.task >>> INFO logger test")
logger.warn("airflow.task >>> WARN logger test")
logger.error("airflow.task >>> ERROR logger test")
logger.critical("airflow.task >>> CRITICAL logger test")
with DAG(
'test_log',
description='test log',
schedule_interval=None,
start_date=datetime.datetime(2022, 1, 19),
catchup=False,
tags=['log'],
) as dag:
task = PythonOperator(
task_id='test_log_task',
python_callable=test_log,
dag=dag
)
```
Resulting log:
```
*** Reading local file: /root/airflow/logs/test_log/test_log_task/2022-01-21T16:17:12.235643+00:00/1.log
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: test_log.test_log_task manual__2022-01-21T16:17:12.235643+00:00 [queued]>
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: test_log.test_log_task manual__2022-01-21T16:17:12.235643+00:00 [queued]>
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1238} INFO -
--------------------------------------------------------------------------------
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1239} INFO - Starting attempt 1 of 1
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1240} INFO -
--------------------------------------------------------------------------------
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1259} INFO - Executing <Task(PythonOperator): test_log_task> on 2022-01-21 16:17:12.235643+00:00
[2022-01-21, 16:17:12 UTC] {standard_task_runner.py:52} INFO - Started process 465 to run task
[2022-01-21, 16:17:12 UTC] {standard_task_runner.py:76} INFO - Running: ['***', 'tasks', 'run', 'test_log', 'test_log_task', 'manual__2022-01-21T16:17:12.235643+00:00', '--job-id', '3', '--raw', '--subdir', 'DAGS_FOLDER/test.py', '--cfg-path', '/tmp/tmpjrm_lflu', '--error-file', '/tmp/tmpowwm0hd3']
[2022-01-21, 16:17:12 UTC] {standard_task_runner.py:77} INFO - Job 3: Subtask test_log_task
[2022-01-21, 16:17:12 UTC] {logging_mixin.py:109} INFO - Running <TaskInstance: test_log.test_log_task manual__2022-01-21T16:17:12.235643+00:00 [running]> on host afc5ced1b873
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1426} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=test_log
AIRFLOW_CTX_TASK_ID=test_log_task
AIRFLOW_CTX_EXECUTION_DATE=2022-01-21T16:17:12.235643+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-01-21T16:17:12.235643+00:00
[2022-01-21, 16:17:12 UTC] {test.py:12} INFO - ***.task >>> INFO logger test
[2022-01-21, 16:17:12 UTC] {test.py:13} WARNING - ***.task >>> WARN logger test
[2022-01-21, 16:17:12 UTC] {test.py:14} ERROR - ***.task >>> ERROR logger test
[2022-01-21, 16:17:12 UTC] {test.py:15} CRITICAL - ***.task >>> CRITICAL logger test
[2022-01-21, 16:17:12 UTC] {python.py:175} INFO - Done. Returned value was: None
[2022-01-21, 16:17:12 UTC] {taskinstance.py:1277} INFO - Marking task as SUCCESS. dag_id=test_log, task_id=test_log_task, execution_date=20220121T161712, start_date=20220121T161712, end_date=20220121T161712
[2022-01-21, 16:17:12 UTC] {local_task_job.py:154} INFO - Task exited with return code 0
[2022-01-21, 16:17:12 UTC] {local_task_job.py:264} INFO - 0 downstream tasks scheduled from follow-on schedule check
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org