Posted to commits@airflow.apache.org by "taharah (via GitHub)" <gi...@apache.org> on 2023/10/05 18:39:32 UTC
[I] Airflow task log handling broken for non-Kubernetes executors [airflow]
taharah opened a new issue, #34783:
URL: https://github.com/apache/airflow/issues/34783
### Apache Airflow version
Other Airflow 2 version (please specify below)
### What happened
After upgrading Airflow from 2.5.3 to 2.6.3, none of our task logs for the ECS operator were being written to the configured file and S3 remote write handlers.
### What you think should happen instead
Airflow task logs should be written to the configured handlers.
### How to reproduce
Configure a remote write handler for a non-Kubernetes based executor.
### Operating System
Amazon Linux 2
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon = "7.4.1"
apache-airflow-providers-celery = "3.3.4"
### Deployment
Other
### Deployment details
Airflow is deployed to Amazon ECS.
### Anything else
The issue seems to be a result of the change from #28440. Specifically, the change to remove the handlers from the source logger (https://github.com/apache/airflow/pull/28440/files#diff-ad618185a072910e49c11770954af009d1cc070b120a4fde5f2fc095a588271bR704), which was not the previous behavior.
I am happy to submit a change for this; however, I would first like some direction on what others feel is the best path forward, e.g., making this behavior configurable or reverting to the original behavior.
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Airflow task log handling broken for non-Kubernetes executors [airflow]
Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on issue #34783:
URL: https://github.com/apache/airflow/issues/34783#issuecomment-1803147884
> I'll work on putting together PRs to make the necessary changes.
Hi @taharah did you get a chance to work on it?
Re: [I] Airflow task log handling broken for non-Kubernetes executors [airflow]
Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #34783:
URL: https://github.com/apache/airflow/issues/34783#issuecomment-1749722882
Could you provide a bit more detail, e.g. an example of your DAG? I've been unable to reproduce this on the current main branch:
```console
[2023-10-05, 22:05:02 UTC] {ecs.py:531} INFO - Running ECS Task - Task definition: sample-task:1 - on cluster test-cluster
[2023-10-05, 22:05:02 UTC] {ecs.py:534} INFO - EcsOperator overrides: {'containerOverrides': [{'name': 'busybox', 'command': ['/bin/sh', '-c', 'echo regular; sleep 6; echo finish']}]}
[2023-10-05, 22:05:02 UTC] {base.py:73} INFO - Using connection ID 'aws_default' for task execution.
[2023-10-05, 22:05:02 UTC] {credentials.py:1123} INFO - Found credentials in environment variables.
[2023-10-05, 22:05:04 UTC] {ecs.py:647} INFO - ECS task ID is: aef0db066c2f48faad6303efc27180f0
[2023-10-05, 22:05:04 UTC] {ecs.py:573} INFO - Starting ECS Task Log Fetcher
[2023-10-05, 22:05:34 UTC] {base.py:73} INFO - Using connection ID 'aws_default' for task execution.
[2023-10-05, 22:05:34 UTC] {credentials.py:1123} INFO - Found credentials in environment variables.
[2023-10-05, 22:05:35 UTC] {task_log_fetcher.py:63} INFO - [2023-10-05 22:05:17,228] regular
[2023-10-05, 22:05:35 UTC] {task_log_fetcher.py:63} INFO - [2023-10-05 22:05:23,229] finish
```
Re: [I] Airflow task log handling broken for non-Kubernetes executors [airflow]
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34783:
URL: https://github.com/apache/airflow/issues/34783#issuecomment-1749449121
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
Re: [I] Airflow task log handling broken for non-Kubernetes executors [airflow]
Posted by "taharah (via GitHub)" <gi...@apache.org>.
taharah commented on issue #34783:
URL: https://github.com/apache/airflow/issues/34783#issuecomment-1750713541
@Taragolis thank you for looking into this so quickly! Shortly after opening this issue, I was able to identify the real root cause for why our logs stopped being written to S3. We had added a logger for `airflow` with `propagate` set to `False`, in order to write those logs only to a file and not to the console. The config was:
```python
LOGGING_CONFIG["loggers"]["airflow"] = {
    "handlers": ["file"],
    "level": LOG_LEVEL,
    "propagate": False,
}
```
After removing the aforementioned logger, our task logs began to show up as expected. However, the reason we hit this issue was still due to the changes made in #28440. The changes made in that PR introduced a hard requirement for the `airflow.task` logs to be propagated to the root logger in order for the handlers associated with `airflow.task` to be invoked.
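The propagation dependency described above can be reproduced with the standard library alone. The following is a minimal sketch, not Airflow's own code: it assumes the task handler ends up attached to the root logger (as the issue describes for #28440), and `ListHandler` and `task_messages_seen_on_root` are hypothetical names introduced only for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for a file/S3 task handler; stores messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def task_messages_seen_on_root(parent_propagates: bool) -> list:
    """Log once via `airflow.task` and return what a root-level handler received."""
    root = logging.getLogger()
    parent = logging.getLogger("airflow")
    task_logger = logging.getLogger("airflow.task")

    task_handler = ListHandler()
    saved_propagate = parent.propagate
    saved_task_level = task_logger.level

    # Assumption: mimic the configuration from the comment above by toggling
    # `propagate` on the intermediate `airflow` logger.
    parent.propagate = parent_propagates
    task_logger.setLevel(logging.INFO)
    # Assumption: the task handler lives on the root logger, not on `airflow.task`.
    root.addHandler(task_handler)
    try:
        task_logger.info("hello from the task")
    finally:
        # Undo every change so the sketch leaves global logging state untouched.
        root.removeHandler(task_handler)
        parent.propagate = saved_propagate
        task_logger.setLevel(saved_task_level)
    return task_handler.messages
```

With `parent_propagates=True` the record travels from `airflow.task` through `airflow` to the root logger and reaches the handler. With `False` it stops at `airflow`, so a handler on the root logger never sees it, which matches the symptom reported here.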
Thus, a couple of things still need to be done in order to fully resolve this issue.
1. Update the task logging documentation to note that `airflow.task` logs must be able to propagate to the root logger.
2. While investigating this issue, I noticed a small bug in the new implementation that makes its behavior differ from the original: the root logger's configuration (its handlers and log level) is not restored when the context manager exits.
I'll work on putting together PRs to make the necessary changes.
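For the second item, the kind of restore-on-exit behavior being asked for could look roughly like the sketch below. This is an assumption about the shape of a fix, not the code from #28440 or any Airflow helper; `preserve_logger_state` is a hypothetical name.

```python
import logging
from contextlib import contextmanager


@contextmanager
def preserve_logger_state(logger: logging.Logger):
    """Snapshot a logger's handlers, level, and propagate flag; restore them on exit."""
    saved_handlers = list(logger.handlers)
    saved_level = logger.level
    saved_propagate = logger.propagate
    try:
        yield logger
    finally:
        # Restore in place so other references to `logger.handlers` stay valid.
        logger.handlers[:] = saved_handlers
        logger.setLevel(saved_level)
        logger.propagate = saved_propagate


# Usage: any mutation made inside the block is rolled back afterwards.
root_logger = logging.getLogger()
with preserve_logger_state(root_logger):
    root_logger.addHandler(logging.NullHandler())
    root_logger.setLevel(logging.DEBUG)
```

Because the restore runs in a `finally` clause, the root logger returns to its earlier handlers and level even if the wrapped code raises.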