Posted to commits@airflow.apache.org by "antonio-antuan (via GitHub)" <gi...@apache.org> on 2023/03/01 15:36:04 UTC
[GitHub] [airflow] antonio-antuan opened a new issue, #29841: high memory leak, cannot start even webserver
antonio-antuan opened a new issue, #29841:
URL: https://github.com/apache/airflow/issues/29841
### Apache Airflow version
2.5.1
### What happened
I had been using Airflow 2.3.1 and everything was fine.
Then I decided to move to Airflow 2.5.1.
I can't even start the webserver: Airflow on my laptop consumes the entire memory (32 GB) until the OOM killer steps in.
I investigated a bit. The problem starts with Airflow 2.3.4, only when using the official Docker image (apache/airflow:2.3.4), and only on my Linux laptop; on Mac it is fine.
The memory growth starts when the source code tries to import, for example, the `airflow.cli.commands.webserver_command` module using `airflow.utils.module_loading.import_string`.
I dug deeper and found that it happens when `import daemon` is performed.
You can reproduce it with this command: `docker run --rm --entrypoint="" apache/airflow:2.3.4 /bin/bash -c "python -c 'import daemon'"`. Once again, it reproduces only on Linux (my kernel is 6.1.12).
That's weird considering `daemon` hasn't been changed since 2018.
### What you think should happen instead
_No response_
### How to reproduce
docker run --rm --entrypoint="" apache/airflow:2.3.4 /bin/bash -c "python -c 'import daemon'"
### Operating System
Arch Linux (kernel 6.1.12)
### Versions of Apache Airflow Providers
_No response_
### Deployment
Docker-Compose
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #29841: high memory leak, cannot start even webserver
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29841:
URL: https://github.com/apache/airflow/issues/29841#issuecomment-1450518290
Thanks for looking into it and finding that the root cause is python-daemon. So far we had missed the root cause, but we knew that the new containerd has the problem because of its changed settings.
This is a known issue with containerd changing its default settings. We have been discussing it recently in https://github.com/apache/airflow/discussions/29731, and the discussion also contains some workarounds that you can use.
Knowing that it's python-daemon opens up the possibility that we can patch it somehow while waiting for either containerd to revert the change or python-daemon to fix its behaviour.
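As an illustration of what such a patch could look like (a minimal sketch, not necessarily one of the workarounds from the linked discussion), the process could lower its own file-descriptor limits before `import daemon` runs. The helper name `cap_nofile_limit` and the cap of 65536 are assumptions made for this example only.

```python
import resource


def cap_nofile_limit(max_fds=65536):
    """Lower RLIMIT_NOFILE so that neither limit exceeds max_fds.

    Hypothetical helper: lowering the hard limit cannot be undone by an
    unprivileged process, so call it only if that is acceptable.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    def capped(value):
        # Treat "infinity" and anything above the cap as too large.
        if value == resource.RLIM_INFINITY or value > max_fds:
            return max_fds
        return value

    new_hard = capped(hard)
    new_soft = min(capped(soft), new_hard)
    if (new_soft, new_hard) != (soft, hard):
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, new_hard))
    return new_soft, new_hard
```

Calling `cap_nofile_limit()` before the first `import daemon` would keep the descriptor range that python-daemon computes at import time small, at the cost of permanently reducing the limit for that process and its children.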
[GitHub] [airflow] antonio-antuan commented on issue #29841: high memory leak, cannot start even webserver
Posted by "antonio-antuan (via GitHub)" <gi...@apache.org>.
antonio-antuan commented on issue #29841:
URL: https://github.com/apache/airflow/issues/29841#issuecomment-1450434725
created an issue for python-daemon: https://pagure.io/python-daemon/issue/72
[GitHub] [airflow] potiuk commented on issue #29841: high memory leak, cannot start even webserver
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29841:
URL: https://github.com/apache/airflow/issues/29841#issuecomment-1454756641
`python-daemon==3.0.0` has been released with the fix. You can upgrade to it, and that should fix the problem.
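To check whether an environment already has the fixed release, something like the following sketch may help. It assumes Python 3.8 or newer (for `importlib.metadata`), and the helper name `has_daemon_fix` is made up for this example; it only compares the major version number.

```python
from importlib.metadata import PackageNotFoundError, version


def has_daemon_fix(version_string):
    """Return True if the version is 3.0.0 or newer (major version >= 3)."""
    major = version_string.split(".")[0]
    return major.isdigit() and int(major) >= 3


if __name__ == "__main__":
    try:
        installed = version("python-daemon")
    except PackageNotFoundError:
        print("python-daemon is not installed")
    else:
        print(f"python-daemon {installed}: fix present = {has_daemon_fix(installed)}")
```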
[GitHub] [airflow] antonio-antuan commented on issue #29841: high memory leak, cannot start even webserver
Posted by "antonio-antuan (via GitHub)" <gi...@apache.org>.
antonio-antuan commented on issue #29841:
URL: https://github.com/apache/airflow/issues/29841#issuecomment-1450419914
OK, clarified.
This is the place where it happens: site-packages/daemon/daemon.py:868
```
def get_maximum_file_descriptors():
    """ Get the maximum number of open file descriptors for this process.

        :return: The number (integer) to use as the maximum number of open
            files for this process.

        The maximum is the process hard resource limit of maximum number of
        open file descriptors. If the limit is “infinity”, a default value
        of ``MAXFD`` is returned.
        """
    (__, hard_limit) = resource.getrlimit(resource.RLIMIT_NOFILE)

    result = hard_limit
    if hard_limit == resource.RLIM_INFINITY:
        result = MAXFD

    return result


_total_file_descriptor_range = (0, get_maximum_file_descriptors())
_total_file_descriptor_set = set(range(*_total_file_descriptor_range))
```
apache/airflow:2.3.1 (and above I think) doesn't have this version of the library, so nothing happens during initialization.
I'd like to suggest better ulimits for the image.
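To see why the last line of the snippet above is so expensive, here is a rough sketch that reads the current limits and estimates how much memory `set(range(hard_limit))` would need. The helper `estimate_fd_set_bytes` is hypothetical and extrapolates linearly from a small sample, so it gives an order-of-magnitude figure only.

```python
import resource
import sys


def estimate_fd_set_bytes(limit, sample_size=100_000):
    """Roughly estimate the bytes needed to build set(range(limit)).

    Measures a small sample set and scales linearly, so treat the result
    as an order-of-magnitude figure only.
    """
    sample = set(range(sample_size))
    # Hash-table cost per element plus the cost of one int object.
    per_element = sys.getsizeof(sample) / sample_size + sys.getsizeof(sample_size)
    return int(per_element * limit)


if __name__ == "__main__":
    soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE soft={soft_limit} hard={hard_limit}")
    if hard_limit != resource.RLIM_INFINITY:
        needed = estimate_fd_set_bytes(hard_limit)
        print(f"set(range(hard_limit)) needs roughly {needed} bytes")
```

With a hard limit in the thousands the set is negligible. With a hard limit near 2**30 (a hypothetical value, not one reported in this thread) the estimate would likely reach tens of gigabytes, which would be consistent with the OOM behaviour described in the issue.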
[GitHub] [airflow] antonio-antuan commented on issue #29841: high memory leak, cannot start even webserver
Posted by "antonio-antuan (via GitHub)" <gi...@apache.org>.
antonio-antuan commented on issue #29841:
URL: https://github.com/apache/airflow/issues/29841#issuecomment-1450531952
Glad to help :)
[GitHub] [airflow] potiuk closed issue #29841: high memory leak, cannot start even webserver
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #29841: high memory leak, cannot start even webserver
URL: https://github.com/apache/airflow/issues/29841