Posted to notifications@apisix.apache.org by GitBox <gi...@apache.org> on 2022/03/26 02:14:11 UTC

[GitHub] [apisix] pguokun opened a new issue #6725: bug: memory keeps growing until OOM

pguokun opened a new issue #6725:
URL: https://github.com/apache/apisix/issues/6725


   ### Current Behavior
   
   ## Environment
   Server: Alibaba Cloud (8 cores, 32 GB)
   Operating system: CentOS 7.9
   
   ## Symptom: memory keeps growing
   ![image](https://user-images.githubusercontent.com/15042141/160220515-2f6f890a-160d-4909-b5df-a38faf07d5a7.png)
   
   ## Custom APISIX configuration
   http_configuration_snippet:
       client_header_buffer_size 4k;
       client_body_buffer_size  512k;
       large_client_header_buffers 4 32k;
       sendfile on;
       tcp_nopush     on;
       tcp_nodelay on;
       proxy_connect_timeout    300;
       proxy_read_timeout       300;
       proxy_send_timeout       300;
       proxy_buffer_size        32k;
       proxy_buffers            4 16k;
       proxy_busy_buffers_size 32k;
       proxy_temp_file_write_size 32k;
   All other settings are the open-source version defaults.
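
   For reference, a minimal sketch of where such a snippet is normally placed in APISIX's `conf/config.yaml`, under `nginx_config.http_configuration_snippet` (the values simply mirror the ones above and are illustrative only):

   ```yaml
   nginx_config:
     # the snippet below is rendered verbatim into the http {} block of the generated nginx.conf
     http_configuration_snippet: |
       client_header_buffer_size 4k;
       client_body_buffer_size   512k;
       large_client_header_buffers 4 32k;
       proxy_buffer_size 32k;
       proxy_buffers     4 16k;
   ```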
   
   ## Plugins in use
   kafka-logger
   prometheus
   zipkin
   
   
   ### Expected Behavior
   
   _No response_
   
   ### Error Logs
   
   _No response_
   
   ### Steps to Reproduce
   
   1. Installed via Helm on Alibaba Cloud Container Service for Kubernetes (ACK)
   2. APISIX logs are pushed to Kafka (a minimal sketch follows below)
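
   To make step 2 concrete, a minimal sketch of enabling kafka-logger for all routes via a global rule through the Admin API; the broker address, topic, and admin key below are placeholders rather than the reporter's actual values:

   ```bash
   # illustrative only: push access logs to Kafka for every route
   curl http://127.0.0.1:9080/apisix/admin/global_rules/1 \
     -H 'X-API-KEY: <admin-key>' -X PUT -d '
   {
     "plugins": {
       "kafka-logger": {
         "broker_list": { "127.0.0.1": 9092 },
         "kafka_topic": "apisix-access-log",
         "batch_max_size": 1000
       }
     }
   }'
   ```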
   
   ### Environment
   
   - APISIX version (run `apisix version`): 2.12.1
   - Operating system (run `uname -a`): Linux apisix-6d7cd4c778-28jgx 3.10.0-1160.15.2.el7.x86_64
   - OpenResty / Nginx version (run `openresty -V` or `nginx -V`): openresty/1.19.9.1
   - etcd version, if relevant (run `curl http://127.0.0.1:9090/v1/server_info`): none
   - APISIX Dashboard version, if relevant: 2.10.1
   - Plugin runner version, for issues related to plugin runners:
   - LuaRocks version, for installation issues (run `luarocks --version`):
   


[GitHub] [apisix] tzssangglass commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
tzssangglass commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1079818811


   Is the memory usage of instance `29zhg` normal? Is there any difference?


[GitHub] [apisix] GhangZh commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
GhangZh commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1081917613


   I have the same problem. APISIX is deployed on k8s with a 64 GB memory limit on the pod. In a pressure test pushing 2 GB of traffic with large file uploads it OOMs, but there is no OOM problem with Traefik under an 8 GB memory limit.


[GitHub] [apisix] jagerzhang edited a comment on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
jagerzhang edited a comment on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1080081313


   @pguokun I have run into the same problem on Tencent Cloud TKE (Kubernetes). Memory growth is roughly linear: it slowly climbs past the preset HPA threshold, HPA then scales the deployment out, and it keeps scaling out without ever scaling back in; only destroying and recreating the pods brings things back to the initial state.
   
   The plugins we use are:
   
   Global plugins:
   - request-id
   - response-rewrite
   - kafka-logger
   
   Route plugins:
   - proxy-rewrite (5%)
   - consumer-restriction (70%)
   - hmac (70%)
   - limit-count (70%)
   
   Below is a 2-core / 2 GB Pod; the 1-core / 1 GB Pods we used before would trigger HPA scale-out after about half a month of running.
   ![image](https://user-images.githubusercontent.com/9711651/160310159-bb340827-cbf4-47b6-9097-d7f8530d2aa4.png)
   ![image](https://user-images.githubusercontent.com/9711651/160310128-72dd4e65-a495-4e5a-825a-9b0d584b1971.png)
   
   I am not sure whether there is a GC mechanism at play here. If there is, is it only triggered once memory usage reaches a certain level? Running inside a container, the memory a process sees is actually the host's memory rather than the container's limit, so could that mean GC never gets triggered? Python 3.7 and earlier have exactly this problem: they see the host's resources, so GC is never proactively run.
   
   The above is just for reference.


[GitHub] [apisix] tokers commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
tokers commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1080074756


   @pguokun What about other metrics like QPS, CPU?


[GitHub] [apisix] tokers commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
tokers commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1082518661


   Memory allocated by `client_body_buffer_size` is not managed by LuaJIT. It will be recycled after the request is finalized.


[GitHub] [apisix] spacewander commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
spacewander commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1082539004


   > client_body_buffer_size 10240m
   
   This buffer is preallocated per request. If you want to reduce disk I/O, you can disable request buffering instead, via the proxy-control plugin: https://apisix.apache.org/docs/apisix/plugins/proxy-control
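
   For illustration, a minimal sketch of disabling request buffering on a single route with proxy-control through the Admin API; the uri, upstream node, and admin key are placeholders:

   ```bash
   # illustrative only: stop APISIX from buffering the request body for this route
   curl http://127.0.0.1:9080/apisix/admin/routes/1 \
     -H 'X-API-KEY: <admin-key>' -X PUT -d '
   {
     "uri": "/upload",
     "plugins": {
       "proxy-control": { "request_buffering": false }
     },
     "upstream": {
       "type": "roundrobin",
       "nodes": { "127.0.0.1:8080": 1 }
     }
   }'
   ```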


[GitHub] [apisix] kwanhur commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
kwanhur commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1079661060


   If possible, I suggest disabling all the plugins to rule out their effects, then re-enabling them one by one.
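
   A minimal sketch of doing that declaratively: when a top-level `plugins` list is present in `conf/config.yaml`, only the plugins named there are loaded, so you can start small and re-enable one plugin at a time (the names below are just the ones mentioned in this thread):

   ```yaml
   # conf/config.yaml
   plugins:
     - prometheus          # re-enable one plugin at a time while watching memory
     # - kafka-logger
     # - zipkin
   ```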


[GitHub] [apisix] spacewander commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
spacewander commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1081414570


   LuaJIT tracks the memory it has allocated in the `GCState.total` field, so GC does not depend on the host's memory info.
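
   A small Lua sketch (runnable with `luajit` or the `resty` CLI that ships with OpenResty; the allocation size is arbitrary) illustrating that the collector reacts to LuaJIT's own allocation counter rather than to host or container memory:

   ```lua
   -- check_gc.lua: LuaJIT's GC is driven by the bytes it has allocated itself
   local before = collectgarbage("count")        -- KB currently on the Lua heap
   local t = {}
   for i = 1, 1000000 do t[i] = { i } end        -- allocate a few tens of MB
   print(string.format("before: %.0f KB, after alloc: %.0f KB",
                       before, collectgarbage("count")))
   t = nil                                        -- drop the reference
   collectgarbage("collect")                      -- run a full GC cycle
   print(string.format("after collect: %.0f KB", collectgarbage("count")))
   ```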


[GitHub] [apisix] GhangZh edited a comment on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
GhangZh edited a comment on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1081917613


   I have the same problem, deployed on k8s with a 64 GB memory limit. I set client_body_buffer_size to 10240m. In a pressure test with 20 Gb of traffic uploading large files it OOMs. If I remove the client_body_buffer_size setting, disk I/O becomes very high. I also tried pointing client_body_temp_path at /dev/shm, and it also OOMs.
   ```bash
       sendfile on;
       tcp_nopush on;
       tcp_nodelay on;
       client_header_buffer_size 16m;
       large_client_header_buffers 4 16m;
       client_body_buffer_size 10240m;
       client_max_body_size 10240m;
       proxy_buffering off;
       proxy_buffers 4 32k;
       proxy_buffer_size 32k;
       proxy_connect_timeout 30s;
       proxy_send_timeout   600s;
       proxy_read_timeout   600s;
       proxy_cache off;
       proxy_request_buffering off;
       client_body_temp_path /dev/shm;
   ```
   But with Traefik and an 8 GB memory limit there is no OOM problem.


[GitHub] [apisix] jagerzhang commented on issue #6725: help request: memory keeps growing until OOM

Posted by GitBox <gi...@apache.org>.
jagerzhang commented on issue #6725:
URL: https://github.com/apache/apisix/issues/6725#issuecomment-1081442042


   > LuaJIT tracks the memory it has allocated in the `GCState.total` field, so GC does not depend on the host's memory info.
   
   Got it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@apisix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org