You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficserver.apache.org by GitBox <gi...@apache.org> on 2022/03/15 20:35:38 UTC

[GitHub] [trafficserver] pbchou opened a new issue #8728: Lua Plugin memory leak on remap configuration reloads.

pbchou opened a new issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728


   Hi. We are experiencing a memory leak on remap configuration reloads while using the Lua plugin. This seems to be a known issue based on comments in the source code. We will need a fix as our Lua scripts leak close 1GB of memory on every reload in production.
   
   Mention @shukitchan from the original mailing list discussion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1069691941


   Some interesting testing results --
   
   I am running with a remap configuration consisting of eight lines. Each line uses the same single state for the Lua plugin. Each allocates 100MB of memory when it is initialized. So the total allocation is 800MB per remap configuration loading.
   
   I am varying three variables --
   1. free(ih) on or off
   2. clear global registry on or off
   3. running garbage collector on demand on or off
   
   Tests --
   * ATS 9.1.1 -free -global -gc --> after x8 reloads allocation is 7.2GB used and never decreases
   * ATS 9.1.1 +free -global -gc --> allocation cycles between 2.4/3.2GB down to 1.6GB [ So it turns out the GC is run on its own, but it appears to be based on some kind of threshold. some times goes up 2.4GB before cycles and other times 3.2GB before cycles. Note that it only frees down to 1.6GB not 800MB. ]
   * ATS 9.1.1 +free +global -gc --> allocation cycles between 1.8-2.5GB down to 0.8-1.3GB [ So it will free more memory than the above case. ]
   * ATS 9.1.1 +free +global +gc --> allocation cycles between 1.6GB down to 0.8GB [ This is the same behavior as with 7.1.4. ]


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] shukitchan commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
shukitchan commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1071630990


   1) I am thinking whether we can do the gc in __clean__ function instead in your own lua script ? Will that make a difference ?
   
   2) I am checking #6737 and am thinking whether we still need ref count or not in our case. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1068611232


   Yes, both steps seem to be required based on my testing. If I am correct in that remap create new instances all run in one thread (and the same is true for deletes), then it should be straight-forward to reference count the instance handle to control clearing the global registry and freeing the instance handle. However, what was the reason for removing the garbage collector calls? The commit reference mentioned fixing core dumps?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] randall closed issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
randall closed issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1071492075


   @shukitchan -- Please see the commit that I added off the master branch in my personal fork. It should be referenced in this issue above this comment. Not sure what direction you wanted to take with this. If you think this is alright to go forward, then I can open a PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] shukitchan commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
shukitchan commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1071630990


   1) I am thinking whether we can do the gc in __clean__ function instead in your own lua script ? Will that make a difference ?
   
   2) I am checking #6737 and am thinking whether we still need ref count or not in our case. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1068573969


   I can't really tell except with the global and file-local variables since I am testing with 100MB chunks. There might be leakages of other things, but they are too small to register with only the small test script. You would probably have to run with tcmalloc or jemalloc with leak checking turned on to find any other small leaks.
   
   Couple of notes --
   
   1. For 9.1.1 we also cherry-picked the following #8589 (memory leak) and #8080 (core dump) at the recommendations of the community. 
   
   2. It seems that I had to reverse the following commits 36a8cd (which removed the calling the LuaJIT garbage collector when deleting remap instances) and 1f645a (which removed the freeing of the remap instance handle and the clearing of the global registry in the Lua states when deleting remap instances) to get the global and file-local variables to be cleared on reloads. I don't think reversing either one on its own will do it. I think the clearing of the registry and calling the garbage collector are probably more important than the leaking of the instance handle structure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1071492075


   @shukitchan -- Please see the commit that I added off the master branch in my personal fork. It should be referenced in this issue above this comment. Not sure what direction you wanted to take with this. If you think this is alright to go forward, then I can open a PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] shukitchan commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
shukitchan commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1074868292


   Ok. I think I understand the change you want. It looks ok. 
   I can try to test out these changes before next week. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1069500337


   It sounds like your use case is different than ours. This might be something that could be included as a plugin parameter to decide whether to run the garbage collector on remap instance deletes.
   
   Let me explain why we added the garbage collector calls. Our current production config generates close to 1GB of Lua memory usage. With the current LuaJIT v2.1.0-beta3 that is standard, the memory usage is limited to 2GB on Linux. This means that we could not reload remap config more than once before running the garbage collector. If one is willing to take the latest non-tagged LuaJIT snapshot, the 2GB limit is gone. However, then you would eventually run into system memory limitations. We are very concerned about memory usage as we would prefer to use the RAM for caching. Is there a way to set and guarantee a maximum amount of time or memory usage threshold between garbage collector cycles? Otherwise, how would we engineer our network controller in terms of pushing config files and executing config reloads? Out of hundreds of caches maybe some have run GC and some have not.
   
   FYI, in looking through the LuaJIT source, there is a garbage collector option to stop garbage collection. It sounds like this might be useful to you since Lua deciding on its own to run the GC in your use case might be a time bomb.
   
   ```
   lua_gc(L, LUA_GCSTOP, 0);
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1074357326


   1. We don't use any __clean__() function currently. I suppose we can modify our scripts to add one. What happens is the last remap instance will not free since the file-global is not out of scope at the time. So in my testing of 8 x 100MB instances you will free 700MB and not the last instance.
   2. It seems like you would still need to reference count because the remap system assumes that each plugin instance ih is unique. If you have multiple plugin instances using the same ih then that would result in a double free as soon as you hit the same ih twice right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] pbchou commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
pbchou commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1081214675


   @shukitchan -- I rebased and squashed my branch into the new branch above. There are no other changes. If I am understanding you correctly, then I will go ahead and create a PR based on this new branch. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] ezelkow1 commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
ezelkow1 commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1068490741


   From the mailing list, I had a few questions since we've also been tracking a leak on 9.1. So for the instances you are seeing this is it only from global/file-global-local var usages? Or just lua reloads in general?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] shukitchan commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
shukitchan commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1068601940


   So you are saying that you want to reverse those two changes : i.e. calling garbage collector and clearing of the global registry in the lua states when deleting remap instances. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficserver] shukitchan commented on issue #8728: Lua Plugin memory leak on remap configuration reloads.

Posted by GitBox <gi...@apache.org>.
shukitchan commented on issue #8728:
URL: https://github.com/apache/trafficserver/issues/8728#issuecomment-1068809859


   explicitly calling garbage collector is expensive and takes a long time. In a situation where there are many lines in remap.config using the lua script, it can take forever and the ATS can eventually crash. 
   
   Why do you need to call the garbage collector? Would it be better to leave it for the lua to decide even to collect the garbage? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org