You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "LakshSingla (via GitHub)" <gi...@apache.org> on 2023/03/03 06:30:28 UTC

[GitHub] [druid] LakshSingla commented on pull request #13755: Add a readOnly() method for PartitionedOutputChannel

LakshSingla commented on PR #13755:
URL: https://github.com/apache/druid/pull/13755#issuecomment-1453045496

   After debugging, it was found that there can be the following places of memory leaks (before this patch):
   1. The SuperSorter holds the PartitionedOutputChannels' map which contains the writableChannel and the memory allocator, which are not useful once the channel has been written to. Therefore while storing the PartitionedOutputChannels, we convert them to readOnly() which throws away the reference to the writableChannel and the memory allocator allowing them to be GCed and thereby reducing the footprint of the SuperSorter. Attached below is a heap dump before the PR
   <img width="764" alt="Screenshot 2023-03-03 at 11 52 00 AM" src="https://user-images.githubusercontent.com/30999375/222646939-d73a486f-05c2-4391-8290-f4833ffa8421.png">
   
   2. The output channels created by `ComposingOutputChannelFactory` contain a collection of writableChannels and the readableChannelSuppliers (from which it is composed off). Even if we throw away the reference to the frame memory allocators of the original channel, the readableChannelSuppliers still hold the reference to the memory allocators of the individual output channels. To alleviate this, while building the readableChannelSuppliers in the ComposingOutputChannelFactory, we only get the readOnly() version of the output channel.
   
   3. Another potential memory hog is when the output channels created by the `ComposingOutputChannelFactory` contain more than one outputChannel. Once an output channel is exhausted in the composition, we move on to the next output channel and never write to the older one again. However, we still hold the reference to the memory allocator of the older channels because the code assumes that we can write on it again (till the composition itself is marked as readOnly()). Therefore multiple memory allocators can be held up for a composition, even though we require one at a time. While the future ones are created lazily, the older ones are not closed, and this PR addresses that as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org