Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/06/16 01:50:07 UTC

[GitHub] [incubator-druid] himanshug edited a comment on issue #7824: Kafka index service use a lot of direct memory during segment publish

URL: https://github.com/apache/incubator-druid/issues/7824#issuecomment-502412342
 
 
   Unfortunately, it is very hard to offer concrete advice on these things, but I can give some general points for you to investigate further.
   
   > OS killed this task
   
   I am assuming that means the Linux OOM killer killed the process. If so, the cause is probably not the memory used to load segments into the page cache, as that does not count towards anonymous memory usage. That means the process really did allocate a lot of direct memory through the buffers you mentioned.
   There could be a genuine memory leak there. You can check by triggering a manual GC (e.g. using "jcmd <pid> GC.run") at a time of high direct memory use. If the GC frees a large amount of direct memory, there is almost certainly a leak, and we are not closing/cleaning discarded off-heap buffers. If you find that, you can try to identify which buffers we are not cleaning up and/or create another ticket with your task log and analysis; someone else from the community might be able to look into it.
   If GC does not free a significant amount of direct memory, then all of that memory is really in use and the code needs optimization to reduce its memory usage. My suspicion is that most of those off-heap buffers are being used to merge the intermediate persisted segments. Playing with some of the configuration described in https://druid.apache.org/docs/latest/ingestion/native_tasks.html#tuningconfig might increase or decrease off-heap memory usage (especially maxRowsInMemory, maxBytesInMemory, and maxTotalRows; try reducing maxTotalRows).
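
   To make the before/after GC comparison concrete, here is a minimal sketch of reading the JVM's direct buffer pool usage through the standard BufferPoolMXBean. It is an illustration only, not Druid code: the class name DirectMemoryProbe and the helper directMemoryUsed() are hypothetical, and it measures the JVM it runs in, so for a running task you would need to read the same MXBean over a JMX connection instead.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Druid: reports direct buffer pool usage of the current JVM.
final class DirectMemoryProbe {

    // Bytes currently reserved by the JVM's "direct" buffer pool, or -1 if that pool is not exposed.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = directMemoryUsed();
        System.gc();          // explicit GC request, as "jcmd <pid> GC.run" does; the JVM may ignore it
        Thread.sleep(1000);   // arbitrary pause so cleaned buffers have time to be released
        long after = directMemoryUsed();
        System.out.printf("direct memory before GC: %d bytes, after GC: %d bytes%n", before, after);
    }
}
```

   A large drop between the two numbers would point towards unreferenced buffers that were never explicitly freed; little or no drop would suggest the memory is genuinely in use.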
   
   Check whether segment handoffs are working fine; you can add the "pushTimeout" setting from the page above if need be.
   
   You might be hitting the issue described in https://github.com/apache/incubator-druid/pull/6699; if you can build Druid with that patch applied, try that.
   
   That said, please attach the task log of the process that ends up dead due to too much memory usage. I will take a quick look to see if anything obvious shows up.
   
   HTH.
