Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/04 14:03:04 UTC

[GitHub] [pulsar] galrose opened a new issue #10136: Memory leak

galrose opened a new issue #10136:
URL: https://github.com/apache/pulsar/issues/10136


   **Describe the bug**
   I'm producing to a persistent topic in Pulsar with no subscriptions. 
   While producing, the direct memory increases every time there is disk latency from the bookie's storage. 
   I've stopped writing and the memory (in the broker) doesn't get released. It's been that way for a few days and I haven't found a solution. 
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Create a Pulsar cluster from the Ansible playbook (I tried both 2.7.0 and 2.7.1). 
   2. Create a persistent topic (QW == QA == 3). 
   3. Produce to the topic using pulsar-perf. 
   4. Wait for deletions or other actions that heavily affect the storage. 
   5. Stop producing; the direct memory doesn't get released. 
   
   **Expected behavior**
   I expect the broker to release the memory. 
   I don't understand why it increased every time there was throttling from the bookie in the first place. Even so, it should go back down once the broker successfully writes to all the bookies. 
   
   **Desktop (please complete the following information):**
    - OS: Centos 7.8
   
   **Additional context**
   I've been struggling with this for a while, and we're trying to use Pulsar as a central messaging platform for a large organization with multiple tenants. It would really help if there is a solution to this problem in the configuration or, if it's a bug, if someone who understands it could look at it. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] galrose closed issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
galrose closed issue #10136:
URL: https://github.com/apache/pulsar/issues/10136


   





[GitHub] [pulsar] lhotari commented on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829183581


   For example, in the k8s Helm chart deployment, an environment variable can be set for the broker by adding it under `broker.configData` in the `values.yaml` for the deployment.
   
   ```
   broker:
     configData:
       PULSAR_EXTRA_OPTS: -Dpulsar.allocator.leak_detection=Simple
   ```
   





[GitHub] [pulsar] lhotari edited a comment on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829159713


   For direct memory leaks, it's most likely about Netty pooled memory. For debugging the situation, it might be useful to check if there are log messages from the Netty leak detector.
   
   It turns out that the [default setting for Netty leak detection for Pulsar broker is disabled](https://github.com/apache/pulsar/blob/8ea4a39dc8bf6f2f23a160688bb70a80f6acfd4d/pulsar-common/src/main/java/org/apache/pulsar/common/allocator/PulsarByteBufAllocator.java#L60-L61).
   
   Please set `-Dpulsar.allocator.leak_detection=Simple` in the `PULSAR_EXTRA_OPTS` environment variable to enable simple detection. After that, Netty leaks will be reported to the log with this type of log message: 
   
   https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/ResourceLeakDetector.java#L318-L336
   
   The possible leak detection log entries can be found by grepping the logs for "LEAK: ".
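
As a rough illustration of the grep step, here is a minimal Python sketch that scans log lines for the "LEAK: " marker. The function names and the idea of passing in a broker log path are assumptions made for this example; they are not part of Pulsar or Netty.

```python
from typing import Iterable, List, Tuple

# Marker text mentioned above; Netty's leak detector prefixes its reports with it.
LEAK_MARKER = "LEAK: "


def find_leak_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains the leak marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if LEAK_MARKER in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path: str) -> List[Tuple[int, str]]:
    """Scan one log file; `path` is a hypothetical broker log location."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_leak_lines(handle)
```

Calling `scan_log_file` with the path of a broker log returns the matching line numbers and text. An empty result only means no leak was reported while detection was enabled.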
   
   
   





[GitHub] [pulsar] galrose commented on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
galrose commented on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829159281


   No, because maxMessageBufferSizeInMb is just the buffer for messages. Even after there are no producers and all the data is in the bookie, the memory stays the same instead of going back down. 








[GitHub] [pulsar] galrose commented on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
galrose commented on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829160476


   Thanks, I'll try to do that :) 





[GitHub] [pulsar] lhotari commented on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-830023071


   Since there's also a possibility that the reported problem could be related to Netty Recycler usage, I'd recommend doing the testing with Netty Recycler completely turned off. 
   
   You can find positive reports about disabling the Netty Recycler at https://github.com/netty/netty/pull/5968#issuecomment-399578989 and https://github.com/netty/netty/pull/5968#issuecomment-352823429 . Elasticsearch [also disabled Netty Recycler completely](https://github.com/elastic/elasticsearch/blob/5baabff6670a8ed49297488ca8cac8ec12a2078d/distribution/tools/launchers/src/main/java/org/elasticsearch/tools/launchers/SystemJvmOptions.java#L50) because of the problems it causes. 
   
   Here's an example of the recommended options to set in `PULSAR_EXTRA_OPTS`:
   ```
   broker:
     configData:
       PULSAR_EXTRA_OPTS: -Dio.netty.recycler.maxCapacityPerThread=0 -Dpulsar.allocator.leak_detection=Simple -Dpulsar.allocator.exit_on_oom=true
   ```
   
   It would be very useful if you could also share the possible impact on performance (throughput, latency, resource utilization) after disabling the Netty Recycler with the setting described above.





[GitHub] [pulsar] lhotari edited a comment on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829183581


   For example, in the k8s Helm chart deployment, an environment variable can be set for the broker by adding it under `broker.configData` in the `values.yaml` for the deployment.
   
   ```
   broker:
     configData:
       PULSAR_EXTRA_OPTS: -Dpulsar.allocator.leak_detection=Simple -Dpulsar.allocator.exit_on_oom=true -Dio.netty.recycler.maxCapacity.default=1000 -Dio.netty.recycler.linkCapacity=1024
   ```
   
   `-Dpulsar.allocator.exit_on_oom=true -Dio.netty.recycler.maxCapacity.default=1000 -Dio.netty.recycler.linkCapacity=1024` are the defaults in `PULSAR_EXTRA_OPTS` which would get overridden unless they are also specified in the new value.
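
Since a custom `PULSAR_EXTRA_OPTS` value replaces the defaults wholesale, one way to avoid dropping them by accident is to merge the option strings before writing the value. The following is a minimal sketch; `DEFAULT_OPTS` and `merge_jvm_opts` are hypothetical names for this example, and the default string is copied from the comment above rather than read from a real deployment.

```python
from typing import Dict

# Defaults as quoted in the comment above (an assumption for any other version).
DEFAULT_OPTS = (
    "-Dpulsar.allocator.exit_on_oom=true "
    "-Dio.netty.recycler.maxCapacity.default=1000 "
    "-Dio.netty.recycler.linkCapacity=1024"
)


def merge_jvm_opts(defaults: str, extra: str) -> str:
    """Merge two space-separated option strings; for the same key, `extra` wins."""
    merged: Dict[str, str] = {}
    for token in defaults.split() + extra.split():
        key = token.split("=", 1)[0]  # e.g. "-Dio.netty.recycler.linkCapacity"
        merged[key] = token  # overwriting keeps the key's original position
    return " ".join(merged.values())
```

For instance, `merge_jvm_opts(DEFAULT_OPTS, "-Dpulsar.allocator.leak_detection=Simple")` yields the three defaults followed by the leak detection flag, which can then be pasted into `values.yaml`. The sketch assumes that no option value contains spaces.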
   








[GitHub] [pulsar] lhotari edited a comment on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829159713


   For direct memory leaks, it's most likely about Netty pooled memory. For debugging the situation, it might be useful to check if there are log messages from the Netty leak detector.
   
   It turns out that the [default setting for Netty leak detection for Pulsar broker is 'Disabled'](https://github.com/apache/pulsar/blob/8ea4a39dc8bf6f2f23a160688bb70a80f6acfd4d/pulsar-common/src/main/java/org/apache/pulsar/common/allocator/PulsarByteBufAllocator.java#L60-L61).
   
   Please set `-Dpulsar.allocator.leak_detection=Simple` in the `PULSAR_EXTRA_OPTS` environment variable to enable simple detection. After that, Netty leaks will be reported to the log with this type of log message: 
   
   https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/ResourceLeakDetector.java#L318-L336
   
   The possible leak detection log entries can be found by grepping the logs for "LEAK: ".
   
   
   





[GitHub] [pulsar] lhotari commented on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829159713


   For direct memory leaks, it's most likely about Netty pooled memory. For debugging the situation, it might be useful to check if there are log messages from the Netty leak detector.
   
   The default setting for Netty leak detection for Pulsar broker is [SIMPLE](https://netty.io/4.1/api/io/netty/util/ResourceLeakDetector.Level.html#SIMPLE).
   
   @galrose, can you check whether you find this type of log message: 
   
   https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/ResourceLeakDetector.java#L318-L336
   
   Please grep the logs for "LEAK: ".
   
   





[GitHub] [pulsar] lhotari commented on issue #10136: Memory leak

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #10136:
URL: https://github.com/apache/pulsar/issues/10136#issuecomment-829153804


   Is this a duplicate of #9562?

