Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/06/09 16:14:06 UTC

[GitHub] [pulsar] lukestephenson opened a new issue #7223: Pulsar Broker is running compaction on multiple partitions in parallel.

lukestephenson opened a new issue #7223:
URL: https://github.com/apache/pulsar/issues/7223


   **Describe the bug**
   A single Pulsar broker should compact only one partition at a time. Otherwise, compacting multiple partitions in parallel increases memory usage, and with it the chance of OutOfMemoryErrors.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Create a topic with many partitions
   2. Publish lots of data to all of the partitions
   3. Use `pulsar-admin` to set a compaction threshold that would trigger compaction for the partitions published to above: `bin/pulsar-admin namespaces set-compaction-threshold --threshold <some threshold> <tenant/namespace>`
   4. Check the logs. Notice how, on one broker, compaction is triggered for multiple partitions within the same second (in the screenshot, these logs are all for broker 1).
   ![image](https://user-images.githubusercontent.com/395523/84141306-52408d80-aa96-11ea-9fa8-2eeea2621876.png)
   
   
   **Expected behavior**
   Compaction should run for only one partition at a time on a given broker. The source code suggests to me that this was the intended behaviour, but it does not appear to work that way.
   
   **Screenshots**
   Attached logs
   
   **Additional context**
   Raised on Slack initially: https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1591243377186900
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on issue #7223: Pulsar Broker is running compaction on multiple partitions in parallel.

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #7223:
URL: https://github.com/apache/pulsar/issues/7223#issuecomment-641672450


   @lukestephenson I think we should add throttling logic so that people can configure the parallelism for compaction. This would let them balance resource usage against the speed of compaction.
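
One possible shape for such throttling is sketched below. This is an illustration only, not Pulsar's actual implementation: `CompactionThrottle`, `maxParallel`, and `submit` are hypothetical names, and the sketch assumes each compaction is started by a function that returns a `CompletableFuture`. Requests beyond the configured limit wait in a queue instead of starting immediately.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Pulsar class): caps how many asynchronous
 * compactions are in flight at once and queues the rest.
 */
final class CompactionThrottle {
    private final int maxParallel;                       // assumed to come from broker config
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private int running = 0;                             // guarded by "this"

    CompactionThrottle(int maxParallel) {
        if (maxParallel < 1) {
            throw new IllegalArgumentException("maxParallel must be >= 1");
        }
        this.maxParallel = maxParallel;
    }

    /** Queues a compaction; it starts once fewer than maxParallel are running. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> compaction) {
        CompletableFuture<T> result = new CompletableFuture<>();
        synchronized (this) {
            pending.add(() -> start(compaction, result));
        }
        drain();
        return result;
    }

    /** Starts queued compactions while there is spare capacity. */
    private void drain() {
        while (true) {
            Runnable next;
            synchronized (this) {
                if (running >= maxParallel || pending.isEmpty()) {
                    return;
                }
                running++;
                next = pending.poll();
            }
            next.run(); // run outside the lock so the task cannot block other callers
        }
    }

    private <T> void start(Supplier<CompletableFuture<T>> compaction,
                           CompletableFuture<T> result) {
        CompletableFuture<T> inFlight;
        try {
            inFlight = compaction.get();
        } catch (Throwable t) {
            // Treat a failure to start like a failed compaction so the slot is freed.
            inFlight = new CompletableFuture<>();
            inFlight.completeExceptionally(t);
        }
        inFlight.whenComplete((value, error) -> {
            synchronized (this) {
                running--;
            }
            if (error != null) {
                result.completeExceptionally(error);
            } else {
                result.complete(value);
            }
            drain(); // a slot is free, so start the next queued compaction if any
        });
    }
}
```

With `maxParallel` set to 1 this gives the one-partition-at-a-time behaviour the issue expects; a larger value trades memory for compaction speed. Note that compactions which complete synchronously make `drain()` recurse, so a production version would likely hand the next task to an executor instead.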

