Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/09/14 09:28:21 UTC

[GitHub] [pulsar] hozumi opened a new issue #12036: Broker high cpu usage due to producer's batching misconfiguration.

hozumi opened a new issue #12036:
URL: https://github.com/apache/pulsar/issues/12036


   Hi,
   Several months ago I asked on the Pulsar Slack channel about high CPU usage on my brokers; I can no longer find that post.
   I just want to share that I solved the problem by fixing my producer's batching configuration.
   
   I thought I had already enabled batching, but I had actually set the following wrong configuration.
   
   1. A batch duration of 3000 microseconds instead of 3000 ms.
   
   ```
       .batchingMaxPublishDelay(3000, TimeUnit.MICROSECONDS)
   ```
   Yeah, this is a silly mistake.
   It should also be noted that the default value of batchingMaxPublishDelay is `1ms`, which I think will have no batching effect.
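
To make the unit mix-up concrete, here is a minimal sketch that uses only the JDK's `TimeUnit` to show what the two settings amount to. The class and method names (`BatchDelayCheck`, `toMillis`) are hypothetical helpers for illustration and are not part of the Pulsar client.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of Pulsar): converts a delay to whole
// milliseconds so the effective batching window is easy to eyeball.
final class BatchDelayCheck {
    static long toMillis(long value, TimeUnit unit) {
        return unit.toMillis(value);
    }

    public static void main(String[] args) {
        // The misconfigured value: 3000 microseconds is only 3 ms.
        System.out.println(toMillis(3000, TimeUnit.MICROSECONDS)); // prints 3
        // The intended value: 3000 milliseconds.
        System.out.println(toMillis(3000, TimeUnit.MILLISECONDS)); // prints 3000
    }
}
```

With the intended unit, the producer setting would read `.batchingMaxPublishDelay(3000, TimeUnit.MILLISECONDS)`.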
   
   2. Unnecessary KEY_BASED BatcherBuilder
   ```
      .batcherBuilder(BatcherBuilder.KEY_BASED)
   ```
   I somehow thought that `BatcherBuilder.KEY_BASED` was necessary in order to send messages with the same key to a particular partition.
   A batch made with KEY_BASED contains only messages with the same key, which resulted in a massive number of single-message batches in my use case.
    ```
    Key based batch message container
    incoming single messages:
       (k1, v1), (k2, v1), (k3, v1), (k1, v2), (k2, v2), (k3, v2), (k1, v3), (k2, v3), (k3, v3)
    batched into multiple batch messages:
       [(k1, v1), (k1, v2), (k1, v3)], [(k2, v1), (k2, v2), (k2, v3)], [(k3, v1), (k3, v2), (k3, v3)]
    ```
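
The effect on batch counts can be sketched with a small stand-alone simulation. This is not the Pulsar client's batching code; `BatchingSketch` and its methods are hypothetical, and the model assumes all messages arrive within a single publish-delay window.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model (an assumption, not Pulsar internals): compares how many
// batches one publish-delay window produces under each grouping strategy.
final class BatchingSketch {
    // Default-style batching: every message in the window shares one batch.
    static List<List<String>> batchTogether(List<String> keys) {
        List<List<String>> batches = new ArrayList<>();
        if (!keys.isEmpty()) {
            batches.add(new ArrayList<>(keys));
        }
        return batches;
    }

    // Key-based-style batching: one batch per distinct key in the window.
    static List<List<String>> batchByKey(List<String> keys) {
        Map<String, List<String>> byKey = new LinkedHashMap<>();
        for (String key : keys) {
            byKey.computeIfAbsent(key, k -> new ArrayList<>()).add(key);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        // Few distinct keys: key-based grouping still yields useful batches.
        List<String> fewKeys = List.of("k1", "k2", "k3", "k1", "k2", "k3", "k1", "k2", "k3");
        System.out.println(batchByKey(fewKeys).size()); // prints 3

        // Mostly unique keys: key-based grouping degenerates to 1-message batches.
        List<String> manyKeys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            manyKeys.add("user-" + i);
        }
        System.out.println(batchTogether(manyKeys).size()); // prints 1
        System.out.println(batchByKey(manyKeys).size());    // prints 1000
    }
}
```

When nearly every message in the window has its own key, the key-based grouping produces as many batches as messages, which matches the single-message batches described above.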
    
   Since the partitioned producer in the default routing mode already assigns a keyed message to a particular partition, I don't need `BatcherBuilder.KEY_BASED` for my use case.
   https://pulsar.apache.org/docs/en/admin-api-topics/#routing-mode
   > RoundRobinPartition
   > If a key is specified on the message, the partitioned producer hashes the key and assigns message to a particular partition. This is the default mode.
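
The routing rule quoted above can be illustrated with a simplified sketch. It assumes a plain `String.hashCode()` as a stand-in for the client's actual hashing scheme, and `KeyRoutingSketch` and `partitionFor` are hypothetical names. The point is only that the partition choice depends on the key and the partition count, not on the batcher.

```java
// Simplified illustration (an assumption, not the client's real hash):
// the same key always maps to the same partition index.
final class KeyRoutingSketch {
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int first = partitionFor("k1", 4);
        int second = partitionFor("k1", 4);
        System.out.println(first == second); // prints true
    }
}
```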
   
   For those who encounter a similar performance problem, I recommend checking the actual number of batched messages with CLI commands such as `examine-messages`, `peek-messages`, and `get-message-by-id`.
   You can see the number of batched messages as `X-Pulsar-num-batch-message`.
   
   ```
   $ docker exec -it pulsar_broker bin/pulsar-admin topics examine-messages --initialPosition latest "persistent://mytenant/mynamespace/mytopic-partition-0" | head
   Message ID: 4572594:27489
   Tenants:
   "X-Pulsar-batch-size    23678"
   "X-Pulsar-num-batch-message    48"
   ...
   $ docker exec -it pulsar_broker bin/pulsar-admin topics get-message-by-id --ledgerId 4572594 --entryId 27489 "persistent://mytenant/mynamespace/mytopic-partition-0"
   Batch Message ID: 4572594:27489:0
   Properties:
   "X-Pulsar-batch-size    23678"
   "X-Pulsar-num-batch-message    48"
   ...
   $ docker exec -it pulsar_broker bin/pulsar-admin topics peek-messages --subscription mysub1 --count 1 "persistent://mytenant/mynamespace/mytopic-partition-0" | head
   Batch Message ID: 4572594:33046:0
   Publish time: 1631608014336
   Event time: 0
   Properties:
   "X-Pulsar-batch-size    20086"
   "X-Pulsar-num-batch-message    43"
   ```
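
If you want to check this value from a script rather than by eye, a small parser along these lines could pull it out of the CLI output. It is a sketch that assumes the property line looks like the samples above (name, whitespace, then an integer); `BatchHeaderParser` is a hypothetical helper, not a Pulsar API.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the batched-message count from pulsar-admin
// output, assuming the line format shown in the samples above.
final class BatchHeaderParser {
    private static final Pattern NUM_BATCH =
            Pattern.compile("X-Pulsar-num-batch-message\\s+(\\d+)");

    static OptionalInt numBatchMessages(String cliOutput) {
        Matcher m = NUM_BATCH.matcher(cliOutput);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String sample = "\"X-Pulsar-batch-size    23678\"\n"
                + "\"X-Pulsar-num-batch-message    48\"\n";
        System.out.println(numBatchMessages(sample).getAsInt()); // prints 48
    }
}
```

An empty result means the property was not found in the given output; a value near 1 would suggest that batching is not taking effect.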


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] hozumi closed issue #12036: Broker high cpu usage due to producer's batching misconfiguration.

Posted by GitBox <gi...@apache.org>.
hozumi closed issue #12036:
URL: https://github.com/apache/pulsar/issues/12036


   

