Posted to dev@pulsar.apache.org by "Penghui Li (Jira)" <ji...@apache.org> on 2021/01/26 05:54:00 UTC

[jira] [Updated] (PULSAR-10) Improve the message backlogs for the topic

     [ https://issues.apache.org/jira/browse/PULSAR-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Penghui Li updated PULSAR-10:
-----------------------------
    Summary: Improve the message backlogs for the topic  (was: Improve the message backlogs)

> Improve the message backlogs for the topic
> ------------------------------------------
>
>                 Key: PULSAR-10
>                 URL: https://issues.apache.org/jira/browse/PULSAR-10
>             Project: Pulsar
>          Issue Type: Improvement
>            Reporter: Penghui Li
>            Priority: Major
>              Labels: gsoc2021
>
> In Pulsar, the client usually sends several messages in a batch. On the broker side, the broker receives the batch and writes the batch message to the storage layer.
> The message backlog tracks how many messages remain to be handled for a subscription. Unfortunately, the current backlog counts batches, not messages. This confuses users: after publishing 1000 messages to a topic, checking the backlog on the subscription side returns a value lower than 1000, such as 100 batches. A message-based backlog is not available because it is too expensive to calculate the number of messages in each batch.
>  
> PIP-70 [https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata] introduced broker-level entry metadata that can support a message index (or message offset) for a topic. This makes it possible to calculate the number of messages between one message index and another, so we can leverage PIP-70 to improve the backlog implementation and obtain a message-based backlog.
>  
> For Exclusive and Failover subscriptions, this is easy to implement by calculating the number of messages between the mark-delete position and the LAC (last add confirmed) position. For Shared and Key_Shared subscriptions, individual acknowledgments add some complexity. We can cache the individual acknowledgment count in broker memory, so the message backlog for Shared and Key_Shared subscriptions is `backlogOfTheMarkdeletePosition` - `IndividualAckCount`.
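
The arithmetic described above can be sketched as follows. This is only an illustration, not Pulsar's implementation: the class and method names are hypothetical, and it assumes each stored entry carries the index of its last message, so the difference between two indexes gives a message count. It also assumes `individualAckCount` only counts messages acknowledged after the mark-delete position.

```java
/**
 * Hypothetical sketch of a message-based backlog calculation.
 * None of these names are Pulsar APIs; they only illustrate the arithmetic.
 */
public class MessageBacklogSketch {

    /**
     * Exclusive / Failover: messages between the mark-delete position and the
     * last confirmed (LAC) position, expressed as message indexes.
     *
     * @param markDeleteIndex    index of the last message covered by the
     *                           mark-delete position (-1 if nothing is acked)
     * @param lastConfirmedIndex index of the last message in the LAC entry
     *                           (-1 if the topic is empty)
     */
    public static long backlogOfMarkDeletePosition(long markDeleteIndex,
                                                   long lastConfirmedIndex) {
        return Math.max(0L, lastConfirmedIndex - markDeleteIndex);
    }

    /**
     * Shared / Key_Shared: subtract the messages that were acknowledged
     * individually beyond the mark-delete position.
     */
    public static long sharedBacklog(long markDeleteIndex,
                                     long lastConfirmedIndex,
                                     long individualAckCount) {
        long base = backlogOfMarkDeletePosition(markDeleteIndex, lastConfirmedIndex);
        return Math.max(0L, base - individualAckCount);
    }

    public static void main(String[] args) {
        // 1000 messages published as 100 batches of 10: indexes 0..999.
        // Nothing acknowledged yet, so the mark-delete index is -1.
        System.out.println(backlogOfMarkDeletePosition(-1, 999)); // 1000

        // Same topic, Shared subscription, 30 messages acked individually.
        System.out.println(sharedBacklog(-1, 999, 30)); // 970
    }
}
```

With index-based counting, the 1000-message example from the description reports 1000 rather than 100 batches, and the Shared case only needs one cached counter per subscription.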



--
This message was sent by Atlassian Jira
(v8.3.4#803005)