Posted to dev@pulsar.apache.org by Asaf Mesika <as...@gmail.com> on 2023/03/01 15:39:02 UTC

Re: [Discuss] PIP-248: Add backlog eviction metric

>
> Pulsar has 2 configurations for the backlog eviction
> <https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas>
> : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> By default, backlog eviction is disabled, and also, there is a field named
> backlogQuotaMap in TopicPolicies
> <https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45>
> /NamespaceSpacePolicies
> <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41> assists
> in controlling Topic/Namespace level backlog quota.
>
> If topic backlog reaches the threshold of any item, backlog eviction will
> be triggered, Pulsar will move subscription's cursor to skip unacknowledged
> messages.
>
> Before backlog eviction happens, we don't have a metric to monitor how
> long that it can reaches the threshold.
>

I think you should fix this explanation:

In Pulsar, a subscription tracks which messages have been acknowledged. A
subscription backlog is the set of messages which are unacknowledged.
A subscription backlog size is the sum of the sizes of the unacknowledged
messages (in bytes).
A topic can have many subscriptions.
A topic backlog is defined as the backlog size of the subscription which
has the oldest unacknowledged message. Since acknowledged messages can be
interleaved with unacknowledged messages, calculating the exact size of
that subscription's backlog can be expensive, as it requires I/O operations
to read the messages from the ledgers.
For that reason, the topic backlog is actually defined to be the estimated
backlog size of that subscription. The estimate is the sum of the sizes of
all the ledgers, starting from the current active one, up to the ledger
which contains the oldest unacknowledged message (There is actually a
faster way to calculate it, but this is the definition of the estimation).
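
To make the estimate concrete, here is a minimal Python sketch. The `Ledger`
record, its fields, and the sample sizes are hypothetical stand-ins, not
Pulsar's actual data structures; the sketch only illustrates summing ledger
sizes from the ledger that holds the oldest unacknowledged message up to the
current active one.

```python
from dataclasses import dataclass


@dataclass
class Ledger:
    ledger_id: int   # hypothetical: ledgers are assumed to be ordered by id
    size_bytes: int  # total size of all entries stored in the ledger


def estimated_backlog_size(ledgers, oldest_unacked_ledger_id):
    """Sum the sizes of all ledgers from the one containing the oldest
    unacknowledged message up to the newest (active) ledger.

    Acknowledged messages inside those ledgers are still counted, which is
    why this is an estimate: no entry is read, so no I/O is needed.
    """
    return sum(ledger.size_bytes for ledger in ledgers
               if ledger.ledger_id >= oldest_unacked_ledger_id)


ledgers = [Ledger(1, 100), Ledger(2, 250), Ledger(3, 400), Ledger(4, 50)]
backlog = estimated_backlog_size(ledgers, oldest_unacked_ledger_id=2)
```

Ledger 1 is fully acknowledged in this example, so it is left out of the sum.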

A topic backlog age is the age of the oldest unacknowledged message (in any
subscription). If that message was written 30 minutes ago, its age is 30
minutes.

Pulsar has a feature called backlog quota (place link). It allows the user
to define a quota - in effect, a limit - which limits the topic backlog.
There are two types of quotas:
* Size based: The limit is for the topic backlog size (as we defined above).
* Time based: The limit is for the topic's backlog age (as we defined
above).

Once a topic backlog exceeds either one of those limits, one of the
following actions is taken:
* The producer write is placed on hold for a certain amount of time before
failing.
* The producer write is failed.
* The subscriptions' oldest unacknowledged messages are acknowledged in
order until both the topic backlog size and age fall within the limit
(quota). This process is called backlog eviction (it runs periodically,
once per check interval).

The quotas can be defined as a default value for any topic, by using the
following broker configuration keys: backlogQuotaDefaultLimitBytes,
backlogQuotaDefaultLimitSecond. They can also be specified directly for all
topics in a given namespace using the namespace policy, or for a specific
topic using a topic policy.

The user today can calculate the quota used for the size-based limit, since
two metrics are already exposed at the topic level:
"pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You
can just divide the latter by the former (and multiply by 100) to get a
percentage.
For the time-based limit, the only metric exposed today is the quota itself,
"pulsar_storage_backlog_quota_limit_time".
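
As a rough illustration of that division, here is a small Python sketch. The
function name and the sample values are made up for the example; treating a
missing or non-positive limit as "no quota" is an assumption, not documented
Pulsar behaviour.

```python
def quota_used_percentage(used, limit):
    """Return used/limit as a percentage, or None when no quota applies.

    Assumption: a missing or non-positive limit means the quota is disabled.
    """
    if limit is None or limit <= 0:
        return None
    return 100.0 * used / limit


# Hypothetical samples of pulsar_storage_backlog_size and
# pulsar_storage_backlog_quota_limit for one topic.
size_pct = quota_used_percentage(used=750, limit=1000)
```

The same function would work for the time-based limit once the backlog age is
exposed: pass the age as `used` and the time limit as `limit`.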

------------

I would create two metrics:

`pulsar_backlog_size_quota_used_percentage`
`pulsar_backlog_time_quota_used_percentage`

You would like to know what triggered the alert, hence two.
It's not the quota percentage, it's the quota used percentage.

----------

> It checks if the backlog size exceeds the threshold(
> backlogQuotaDefaultLimitBytes), and it gets the current backlog size by
> calculating LedgerInfo
> <https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54>,
> it will not lead to I/O.

This is not correct.
It checks against the topic / namespace policy, and if it doesn't exist, it
falls back on the default configuration key mentioned above.
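
A minimal sketch of that lookup order, in Python. The function and parameter
names are hypothetical, and the assumption that a topic policy takes
precedence over a namespace policy is mine; the point is only that the broker
default is the last resort.

```python
def resolve_backlog_quota(topic_policy, namespace_policy, broker_default):
    """Return the first quota that is set, checking the topic policy, then
    the namespace policy, then the broker-level default."""
    for candidate in (topic_policy, namespace_policy, broker_default):
        if candidate is not None:
            return candidate
    return None  # nothing configured at any level


# No topic policy here, so the namespace policy wins over the default.
limit_bytes = resolve_backlog_quota(None, 500, 1000)
```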

> It checks if the backlog time exceeds the threshold(
> backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck is
> set to be true, it will read an entry from Bookkeeper, but the default
> value is false, which means it gets the backlog time by calculating
> LedgerInfo
> <https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54>.
> So in general, we don't need to worry about it will lead to I/O.


I'm afraid of that.
Today the quota is checked periodically, right? So that's how the operator
knows the cost in terms of I/O is limited.
Now you are adding one additional I/O per metrics collection, which happens
every 1 min by default. That is perhaps a lot. How long is the check
interval today?

Perhaps the backlog quota check can persist its result, and the metric can
use it? Persist the age, that is.
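
Something along these lines, as a Python sketch. The class and method names
are hypothetical and do not correspond to existing broker code. One
deliberate variation: it caches the publish timestamp rather than the age, so
the reported age keeps growing between two quota checks.

```python
import time


class BacklogQuotaCheckCache:
    """Hypothetical holder for the result of the periodic quota check."""

    def __init__(self):
        self._oldest_publish_time = None  # seconds since the epoch

    def record_check(self, oldest_publish_time):
        # Called by the periodic backlog quota check, which has already paid
        # whatever I/O was needed to find the oldest unacknowledged message.
        self._oldest_publish_time = oldest_publish_time

    def backlog_age_seconds(self, now=None):
        # Called on every metrics collection; reads only the cached value,
        # so it adds no I/O.
        if self._oldest_publish_time is None:
            return None
        now = time.time() if now is None else now
        return max(0.0, now - self._oldest_publish_time)


cache = BacklogQuotaCheckCache()
cache.record_check(oldest_publish_time=1_000.0)
age = cache.backlog_age_seconds(now=2_800.0)
```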


------

Regarding "slowest_subscription":
I think the cost is too high, because the subscriptions will keep
alternating, which can generate many unique time series. Since
Prometheus, like any other TSDB, flushes only every 2 hours, it will cost
you too much.

I suggest exposing the name via the topic stats. This way they can issue a
REST call to grab that subscription name only when the alert fires.

Thanks,

Asaf





On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:

> Hi Asaf,
> I've updated the PIP, PTAL
>
> Thank,
> Tao Jiuming
>
> On Sun, Feb 26, 2023 at 23:03 Asaf Mesika <as...@gmail.com> wrote:
>
> > Hi,
> >
> > > Pulsar has 2 configurations for the backlog eviction:
> > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond, if
> > > topic backlog reaches the threshold of any item, backlog eviction will
> > > be triggered.
> >
> > This seems like default values, not the actual values. Can you please
> > provide an explanation in the PIP and link to read more:
> > 1. Where do you define the backlog quota exactly? What is the granularity
> > (subscription?)
> > 2. Is the backlog quota on by default? If so, what are the default
> > values?
> >
> >
> >
> > *Notes*
> > 1. When the backlog quota limit is defined in Bytes, and you wish to know
> > how close a subscription is to its bytes limit, you need to calculate the
> > backlog size in bytes. From my understanding, there is an accurate
> > calculation (which is costly in terms of I/O) and there is an estimate of
> > it. I presume you would want to use the estimated one, is that correct?
> > The backlog quota itself, uses the accurate or the estimated when it
> > starts evicting entries (i.e. marking them as acknowledged)?
> >
> > 2. For the backlog limit specifying in time units, there is no estimate,
> > as it must be calculated all the time (earliest unacknowledged message
> > distance from now). How do you plan to calculate the current age of the
> > earliest message without bearing that I/O cost on each metric
> > calculation?
> >
> > 3. In the Goal section, you specify that your goal is to add a
> > "proximity" metric.
> > a) You must define that - what is proximity metric exactly? What are its
> > units? How are you planning to calculate it?
> > b) Proximity is not a good term IMO. I personally have never seen this
> > term used in software systems, unless it's in the aviation/space industry.
> > Once you explain (a) I hope I can help provide alternative names.
> >
> > 4. Maybe we should provide the used quota percentage for both limits,
> > instead of one per both, since it's easier to act upon the alert when you
> > need which one triggered it.
> >
> > 5. I didn't understand the "slowest_subscription" label used when
> > describing the metric label. Can you please provide an explanation?
> >
> > 6. I suggest writing a "High Level Design" section, and add everything
> > you need to know for this proposal, so I don't need to read the
> > implementation details below (code).
> >
> > Thanks,
> >
> > Asaf
> >
> >
> > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > >
> > > ### Motivation:
> > >
> > > Pulsar has 2 configurations for the backlog eviction:
> > > `backlogQuotaDefaultLimitBytes` and `backlogQuotaDefaultLimitSecond`,
> > > if topic backlog reaches the threshold of any item, backlog eviction
> > > will be triggered.
> > >
> > > Before backlog eviction happens, we don't have a metric to monitor how
> > > long that it can reaches the threshold.
> > >
> > > We can provide a progress bar metric to tell users some topics is about
> > > to trigger backlog eviction. And users can subscribe the alert to
> > > schedule consumers.
> > >
> > > For more details, please read the PIP at
> > > https://github.com/apache/pulsar/issues/19601
> > >
> > > Thanks,
> > > Tao Jiuming
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by 太上玄元道君 <da...@apache.org>.
> Perhaps clarify, they can call:
> getStats(subscriptionBacklogSize=true, getPreciseBacklog=false,
> getEarliestTimeInBacklog=false)

> Each subscription will contain backlogSize.
> The subscription with max backlogSize will also be the one with oldest
> age.

Yes, "The subscription with max backlogSize will also be the one with
oldest age."
But we don't expose the backlog message age. Consider, for example, users
who just want to clear the backlog by using `backlogQuotaTime`.

However, it doesn't matter. It is just a way for users to troubleshoot the
backlog and does not affect the implementation of this PIP.
The PIP just gives a use case, which users can adapt to their actual
situation.

Thanks,
Tao Jiuming

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by Asaf Mesika <as...@gmail.com>.
Ok.


>    1. Find the backlog subscriptions
>    After received the alarm, users could request Topics#getStats(topicName,
>    true/false, true, true)
>    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139> to
>    get the topic stats, and find which subscriptions are in backlog.
>    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in the
>    subscription level, and we will expose backlogQuotaSizeBytes and
>    backlogQuotaTimeSeconds in the topic level, so users could find which
>    subscriptions in backlog easily.
>
>
>

Perhaps clarify, they can call:
getStats(subscriptionBacklogSize=true, getPreciseBacklog=false,
getEarliestTimeInBacklog=false)

Each subscription will contain backlogSize.
The subscription with max backlogSize will also be the one with oldest age.
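
For illustration, picking that subscription out of the returned stats could
look like the Python sketch below. The dictionary is a hand-written stand-in
loosely modelled on the topic stats output; the subscription names and
numbers are invented, and the exact field layout is an assumption.

```python
def slowest_subscription(subscriptions):
    """Return the name of the subscription with the largest backlogSize,
    or None when the topic has no subscriptions.

    `subscriptions` maps a subscription name to its stats dictionary.
    """
    if not subscriptions:
        return None
    return max(subscriptions,
               key=lambda name: subscriptions[name]["backlogSize"])


# Hypothetical, trimmed-down topic stats.
topic_stats = {
    "subscriptions": {
        "billing": {"backlogSize": 1_200},
        "audit": {"backlogSize": 98_000},
        "search": {"backlogSize": 0},
    }
}
name = slowest_subscription(topic_stats["subscriptions"])
```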

On Tue, Mar 14, 2023 at 6:04 PM 太上玄元道君 <da...@apache.org> wrote:

> > Need to replace (place link) with link.
>
> I replaced the `Motivation` with your advice.
>
> > We discussed adding the subscription name which triggered the time limit
> > to Topics.getStats().
> > Why?
>
> Since we have `pulsar_storage_backlog_eviction_count`,
> I think we don't need to expose the subscription name which triggered the
> backlog eviction.
>
> > I have to run getStats(getEarliestTimeInBacklog=true) and it's way more
> > expensive than the proposal above, since it needs to reach the earliest
> > message for *each* subscription.
>
> I don't think we need to save these expenses, it is only triggered when the
> user requests.
> If the user does not set `getEarliestTimeInBacklog` to true, there will be
> no such overhead.
> We don't need to add complexity for very few calls
>
> > Also a bit less accurate - you want to get the subscription cached that
> > triggered it, using the same number to find it. Earliest backlog is
> > accurate but if the configuration flag is off, it's not the same number
> > as getStats.
>
> Such problems do exist. Maybe there are many backlogs when the user
> receives the alert,
> but the backlogs have been reduced when the endpoint(Topics#getStats) is
> requested.
> There is a time difference between them. However, when the user receives an
> alarm, it is only a notification.
> When the user requests the endpoint, they may take action.
> I think it is reasonable to provide users with a more accurate backlog
> before they act.
>
> Thanks,
> Tao Jiuming
>
> On Tue, Mar 14, 2023 at 16:51 Asaf Mesika <as...@gmail.com> wrote:
>
> > >
> > > Pulsar has a feature called backlog quota (place link)
> >
> > Need to replace (place link) with link.
> >
> >
> >
> > >    1. Find the backlog subscriptions
> > >    After received the alarm, users could request
> > Topics#getStats(topicName,
> > >    true/false, true, true)
> > >    <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> >
> > to
> > >    get the topic stats, and find which subscriptions are in backlog.
> > >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in
> the
> > >    subscription level, and we will expose backlogQuotaSizeBytes and
> > >    backlogQuotaTimeSeconds in the topic level, so users could find
> which
> > >    subscriptions in backlog easily.
> > >
> > > We have forgotten the other comment.
> > We discussed adding the subscription name which triggered the time limit
> to
> > Topics.getStats().
> > Why?
> >
> > I have to run getStats(getEarliestTimeInBacklog=true) and it's way more
> > expensive than the proposal above, since it needs to reach the earliest
> > message for *each* subscription.
> > Also a bit less accurate - you want to get the subscription cached that
> > triggered it, using the same number to find it. Earliest backlog is
> > accurate but if the configuration flag is off, it's not the same number
> as
> > getStats.
> >
> >
> > Nice to have (not mandatory) additions:
> >
> > I would add before
> >
> > >
> > >    1. After readEntryComplete
> > >    <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L2780
> > >,
> > >    cache its result:
> > >
> > > When this configuration flag is set to true, the broker does an I/O
> call
> > by reading the oldest entry to get its write timestamp. Once we have
> that,
> > we'll add caching to that value since we're going to use it for returning
> > the age.
> >
> > I would add before:
> >
> > > slowestReaderTimeBasedBacklogQuotaCheck
> > > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L2817
> >
> > is
> > > a totally in-memory method, we just need to cache the
> > >
> >
> > When this configuration flag is set to false, the check uses an estimate
> of
> > the oldest entry timestamp, by taking the closing time of the ledger
> which
> > the message is contained at.
> >
> > On Fri, Mar 10, 2023 at 8:29 AM 太上玄元道君 <da...@apache.org> wrote:
> >
> > > I think yes, to avoid missing something, you can take a look if you
> have
> > > time.
> > >
> > > Thanks,
> > > Tao Jiuming
> > >
> > > On Thu, Mar 9, 2023 at 17:40 Asaf Mesika <as...@gmail.com> wrote:
> > >
> > > > Is the PIP updated with all comments?
> > > >
> > > > On Thu, Mar 9, 2023 at 8:59 AM 太上玄元道君 <da...@apache.org> wrote:
> > > >
> > > > > > backlogQuotaLimitSize
> > > > > > should be `backlogQuotaSizeBytes`
> > > > >
> > > > > > backlogQuotaLimitTime
> > > > > > should be `backlogQuotaTimeSeconds`
> > > > >
> > > > > > So you need to rename the metric.
> > > > > > "pulsar_storage_backlog_quota_count" -->
> > > > > > `pulsar_storage_backlog_eviction_count`
> > > > >
> > > > > > the topic's existing subscription.
> > > > > > "subscription" --> "subscription*s*"
> > > > >
> > > > > > Number of backlog quota happends.
> > > > > > Number of times backlog evictions happened due to exceeding
> backlog
> > > > quota
> > > > > > (either time or size).
> > > > >
> > > > > Accepted, if there is no more need to change, I'll start the vote
> > next
> > > > > week.
> > > > >
> > > > > Thanks,
> > > > > Tao Jiuming
> > > > >
> > > > >
> > > > > On Tue, Mar 7, 2023 at 00:02 Asaf Mesika <as...@gmail.com> wrote:
> > > > >
> > > > > > >
> > > > > > > Pulsar has a feature called backlog quota (place link).
> > > > > >
> > > > > > You need to place a link :)
> > > > > >
> > > > > > Expose pulsar_storage_backlog_quota_count in the topic leve
> > > > > >
> > > > > > You already have "pulsar_storage_backlog_size", so why do you
> need
> > > this
> > > > > > metric for?
> > > > > >
> > > > > > backlogQuotaLimitSize
> > > > > >
> > > > > > should be `backlogQuotaSizeBytes`
> > > > > >
> > > > > > backlogQuotaLimitTime
> > > > > >
> > > > > > should be `backlogQuotaTimeSeconds`
> > > > > >
> > > > > > What about goal no.4? Expose oldest unacknowledged message
> > > subscription
> > > > > > name?
> > > > > >
> > > > > > IMO, metrics are like API - perhaps indicate the change there as
> > well
> > > > > >
> > > > > > Record the event when dropBacklogForSizeLimit
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121
> > > > > > >
> > > > > > >  or dropBacklogForTimeLimit
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194
> > > > > >
> > > > > > is
> > > > > > > going to invoked.
> > > > > >
> > > > > >
> > > > > > Oh, now I get it.
> > > > > > So you need to rename the metric.
> > > > > > "pulsar_storage_backlog_quota_count" -->
> > > > > > `pulsar_storage_backlog_eviction_count`
> > > > > >
> > > > > >
> > > > > > > the topic's existing subscription.
> > > > > >
> > > > > > "subscription" --> "subscription*s*"
> > > > > >
> > > > > > Number of backlog quota happends.
> > > > > >
> > > > > > Number of times backlog evictions happened due to exceeding
> backlog
> > > > quota
> > > > > > (either time or size).
> > > > > >
> > > > > >
> > > > > > >    1. Find the backlog subscriptions
> > > > > > >    After received the alarm, users could request
> > > > > > Topics#getStats(topicName,
> > > > > > >    true/false, true, true)
> > > > > > >    <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > >
> > > > > > to
> > > > > > >    get the topic stats, and find which subscriptions are in
> > > backlog.
> > > > > > >    Pulsar exposed backlogSize and
> earliestMsgPublishTimeInBacklog
> > > in
> > > > > the
> > > > > > >    subscription level, and we will expose backlogQuotaLimitSize
> > and
> > > > > > >    backlogQuotaLimitTime in the topic level, so users could
> find
> > > > which
> > > > > > >    subscriptions in backlog easily.
> > > > > > >
> > > > > > > I wrote how it should be done IMO in a previous email.
> > > > > >
> > > > > >
> > > > > > On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:
> > > > > >
> > > > > > > Hi Aasf,
> > > > > > > I've updated the PIP, PTAL
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Tao Jiuming
> > > > > > >
> > > > > > > On Sun, Mar 5, 2023 at 21:00 Asaf Mesika <as...@gmail.com> wrote:
> > > > > > >
> > > > > > > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org>
> > > wrote:
> > > > > > > >
> > > > > > > > > > I  think you should fix this explanation:
> > > > > > > > >
> > > > > > > > > Thanks! I would like to copy the context you provide to the
> > PIP
> > > > > > > > motivation,
> > > > > > > > > your description is more detailed, so developers don't have
> > to
> > > go
> > > > > > > through
> > > > > > > > > the code.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Sure
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > Today the quota is checked periodically, right? So that's
> > how
> > > > the
> > > > > > > > > operator
> > > > > > > > > > knows the cost in terms of I/O is limited.
> > > > > > > > > > Now you are adding one additional I/O per collection,
> > every 1
> > > > min
> > > > > > by
> > > > > > > > > > default. That's a lot perhaps. How long is the check
> > interval
> > > > > > today?
> > > > > > > > >
> > > > > > > > > Actually, I don't want to introduce additional costs, I
> > thought
> > > > we
> > > > > > > > > could cache its result, so that it won't introduce
> additional
> > > > > costs.
> > > > > > > > > It may be that I did not make it clear in the PIP and
> caused
> > > this
> > > > > > > > > misunderstanding, sorry.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Ok, just to verify: You plan to modify the code that runs
> > > > > periodically
> > > > > > > the
> > > > > > > > backlog quota check, so the result will be cached there? This
> > way
> > > > > when
> > > > > > > you
> > > > > > > > pull that information from that code every 1min to expose it
> > as a
> > > > > > metric
> > > > > > > it
> > > > > > > > will have 0 I/O cost?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > The user today can calculate quota used for size based
> > limit,
> > > > > since
> > > > > > > > there
> > > > > > > > > > are two metrics that are exposed today on a topic level:
> "
> > > > > > > > > > pulsar_storage_backlog_quota_limit" and
> > > > > > > "pulsar_storage_backlog_size".
> > > > > > > > > You
> > > > > > > > > > can just divide the two to get a percentage.
> > > > > > > > > > For the time-based limit, the only metric exposed today
> is
> > > > quota
> > > > > > > > itself ,
> > > > > > > > > "
> > > > > > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > > > > > >
> > > > > > > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > > > > > > `pulsar_storage_backlog_quota_limit` and
> > > > > > > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for
> > your
> > > > > > > reminder.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > So, in this condition, we already have the following
> > > topic-level
> > > > > > > metrics:
> > > > > > > > > `pulsar_storage_backlog_size`: The total backlog size of
> the
> > > > topics
> > > > > > of
> > > > > > > > this
> > > > > > > > > topic owned by this broker (in bytes).
> > > > > > > > > `pulsar_storage_backlog_quota_limit`: The total amount of
> the
> > > > data
> > > > > in
> > > > > > > > this
> > > > > > > > > topic that limits the backlog quota (bytes).
> > > > > > > > > `pulsar_storage_backlog_quota_limit_time`: The backlog
> quota
> > > > limit
> > > > > in
> > > > > > > > > time(seconds). (This metric does not exists in the doc,
> need
> > to
> > > > > > > improve)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We just need to add a new metric named
> > > > > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in
> the
> > > > > > > topic-level
> > > > > > > > > that indicates the publish time of the earliest message in
> > the
> > > > > > backlog.
> > > > > > > > > So users could get
> > `pulsar_backlog_size_quota_used_percentage`
> > > by
> > > > > > > divide
> > > > > > > > > `pulsar_storage_backlog_size ` and
> > > > > > > > >
> > > > `pulsar_storage_backlog_quota_limit`(`pulsar_storage_backlog_size`
> > > > > /
> > > > > > > > > `pulsar_storage_backlog_quota_limit`),
> > > > > > > > > and could get `pulsar_backlog_time_quota_used_percentage`
> by
> > > > divide
> > > > > > > `now
> > > > > > > > -
> > > > > > > > > pulsar_storage_earliest_msg_publish_time_in_backlog` and
> > > > > > > > > `pulsar_storage_backlog_quota_limit_time` (`now -
> > > > > > > > > pulsar_storage_earliest_msg_publish_time_in_backlog` /
> > > > > > > > > `pulsar_storage_backlog_quota_limit_time`).
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think there is a problem with the name
> > > > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the
> > > > > > topic-level:
> > > > > > > > * First, I prefer exposing the age rather than the publish
> > time.
> > > > > > > > * Second, it's a bit hard to figure out the meaning of the
> > > earliest
> > > > > msg
> > > > > > > in
> > > > > > > > the backlog.
> > > > > > > >
> > > > > > > > Maybe `pulsar_storage_backlog_age_seconds`? In the
> explanation
> > > you
> > > > > can
> > > > > > > > write: "The age (time passed since it was published) of the
> > > > earliest
> > > > > > > > unacknowledged message based on the topic's
> > > > > > > > existing subscriptions" ?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > The backlog quota time checker runs periodically, so we can
> > > cache
> > > > > its
> > > > > > > > > result, so it won't lead to much costs.
> > > > > > > > >
> > > > > > > > > Pulsar also exposed subscription-level  `backlogSize` and
> > > > > > > > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > > > > > > > <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > > > > > >
> > > > > > > > > if
> > > > > > > > > `subscriptionBacklogSize` and `getEarliestTimeInBacklog`
> are
> > > > true.
> > > > > > > > > We can also expose `backlogQuotaLimiteSize` and
> > > > > > `backlogQuotaLimitTime`
> > > > > > > > of
> > > > > > > > > the topic to PulsarAdmin.
> > > > > > > > >
> > > > > > > >
> > > > > > > > What is the relationship you see between Pulsar exposing
> > > > > > > > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog
> in
> > > > > > > > subscription, to exposing the backlog quota limits in pulsar
> > > admin?
> > > > > > > >
> > > > > > > > Limits can be exposed to Pulsar Admin, since it has 0 cost
> > > > associated
> > > > > > > with
> > > > > > > > it.
> > > > > > > > I think it's a good idea to do that.
> > > > > > > > The quota usage can also be exposed to pulsar admin, since we
> > > pull
> > > > > that
> > > > > > > > data from the backlog quota checker cache, so it has 0 cost
> as
> > > > well.
> > > > > > > >
> > > > > > > > As we said in previous email we can also expose
> > > > > > > > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > After users receive the backlog alert from metrics alerting
> > > > > systems,
> > > > > > > they
> > > > > > > > > can get the topic name, then, they can request
> > Topics#getStats
> > > > > > > > > <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > > > > > >
> > > > > > > > > to
> > > > > > > > > get which subscriptions are in the huge backlog.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > I agree users can use PulsarAdmin getStats for topic , with
> > > > > > > > getEarliestTimeInBacklog=true to find the oldest subscription
> > > > > > responsible
> > > > > > > > for exceeding quota, but we can give them that information
> > with 0
> > > > > cost
> > > > > > > > since we already have that subscription name cached (we spent
> > the
> > > > I/O
> > > > > > to
> > > > > > > > find out who that subscription is, let's just cache it and
> > > provide
> > > > > it).
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Tao Jiuming
> > > > > > > > >
> > > > > > > > > On Wed, Mar 1, 2023 at 23:42 Asaf Mesika <as...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > > > > > > > > >
> > > > > > > > > > > : backlogQuotaDefaultLimitBytes and
> > > > > > backlogQuotaDefaultLimitSecond.
> > > > > > > > > > > By default, backlog eviction is disabled, and also,
> there
> > > is
> > > > a
> > > > > > > field
> > > > > > > > > > named
> > > > > > > > > > > backlogQuotaMap in TopicPolicies
> > > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > > > > > > > > >
> > > > > > > > > > > /NamespaceSpacePolicies
> > > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > > > > > > > > >
> > > > > > > > > > assists
> > > > > > > > > > > in controlling Topic/Namespace level backlog quota.
> > > > > > > > > > >
> > > > > > > > > > > If topic backlog reaches the threshold of any item,
> > backlog
> > > > > > > eviction
> > > > > > > > > will
> > > > > > > > > > > be triggered, Pulsar will move subscription's cursor to
> > > skip
> > > > > > > > > > unacknowledged
> > > > > > > > > > > messages.
> > > > > > > > > > >
> > > > > > > > > > > Before backlog eviction happens, we don't have a metric
> > to
> > > > > > monitor
> > > > > > > > how
> > > > > > > > > > > long that it can reaches the threshold.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I  think you should fix this explanation:
> > > > > > > > > >
> > > > > > > > > > In Pulsar, a subscription maintains a state of message acknowledged. A
> > > > > > > > > > subscription backlog is the set of messages which are unacknowledged.
> > > > > > > > > > A subscription backlog size is the sum of size of unacknowledged messages
> > > > > > > > > > (in bytes).
> > > > > > > > > > A topic can have many subscriptions.
> > > > > > > > > > A topic backlog is defined as the backlog size of the subscription which
> > > > > > > > > > has the oldest unacknowledged message. Since acknowledged messages can be
> > > > > > > > > > interleaved with unacknowledged messages, calculating the exact size of
> > > > > > > > > > that subscription can be expensive as it requires I/O operations to read
> > > > > > > > > > from the messages from the ledgers.
> > > > > > > > > > For that reason, the topic backlog is actually defined to be the estimated
> > > > > > > > > > backlog size of that subscription. It does so by summarizing the size of
> > > > > > > > > > all the ledgers, starting from the current active one, up to the ledger
> > > > > > > > > > which contains the oldest unacknowledged message (There is actually a
> > > > > > > > > > faster way to calculate it, but this is the definition of the estimation).
> > > > > > > > > >
> > > > > > > > > > A topic backlog age is the age of the oldest unacknowledged message (in any
> > > > > > > > > > subscription). If that message was written 30 minutes ago, its age is 30
> > > > > > > > > > minutes.
> > > > > > > > > >
> > > > > > > > > > Pulsar has a feature called backlog quota (place link). It allows the user
> > > > > > > > > > to define a quota - in effect, a limit - which limits the topic backlog.
> > > > > > > > > > There are two types of quotas:
> > > > > > > > > > * Size based: The limit is for the topic backlog size (as we defined above).
> > > > > > > > > > * Time based: The limit is for the topic's backlog age (as we defined above).
> > > > > > > > > >
> > > > > > > > > > Once a topic backlog exceeds either one of those limits, an action is taken
> > > > > > > > > > upon messages written to the topic:
> > > > > > > > > > * The producer write is placed on hold for a certain amount of time before
> > > > > > > > > > failing.
> > > > > > > > > > * The producer write is failed
> > > > > > > > > > * The subscriptions oldest unacknowledged messages will be acknowledged in
> > > > > > > > > > order until both the topic backlog size or age will fall inside the limit
> > > > > > > > > > (quota). The process is called backlog eviction (happens every interval)
> > > > > > > > > >
> > > > > > > > > > The quotas can be defined as a default value for any topic, by using the
> > > > > > > > > > following broker configuration keys: backlogQuotaDefaultLimitBytes,
> > > > > > > > > > backlogQuotaDefaultLimitSecond. It can also be specified directly for all
> > > > > > > > > > topics in a given namespace using the namespace policy, or a specific topic
> > > > > > > > > > using a topic policy.
> > > > > > > > > >
> > > > > > > > > > The user today can calculate quota used for size based limit, since there
> > > > > > > > > > are two metrics that are exposed today on a topic level:
> > > > > > > > > > "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You
> > > > > > > > > > can just divide the two to get a percentage.
> > > > > > > > > > For the time-based limit, the only metric exposed today is quota itself,
> > > > > > > > > > "pulsar_storage_backlog_quota_limit_time".
> > > > > > > > > >
> > > > > > > > > > ------------
> > > > > > > > > >
> > > > > > > > > > I would create two metrics:
> > > > > > > > > >
> > > > > > > > > > `pulsar_backlog_size_quota_used_percentage`
> > > > > > > > > > `pulsar_backlog_time_quota_used_percentage`
> > > > > > > > > >
> > > > > > > > > > You would like to know what triggered the alert, hence two.
> > > > > > > > > > It's not the quota percentage, it's the quota used percentage.
> > > > > > > > > >
> > > > > > > > > > ----------
> > > > > > > > > >
> > > > > > > > > > > It checks if the backlog size exceeds the threshold(backlogQuotaDefaultLimitBytes),
> > > > > > > > > > > and it gets the current backlog size by calculating LedgerInfo
> > > > > > > > > > > <https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54>,
> > > > > > > > > > > it will not lead to I/O.
> > > > > > > > > >
> > > > > > > > > > This is not correct.
> > > > > > > > > > It checks against the topic / namespace policy, and if it doesn't exist, it
> > > > > > > > > > falls back on the default configuration key mentioned above.
> > > > > > > > > >
> > > > > > > > > > > It checks if the backlog time exceeds the threshold(backlogQuotaDefaultLimitSecond).
> > > > > > > > > > > If preciseTimeBasedBacklogQuotaCheck is set to be true, it will read an
> > > > > > > > > > > entry from Bookkeeper, but the default value is false, which means it gets
> > > > > > > > > > > the backlog time by calculating LedgerInfo
> > > > > > > > > > > <https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54>.
> > > > > > > > > > > So in general, we don't need to worry about it will lead to I/O.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I'm afraid of that.
> > > > > > > > > > Today the quota is checked periodically, right? So that's how the operator
> > > > > > > > > > knows the cost in terms of I/O is limited.
> > > > > > > > > > Now you are adding one additional I/O per collection, every 1 min by
> > > > > > > > > > default. That's a lot perhaps. How long is the check interval today?
> > > > > > > > > >
> > > > > > > > > > Perhaps in the backlog quota check, you can persist the check result, and
> > > > > > > > > > use it? Persist the age that is.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > ------
> > > > > > > > > >
> > > > > > > > > > Regarding "slowest_subscription"
> > > > > > > > > > I think the cost is too high, because the subscriptions will keep
> > > > > > > > > > alternating, which can generate so many unique time series. Since
> > > > > > > > > > Prometheus flush only every 2 hours, or any there TSDB, it will cost you
> > > > > > > > > > too much.
> > > > > > > > > >
> > > > > > > > > > I suggest exposing the name via the topic stats. This way they can issue a
> > > > > > > > > > REST call to grab that subscription name only when the alert fires.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Asaf
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <daojun@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Asaf,
> > > > > > > > > > > I've updated the PIP, PTAL
> > > > > > > > > > >
> > > > > > > > > > > Thank,
> > > > > > > > > > > Tao Jiuming
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Feb 26, 2023 at 23:03, Asaf Mesika <as...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > > > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond, if
> > > > > > > > > > > > > topic backlog reaches the threshold of any item, backlog eviction will be
> > > > > > > > > > > > > triggered.
> > > > > > > > > > > >
> > > > > > > > > > > > This seems like default values, not the actual values. Can you please
> > > > > > > > > > > > provide an explanation in the PIP and link to read more:
> > > > > > > > > > > > 1. Where do you define the backlog quota exactly? What is the granularity
> > > > > > > > > > > > (subscription?)
> > > > > > > > > > > > 2. Is the backlog quota on by default? If so, what are the default values?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > *Notes*
> > > > > > > > > > > > 1. When the backlog quota limit is defined in Bytes, and you wish to know
> > > > > > > > > > > > how close a subscription is to its bytes limit, you need to calculate the
> > > > > > > > > > > > backlog size in bytes. From my understanding, there is an accurate
> > > > > > > > > > > > calculation (which is costly in terms of I/O) and there is an estimate of
> > > > > > > > > > > > it. I presume you would want to use the estimated one, is that correct?
> > > > > > > > > > > > The backlog quota itself, uses the accurate or the estimated when it starts
> > > > > > > > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > > > > > > > >
> > > > > > > > > > > > 2. For the backlog limit specifying in time units, there is no estimate, as
> > > > > > > > > > > > it must be calculated all the time (earliest unacknowledged message
> > > > > > > > > > > > distance from now). How do you plan to calculate the current age of the
> > > > > > > > > > > > earliest message without bearing that I/O cost on each metric calculation?
> > > > > > > > > > > >
> > > > > > > > > > > > 3. In the Goal section, you specify that your goal is to add a "proximity"
> > > > > > > > > > > > metric.
> > > > > > > > > > > > a) You must define that - what is proximity metric exactly? What are its
> > > > > > > > > > > > units? How are you planning to calculate it?
> > > > > > > > > > > > b) Proximity is not a good term IMO. I personally have never seen this term
> > > > > > > > > > > > used in software systems, unless it's in the aviation/space industry. Once
> > > > > > > > > > > > you explain (a) I hope I can help provide alternative names.
> > > > > > > > > > > >
> > > > > > > > > > > > 4. Maybe we should provide the used quota percentage for both limits,
> > > > > > > > > > > > instead of one per both, since it's easier to act upon the alert when you
> > > > > > > > > > > > need which one triggered it.
> > > > > > > > > > > >
> > > > > > > > > > > > 5. I didn't understand the "slowest_subscription" label used when
> > > > > > > > > > > > describing the metric label. Can you please provide an explanation?
> > > > > > > > > > > >
> > > > > > > > > > > > 6. I suggest writing a "High Level Design" section, and add everything you
> > > > > > > > > > > > need to know for this proposal, so I don't need to read the
> > > > > > > > > > > > implementation details below (code).
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Asaf
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <daojun@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > > > > > > > > > > > >
> > > > > > > > > > > > > ### Motivation:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > > > > `backlogQuotaDefaultLimitBytes` and `backlogQuotaDefaultLimitSecond`, if
> > > > > > > > > > > > > topic backlog reaches the threshold of any item, backlog eviction will be
> > > > > > > > > > > > > triggered.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Before backlog eviction happens, we don't have a metric to monitor how long
> > > > > > > > > > > > > that it can reaches the threshold.
> > > > > > > > > > > > >
> > > > > > > > > > > > > We can provide a progress bar metric to tell users some topics is about to
> > > > > > > > > > > > > trigger backlog eviction. And users can subscribe the alert to schedule
> > > > > > > > > > > > > consumers.
> > > > > > > > > > > > >
> > > > > > > > > > > > > For more details, please read the PIP at
> > > > > > > > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Tao Jiuming
> > > > > > > > > > > > >

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by 太上玄元道君 <da...@apache.org>.
> Need to replace (place link) with link.

I replaced the `Motivation` with your advice.

> We discussed adding the subscription name which triggered the time limit to
> Topics.getStats().
> Why?

Since we have `pulsar_storage_backlog_eviction_count`,
I think we don't need to expose the subscription name which triggered the
backlog eviction.

> I have to run getStats(getEarliestTimeInBacklog=true) and it's way more
> expensive than the proposal above, since it needs to reach the earliest
> message for *each* subscription.

I don't think we need to optimize away this cost: it is only incurred when
the user explicitly asks for it.
If the user does not set `getEarliestTimeInBacklog` to true, there is no
such overhead.
We don't need to add complexity for a call that is made very rarely.

> Also a bit less accurate - you want to get the subscription cached that
> triggered it, using the same number to find it. Earliest backlog is
> accurate but if the configuration flag is off, it's not the same number as
> getStats.

Such a problem does exist. The backlog may be large when the user receives
the alert, but already smaller by the time the endpoint (Topics#getStats) is
requested, because there is a time gap between the two.
However, the alert is only a notification; it is when the user requests the
endpoint that they may take action.
I think it is reasonable to give users a more accurate backlog figure
before they act.

Thanks,
Tao Jiuming

On Tue, Mar 14, 2023 at 16:51, Asaf Mesika <as...@gmail.com> wrote:

> >
> > Pulsar has a feature called backlog quota (place link)
>
> Need to replace (place link) with link.
>
>
>
> >    1. Find the backlog subscriptions
> >    After received the alarm, users could request Topics#getStats(topicName,
> >    true/false, true, true)
> >    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> >    to get the topic stats, and find which subscriptions are in backlog.
> >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in the
> >    subscription level, and we will expose backlogQuotaSizeBytes and
> >    backlogQuotaTimeSeconds in the topic level, so users could find which
> >    subscriptions in backlog easily.
> >
> We have forgotten the other comment.
> We discussed adding the subscription name which triggered the time limit to
> Topics.getStats().
> Why?
>
> I have to run getStats(getEarliestTimeInBacklog=true) and it's way more
> expensive than the proposal above, since it needs to reach the earliest
> message for *each* subscription.
> Also a bit less accurate - you want to get the subscription cached that
> triggered it, using the same number to find it. Earliest backlog is
> accurate but if the configuration flag is off, it's not the same number as
> getStats.
>
>
> Nice to have (not mandatory) additions:
>
> I would add before
>
> >
> >    1. After readEntryComplete
> >    <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L2780>,
> >    cache its result:
> >
> When this configuration flag is set to true, the broker does an I/O call
> by reading the oldest entry to get its write timestamp. Once we have that,
> we'll add caching to that value since we're going to use it for returning
> the age.
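
To make the caching idea above concrete, here is a minimal sketch in Java.
It is not broker code: the class and method names are hypothetical, and it
only assumes that the periodic quota check already knows the publish time of
the oldest entry and that the metrics collector may read it later without
doing any I/O.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not an existing Pulsar class.
final class OldestEntryTimestampCache {
    private static final long UNKNOWN = -1L;

    // Publish time (epoch millis) of the oldest unacknowledged entry,
    // as seen by the last backlog quota check.
    private final AtomicLong oldestPublishTimeMillis = new AtomicLong(UNKNOWN);

    // Called by the periodic quota check, after it has paid for the read.
    void recordCheckResult(long publishTimeMillis) {
        oldestPublishTimeMillis.set(publishTimeMillis);
    }

    // Called on metric collection: in-memory only.
    // Returns -1 when no check has run yet.
    long backlogAgeSeconds(long nowMillis) {
        long publishTime = oldestPublishTimeMillis.get();
        if (publishTime == UNKNOWN) {
            return -1L;
        }
        return Math.max(0L, (nowMillis - publishTime) / 1000L);
    }
}
```

With this shape the reported age is only as fresh as the last quota check,
which is the trade-off discussed in this thread.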
>
> I would add before:
>
> > slowestReaderTimeBasedBacklogQuotaCheck
> > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L2817>
> > is a totally in-memory method, we just need to cache the
> >
>
> When this configuration flag is set to false, the check uses an estimate of
> the oldest entry timestamp, by taking the closing time of the ledger which
> the message is contained at.
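
As an illustration of the estimate described above, a small Java sketch
follows. The names are hypothetical and the map stands in for the ledger
metadata held in memory; treating a missing or non-positive close time as
"ledger still open, no estimate" is an assumption of this sketch, not a
statement about what the broker does.

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper, not an existing Pulsar class.
final class LedgerCloseTimeEstimate {
    private LedgerCloseTimeEstimate() {}

    // closeTimeByLedgerId: ledger id -> close timestamp in epoch millis
    // (0 or absent means the ledger has not been closed yet).
    static OptionalLong estimateAgeMillis(Map<Long, Long> closeTimeByLedgerId,
                                          long oldestUnackedLedgerId,
                                          long nowMillis) {
        Long closeTime = closeTimeByLedgerId.get(oldestUnackedLedgerId);
        if (closeTime == null || closeTime <= 0L) {
            return OptionalLong.empty();
        }
        // The close time is at or after the publish time of every entry in
        // the ledger, so this can under-estimate the real age.
        return OptionalLong.of(Math.max(0L, nowMillis - closeTime));
    }
}
```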
>
> On Fri, Mar 10, 2023 at 8:29 AM 太上玄元道君 <da...@apache.org> wrote:
>
> > I think yes, to avoid missing something, you can take a look if you have
> > time.
> >
> > Thanks,
> > Tao Jiuming
> >
> > On Thu, Mar 9, 2023 at 17:40, Asaf Mesika <as...@gmail.com> wrote:
> >
> > > Is the PIP updated with all comments?
> > >
> > > On Thu, Mar 9, 2023 at 8:59 AM 太上玄元道君 <da...@apache.org> wrote:
> > >
> > > > > backlogQuotaLimitSize
> > > > > should be `backlogQuotaSizeBytes`
> > > >
> > > > > backlogQuotaLimitTime
> > > > > should be `backlogQuotaTimeSeconds`
> > > >
> > > > > So you need to rename the metric.
> > > > > "pulsar_storage_backlog_quota_count" -->
> > > > > `pulsar_storage_backlog_eviction_count`
> > > >
> > > > > the topic's existing subscription.
> > > > > "subscription" --> "subscription*s*"
> > > >
> > > > > Number of backlog quota happends.
> > > > > Number of times backlog evictions happened due to exceeding backlog quota
> > > > > (either time or size).
> > > >
> > > > Accepted, if there is no more need to change, I'll start the vote next
> > > > week.
> > > >
> > > > Thanks,
> > > > Tao Jiuming
> > > >
> > > >
> > > > On Tue, Mar 7, 2023 at 00:02, Asaf Mesika <as...@gmail.com> wrote:
> > > >
> > > > > >
> > > > > > Pulsar has a feature called backlog quota (place link).
> > > > >
> > > > > You need to place a link :)
> > > > >
> > > > > Expose pulsar_storage_backlog_quota_count in the topic level
> > > > >
> > > > > You already have "pulsar_storage_backlog_size", so why do you need this
> > > > > metric for?
> > > > >
> > > > > backlogQuotaLimitSize
> > > > >
> > > > > should be `backlogQuotaSizeBytes`
> > > > >
> > > > > backlogQuotaLimitTime
> > > > >
> > > > > should be `backlogQuotaTimeSeconds`
> > > > >
> > > > > What about goal no.4? Expose oldest unacknowledged message subscription
> > > > > name?
> > > > >
> > > > > IMO, metrics are like API - perhaps indicate the change there as well
> > > > >
> > > > > > Record the event when dropBacklogForSizeLimit
> > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121>
> > > > > > or dropBacklogForTimeLimit
> > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194>
> > > > > > is going to invoked.
> > > > >
> > > > >
> > > > > Oh, now I get it.
> > > > > So you need to rename the metric.
> > > > > "pulsar_storage_backlog_quota_count" -->
> > > > > `pulsar_storage_backlog_eviction_count`
> > > > >
> > > > >
> > > > > > the topic's existing subscription.
> > > > >
> > > > > "subscription" --> "subscription*s*"
> > > > >
> > > > > Number of backlog quota happends.
> > > > >
> > > > > Number of times backlog evictions happened due to exceeding backlog quota
> > > > > (either time or size).
> > > > >
> > > > >
> > > > > >    1. Find the backlog subscriptions
> > > > > >    After received the alarm, users could request Topics#getStats(topicName,
> > > > > >    true/false, true, true)
> > > > > >    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > > >    to get the topic stats, and find which subscriptions are in backlog.
> > > > > >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in the
> > > > > >    subscription level, and we will expose backlogQuotaLimitSize and
> > > > > >    backlogQuotaLimitTime in the topic level, so users could find which
> > > > > >    subscriptions in backlog easily.
> > > > > >
> > > > > I wrote how it should be done IMO in a previous email.
> > > > >
> > > > >
> > > > > On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:
> > > > >
> > > > > > Hi Aasf,
> > > > > > I've updated the PIP, PTAL
> > > > > >
> > > > > > Thanks,
> > > > > > Tao Jiuming
> > > > > >
> > > > > > On Sun, Mar 5, 2023 at 21:00, Asaf Mesika <as...@gmail.com> wrote:
> > > > > >
> > > > > > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:
> > > > > > >
> > > > > > > > > I  think you should fix this explanation:
> > > > > > > >
> > > > > > > > Thanks! I would like to copy the context you provide to the PIP
> > > > > > > > motivation, your description is more detailed, so developers don't have to
> > > > > > > > go through the code.
> > > > > > > >
> > > > > > >
> > > > > > > Sure
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > Today the quota is checked periodically, right? So that's how the
> > > > > > > > > operator knows the cost in terms of I/O is limited.
> > > > > > > > > Now you are adding one additional I/O per collection, every 1 min by
> > > > > > > > > default. That's a lot perhaps. How long is the check interval today?
> > > > > > > >
> > > > > > > > Actually, I don't want to introduce additional costs, I thought we
> > > > > > > > could cache its result, so that it won't introduce additional costs.
> > > > > > > > It may be that I did not make it clear in the PIP and caused this
> > > > > > > > misunderstanding, sorry.
> > > > > > > >
> > > > > > >
> > > > > > > Ok, just to verify: You plan to modify the code that runs periodically the
> > > > > > > backlog quota check, so the result will be cached there? This way when you
> > > > > > > pull that information from that code every 1min to expose it as a metric
> > > > > > > it will have 0 I/O cost?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > The user today can calculate quota used for size based limit, since
> > > > > > > > > there are two metrics that are exposed today on a topic level:
> > > > > > > > > "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size".
> > > > > > > > > You can just divide the two to get a percentage.
> > > > > > > > > For the time-based limit, the only metric exposed today is quota itself,
> > > > > > > > > "pulsar_storage_backlog_quota_limit_time".
> > > > > > > >
> > > > > > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > > > > > `pulsar_storage_backlog_quota_limit` and
> > > > > > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for your reminder.
> > > > > > > >
> > > > > > > >
> > > > > > > > So, in this condition, we already have the following topic-level metrics:
> > > > > > > > `pulsar_storage_backlog_size`: The total backlog size of the topics of this
> > > > > > > > topic owned by this broker (in bytes).
> > > > > > > > `pulsar_storage_backlog_quota_limit`: The total amount of the data in this
> > > > > > > > topic that limits the backlog quota (bytes).
> > > > > > > > `pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in
> > > > > > > > time(seconds). (This metric does not exists in the doc, need to improve)
> > > > > > > >
> > > > > > > >
> > > > > > > > We just need to add a new metric named
> > > > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the topic-level
> > > > > > > > that indicates the publish time of the earliest message in the backlog.
> > > > > > > > So users could get `pulsar_backlog_size_quota_used_percentage` by divide
> > > > > > > > `pulsar_storage_backlog_size` and `pulsar_storage_backlog_quota_limit`
> > > > > > > > (`pulsar_storage_backlog_size` / `pulsar_storage_backlog_quota_limit`),
> > > > > > > > and could get `pulsar_backlog_time_quota_used_percentage` by divide
> > > > > > > > `now - pulsar_storage_earliest_msg_publish_time_in_backlog` and
> > > > > > > > `pulsar_storage_backlog_quota_limit_time` (`now -
> > > > > > > > pulsar_storage_earliest_msg_publish_time_in_backlog` /
> > > > > > > > `pulsar_storage_backlog_quota_limit_time`).
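
For reference, the two divisions described above can be written out as the
short Java sketch below. The class and method names are hypothetical; the
inputs stand for the values of the metrics named in this thread. Returning
-1 when no positive limit is configured is an assumption of the sketch.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not an existing Pulsar class.
final class BacklogQuotaUsage {
    private BacklogQuotaUsage() {}

    // backlog size / size limit, as a percentage.
    static double sizeUsedPercent(long backlogSizeBytes, long quotaLimitBytes) {
        if (quotaLimitBytes <= 0) {
            return -1; // no size quota configured
        }
        return 100.0 * backlogSizeBytes / quotaLimitBytes;
    }

    // (now - earliest publish time) / time limit, as a percentage.
    static double timeUsedPercent(Instant earliestPublishTime, Instant now,
                                  long quotaLimitSeconds) {
        if (quotaLimitSeconds <= 0) {
            return -1; // no time quota configured
        }
        long ageSeconds =
                Math.max(0L, Duration.between(earliestPublishTime, now).getSeconds());
        return 100.0 * ageSeconds / quotaLimitSeconds;
    }
}
```

For example, a 50-byte backlog against a 200-byte limit gives 25, and a
30-minute-old message against a one-hour limit gives 50.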
> > > > > > > >
> > > > > > >
> > > > > > > I think there is a problem with the name
> > > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the topic-level:
> > > > > > > * First, I prefer exposing the age rather than the publish time.
> > > > > > > * Second, it's a bit hard to figure out the meaning of the earliest msg in
> > > > > > > the backlog.
> > > > > > >
> > > > > > > Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you can
> > > > > > > write: "The age (time passed since it was published) of the earliest
> > > > > > > unacknowledged message based on the topic's existing subscriptions" ?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > The backlog quota time checker runs periodically, so we can cache its
> > > > > > > > result, so it won't lead to much costs.
> > > > > > > >
> > > > > > > > Pulsar also exposed subscription-level `backlogSize` and
> > > > > > > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > > > > > if `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> > > > > > > > We can also expose `backlogQuotaLimiteSize` and `backlogQuotaLimitTime` of
> > > > > > > > the topic to PulsarAdmin.
> > > > > > > >
> > > > > > >
> > > > > > > What is the relationship you see between Pulsar exposing
> > > > > > > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> > > > > > > subscription, to exposing the backlog quota limits in pulsar admin?
> > > > > > >
> > > > > > > Limits can be exposed to Pulsar Admin, since it has 0 cost associated with
> > > > > > > it.
> > > > > > > I think it's a good idea to do that.
> > > > > > > The quota usage can also be exposed to pulsar admin, since we pull that
> > > > > > > data from the backlog quota checker cache, so it has 0 cost as well.
> > > > > > >
> > > > > > > As we said in previous email we can also expose
> > > > > > > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > After users receive the backlog alert from metrics alerting systems, they
> > > > > > > > can get the topic name, then, they can request Topics#getStats
> > > > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > > > > > to get which subscriptions are in the huge backlog.
> > > > > > > >
> > > > > > > >
> > > > > > > I agree users can use PulsarAdmin getStats for topic, with
> > > > > > > getEarliestTimeInBacklog=true to find the oldest subscription responsible
> > > > > > > for exceeding quota, but we can give them that information with 0 cost
> > > > > > > since we already have that subscription name cached (we spent the I/O to
> > > > > > > find out who that subscription is, let's just cache it and provide it).
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Tao Jiuming
> > > > > > > >
> > > > > > > > On Wed, Mar 1, 2023 at 23:42, Asaf Mesika <as...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > > > > > > <https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas>
> > > > > > > > > > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > > > > > > > > > By default, backlog eviction is disabled, and also, there is a field named
> > > > > > > > > > backlogQuotaMap in TopicPolicies
> > > > > > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45>
> > > > > > > > > > /NamespaceSpacePolicies
> > > > > > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41>
> > > > > > > > > > assists in controlling Topic/Namespace level backlog quota.
> > > > > > > > > >
> > > > > > > > > > If topic backlog reaches the threshold of any item, backlog eviction will
> > > > > > > > > > be triggered, Pulsar will move subscription's cursor to skip unacknowledged
> > > > > > > > > > messages.
> > > > > > > > > >
> > > > > > > > > > Before backlog eviction happens, we don't have a metric to monitor how
> > > > > > > > > > long that it can reaches the threshold.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I  think you should fix this explanation:
> > > > > > > > >
> > > > > > > > > In Pulsar, a subscription maintains a state of message acknowledged. A
> > > > > > > > > subscription backlog is the set of messages which are unacknowledged.
> > > > > > > > > A subscription backlog size is the sum of size of unacknowledged messages
> > > > > > > > > (in bytes).
> > > > > > > > > A topic can have many subscriptions.
> > > > > > > > > A topic backlog is defined as the backlog size of the subscription which
> > > > > > > > > has the oldest unacknowledged message. Since acknowledged messages can be
> > > > > > > > > interleaved with unacknowledged messages, calculating the exact size of
> > > > > > > > > that subscription can be expensive as it requires I/O operations to read
> > > > > > > > > from the messages from the ledgers.
> > > > > > > > > For that reason, the topic backlog is actually defined to be the estimated
> > > > > > > > > backlog size of that subscription. It does so by summarizing the size of
> > > > > > > > > all the ledgers, starting from the current active one, up to the ledger
> > > > > > > > > which contains the oldest unacknowledged message (There is actually a
> > > > > > > > > faster way to calculate it, but this is the definition of the estimation).
> > > > > > > > >
> > > > > > > > > A topic backlog age is the age of the oldest unacknowledged
> > > > message
> > > > > > (in
> > > > > > > > any
> > > > > > > > > subscription). If that message was written 30 minutes ago,
> > its
> > > > age
> > > > > is
> > > > > > > 30
> > > > > > > > > minutes.
> > > > > > > > >
> > > > > > > > > Pulsar has a feature called backlog quota (place link). It
> > > allows
> > > > > the
> > > > > > > > user
> > > > > > > > > to define a quota - in effect, a limit - which limits the
> > topic
> > > > > > > backlog.
> > > > > > > > > There are two types of quotas:
> > > > > > > > > * Size based: The limit is for the topic backlog size (as
> we
> > > > > defined
> > > > > > > > > above).
> > > > > > > > > * Time based: The limit is for the topic's backlog age (as
> we
> > > > > defined
> > > > > > > > > above).
> > > > > > > > >
> > > > > > > > > Once a topic backlog exceeds either one of those limits, an
> > > > action
> > > > > is
> > > > > > > > taken
> > > > > > > > > upon messages written to the topic:
> > > > > > > > > * The producer write is placed on hold for a certain amount
> > of
> > > > time
> > > > > > > > before
> > > > > > > > > failing.
> > > > > > > > > * The producer write is failed
> > > > > > > > > * The subscriptions oldest unacknowledged messages will be
> > > > > > acknowledged
> > > > > > > > in
> > > > > > > > > order until both the topic backlog size or age will fall
> > inside
> > > > the
> > > > > > > limit
> > > > > > > > > (quota). The process is called backlog eviction (happens
> > every
> > > > > > > interval)
> > > > > > > > >
> > > > > > > > > The quotas can be defined as a default value for any topic,
> > by
> > > > > using
> > > > > > > the
> > > > > > > > > following broker configuration keys:
> > > > backlogQuotaDefaultLimitBytes
> > > > > ,
> > > > > > > > > backlogQuotaDefaultLimitSecond. It can also be specified
> > > directly
> > > > > for
> > > > > > > all
> > > > > > > > > topics in a given namespace using the namespace policy, or
> a
> > > > > specific
> > > > > > > > topic
> > > > > > > > > using a topic policy.
> > > > > > > > >
> > > > > > > > > The user today can calculate quota used for size based
> limit,
> > > > since
> > > > > > > there
> > > > > > > > > are two metrics that are exposed today on a topic level: "
> > > > > > > > > pulsar_storage_backlog_quota_limit" and
> > > > > > "pulsar_storage_backlog_size".
> > > > > > > > You
> > > > > > > > > can just divide the two to get a percentage.
> > > > > > > > > For the time-based limit, the only metric exposed today is
> > > quota
> > > > > > itself
> > > > > > > > , "
> > > > > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > > > > > >
> > > > > > > > > ------------
> > > > > > > > >
> > > > > > > > > I would create two metrics:
> > > > > > > > >
> > > > > > > > > `pulsar_backlog_size_quota_used_percentage`
> > > > > > > > > `pulsar_backlog_time_quota_used_percentage`
> > > > > > > > >
> > > > > > > > > You would like to know what triggered the alert, hence two.
> > > > > > > > > It's not the quota percentage, it's the quota used
> > percentage.
> > > > > > > > >
> > > > > > > > > ----------
> > > > > > > > >
> > > > > > > > > It checks if the backlog size exceeds the threshold(
> > > > > > > > > > backlogQuotaDefaultLimitBytes), and it gets the current
> > > backlog
> > > > > > size
> > > > > > > by
> > > > > > > > > > calculating LedgerInfo
> > > > > > > > > > <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > > > > >,
> > > > > > > > > > it will not lead to I/O.
> > > > > > > > >
> > > > > > > > > This is not correct.
> > > > > > > > > It checks against the topic / namespace policy, and if it
> > > doesn't
> > > > > > > exist,
> > > > > > > > it
> > > > > > > > > falls back on the default configuration key mentioned
> above.
> > > > > > > > >
> > > > > > > > > It checks if the backlog time exceeds the threshold(
> > > > > > > > > > backlogQuotaDefaultLimitSecond). If
> > > > > > preciseTimeBasedBacklogQuotaCheck
> > > > > > > > is
> > > > > > > > > > set to be true, it will read an entry from Bookkeeper,
> but
> > > the
> > > > > > > default
> > > > > > > > > > value is false, which means it gets the backlog time by
> > > > > calculating
> > > > > > > > > > LedgerInfo
> > > > > > > > > > <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > > > > >.
> > > > > > > > > > So in general, we don't need to worry about it will lead
> to
> > > > I/O.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm afraid of that.
> > > > > > > > > Today the quota is checked periodically, right? So that's
> how
> > > the
> > > > > > > > operator
> > > > > > > > > knows the cost in terms of I/O is limited.
> > > > > > > > >  Now you are adding one additional I/O per collection,
> every
> > 1
> > > > min
> > > > > by
> > > > > > > > > default. That's a lot perhaps. How long is the check
> interval
> > > > > today?
> > > > > > > > >
> > > > > > > > > Perhaps in the backlog quota check, you can persist the
> check
> > > > > result,
> > > > > > > and
> > > > > > > > > use it? Persist the age that is.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ------
> > > > > > > > >
> > > > > > > > > Regarding "slowest_subscription"
> > > > > > > > > I think the cost is too high, because the subscriptions
> will
> > > keep
> > > > > > > > > alternating, which can generate so many unique time series.
> > > Since
> > > > > > > > > Prometheus flush only every 2 hours, or any there TSDB, it
> > will
> > > > > cost
> > > > > > > you
> > > > > > > > > too much.
> > > > > > > > >
> > > > > > > > > I suggest exposing the name via the topic stats. This way
> > they
> > > > can
> > > > > > > issue
> > > > > > > > a
> > > > > > > > > REST call to grab that subscription name only when the
> alert
> > > > fires.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Asaf
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Asaf,
> > > > > > > > > > I've updated the PIP, PTAL
> > > > > > > > > >
> > > > > > > > > > Thank,
> > > > > > > > > > Tao Jiuming
> > > > > > > > > >
> > > > > > > > > > On Sun, Feb 26, 2023 at 23:03 Asaf Mesika <as...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > >
> > > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > > > backlogQuotaDefaultLimitBytes and
> > > > > > backlogQuotaDefaultLimitSecond,
> > > > > > > > if
> > > > > > > > > > > > topic backlog reaches the threshold of any item,
> > backlog
> > > > > > eviction
> > > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > > > triggered.
> > > > > > > > > > >
> > > > > > > > > > > This seems like default values, not the actual values.
> > Can
> > > > you
> > > > > > > please
> > > > > > > > > > > provide an explanation in the PIP and link to read
> more:
> > > > > > > > > > > 1. Where do you define the backlog quota exactly? What
> is
> > > the
> > > > > > > > > granularity
> > > > > > > > > > > (subscription?)
> > > > > > > > > > > 2.  Is the backlog quota on by default? If so, what are
> > the
> > > > > > default
> > > > > > > > > > values?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *Notes*
> > > > > > > > > > > 1. When the backlog quota limit is defined in Bytes,
> and
> > > you
> > > > > wish
> > > > > > > to
> > > > > > > > > know
> > > > > > > > > > > how close a subscription is to its bytes limit, you
> need
> > to
> > > > > > > calculate
> > > > > > > > > the
> > > > > > > > > > > backlog size in bytes. From my understanding, there is
> an
> > > > > > accurate
> > > > > > > > > > > calculation (which is costly in terms of I/O) and there
> > is
> > > an
> > > > > > > > estimate
> > > > > > > > > of
> > > > > > > > > > > it. I presume you would want to use the estimated one,
> is
> > > > that
> > > > > > > > correct?
> > > > > > > > > > > The backlog quota itself, uses the accurate or the
> > > estimated
> > > > > when
> > > > > > > it
> > > > > > > > > > starts
> > > > > > > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > > > > > > >
> > > > > > > > > > > 2. For the backlog limit specifying in time units,
> there
> > is
> > > > no
> > > > > > > > > estimate,
> > > > > > > > > > as
> > > > > > > > > > > it must be calculated all the time (earliest
> > unacknowledged
> > > > > > message
> > > > > > > > > > > distance from now). How do you plan to calculate the
> > > current
> > > > > age
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > earliest message without bearing that I/O cost on each
> > > metric
> > > > > > > > > > calculation?
> > > > > > > > > > >
> > > > > > > > > > > 3. In the Goal section, you specify that your goal is
> to
> > > add
> > > > a
> > > > > > > > > > "proximity"
> > > > > > > > > > > metric.
> > > > > > > > > > > a) You must define that - what is proximity metric
> > exactly?
> > > > > What
> > > > > > > are
> > > > > > > > > its
> > > > > > > > > > > units? How are you planning to calculate it?
> > > > > > > > > > > b) Proximity is not a good term IMO. I personally have
> > > never
> > > > > seen
> > > > > > > > this
> > > > > > > > > > term
> > > > > > > > > > > used in software systems, unless it's in the
> > aviation/space
> > > > > > > industry.
> > > > > > > > > > Once
> > > > > > > > > > > you explain (a) I hope I can help provide alternative
> > > names.
> > > > > > > > > > >
> > > > > > > > > > > 4. Maybe we should provide the used quota percentage
> for
> > > both
> > > > > > > limits,
> > > > > > > > > > > instead of one per both, since it's easier to act upon
> > the
> > > > > alert
> > > > > > > when
> > > > > > > > > you
> > > > > > > > > > > need which one triggered it.
> > > > > > > > > > >
> > > > > > > > > > > 5. I didn't understand the "slowest_subscription" label
> > > used
> > > > > when
> > > > > > > > > > > describing the metric label. Can you please provide an
> > > > > > explanation?
> > > > > > > > > > >
> > > > > > > > > > > 6. I suggest writing a "High Level Design" section, and
> > add
> > > > > > > > everything
> > > > > > > > > > you
> > > > > > > > > > > need to know for this proposal, so I don't need to read
> > the
> > > > > > > > > > > implementation details below (code).
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Asaf
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <
> > daojun@apache.org>
> > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > I've started a PIP to discuss: PIP-248 Add backlog
> > > eviction
> > > > > > > metric
> > > > > > > > > > > >
> > > > > > > > > > > > ### Motivation:
> > > > > > > > > > > >
> > > > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > > > `backlogQuotaDefaultLimitBytes` and
> > > > > > > > `backlogQuotaDefaultLimitSecond`,
> > > > > > > > > > if
> > > > > > > > > > > > topic backlog reaches the threshold of any item,
> > backlog
> > > > > > eviction
> > > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > > > triggered.
> > > > > > > > > > > >
> > > > > > > > > > > > Before backlog eviction happens, we don't have a
> metric
> > > to
> > > > > > > monitor
> > > > > > > > > how
> > > > > > > > > > > long
> > > > > > > > > > > > that it can reaches the threshold.
> > > > > > > > > > > >
> > > > > > > > > > > > We can provide a progress bar metric to tell users
> some
> > > > > topics
> > > > > > is
> > > > > > > > > about
> > > > > > > > > > > to
> > > > > > > > > > > > trigger backlog eviction. And users can subscribe the
> > > alert
> > > > > to
> > > > > > > > > schedule
> > > > > > > > > > > > consumers.
> > > > > > > > > > > >
> > > > > > > > > > > > For more details, please read the PIP at
> > > > > > > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Tao Jiuming
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by Asaf Mesika <as...@gmail.com>.
>
> Pulsar has a feature called backlog quota (place link)

The "(place link)" placeholder still needs to be replaced with an actual link.



>    1. Find the backlog subscriptions
>    After received the alarm, users could request Topics#getStats(topicName,
>    true/false, true, true)
>    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139> to
>    get the topic stats, and find which subscriptions are in backlog.
>    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in the
>    subscription level, and we will expose backlogQuotaSizeBytes and
>    backlogQuotaTimeSeconds in the topic level, so users could find which
>    subscriptions in backlog easily.
>

We seem to have forgotten the other comment: we discussed adding the name of
the subscription that triggered the time limit to Topics.getStats().
Why?

Otherwise I have to run getStats(getEarliestTimeInBacklog=true), which is far
more expensive than the proposal above, since it needs to reach the earliest
message of *each* subscription.
It is also a bit less accurate: you want the cached subscription that
triggered the quota check, identified by the same number the check used. The
earliest backlog time from getStats is accurate, but if the configuration flag
(preciseTimeBasedBacklogQuotaCheck) is off, the check works from an estimate,
so it is not the same number as getStats.
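
To make the idea concrete, here is a minimal sketch of what I mean by caching
the check result together with the subscription name. The class and method
names (CachedTimeQuotaCheck, update, backlogAgeSeconds,
oldestBacklogSubscription) are hypothetical and are not existing Pulsar code;
it only assumes the periodic quota check already knows the oldest timestamp
and which subscription it belongs to.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for the last time-based backlog quota check result. */
final class CachedTimeQuotaCheck {

    /** Immutable snapshot written by the periodic quota check. */
    private static final class Result {
        final String subscriptionName;
        final long oldestTimestampMillis;

        Result(String subscriptionName, long oldestTimestampMillis) {
            this.subscriptionName = subscriptionName;
            this.oldestTimestampMillis = oldestTimestampMillis;
        }
    }

    private final AtomicReference<Result> last = new AtomicReference<>();

    /** Called by the periodic check once it has found the oldest unacknowledged message. */
    void update(String subscriptionName, long oldestTimestampMillis) {
        last.set(new Result(subscriptionName, oldestTimestampMillis));
    }

    /** Backlog age in seconds, computed from the cache only (no I/O); -1 if no check ran yet. */
    long backlogAgeSeconds(long nowMillis) {
        Result r = last.get();
        return r == null ? -1L : Math.max(0L, (nowMillis - r.oldestTimestampMillis) / 1000L);
    }

    /** Name of the subscription holding the oldest unacknowledged message, or null if unknown. */
    String oldestBacklogSubscription() {
        Result r = last.get();
        return r == null ? null : r.subscriptionName;
    }
}
```

Both the metric collection and Topics.getStats() could read from such a holder,
so they would report the same number the quota check used, at no extra I/O.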


Nice to have (not mandatory) additions:

I would add before

>
>    1. After readEntryComplete
>    <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L2780>,
>    cache its result:
>

When this configuration flag is set to true, the broker performs an I/O call
to read the oldest entry and get its write timestamp. Once we have that value,
we cache it, since we are going to use it to report the age.

I would add before:

> slowestReaderTimeBasedBacklogQuotaCheck
> <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L2817> is
> a totally in-memory method, we just need to cache the
>

When this configuration flag is set to false, the check uses an estimate of
the oldest entry's timestamp: the closing time of the ledger that contains
the message.
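
As a rough sketch of the two modes (the names BacklogAgeSource and
OldestEntryReader are hypothetical, not the actual PersistentTopic code), the
timestamp that ends up being cached could be chosen like this:

```java
/** Hypothetical sketch of how the oldest-entry timestamp is obtained in each mode. */
final class BacklogAgeSource {

    /** Stand-in for reading the oldest entry from BookKeeper; this is the call that costs I/O. */
    interface OldestEntryReader {
        long readPublishTimeMillis();
    }

    /**
     * preciseCheck == true: read the entry itself and use its write timestamp.
     * preciseCheck == false: use the closing time of the ledger containing the entry,
     * which is already in memory, so the reader is never invoked.
     */
    static long oldestTimestampMillis(boolean preciseCheck,
                                      OldestEntryReader reader,
                                      long ledgerCloseTimeMillis) {
        return preciseCheck ? reader.readPublishTimeMillis() : ledgerCloseTimeMillis;
    }

    /** Age in seconds of the oldest unacknowledged message, never negative. */
    static long ageSeconds(long oldestTimestampMillis, long nowMillis) {
        return Math.max(0L, (nowMillis - oldestTimestampMillis) / 1000L);
    }
}
```

Since a ledger is presumably closed only after its last entry was written, the
estimate should never report an age larger than the real one; it can only
under-report it.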

On Fri, Mar 10, 2023 at 8:29 AM 太上玄元道君 <da...@apache.org> wrote:

> I think yes, to avoid missing something, you can take a look if you have
> time.
>
> Thanks,
> Tao Jiuming
>
> On Thu, Mar 9, 2023 at 17:40 Asaf Mesika <as...@gmail.com> wrote:
>
> > Is the PIP updated with all comments?
> >
> > On Thu, Mar 9, 2023 at 8:59 AM 太上玄元道君 <da...@apache.org> wrote:
> >
> > > > backlogQuotaLimitSize
> > > > should be `backlogQuotaSizeBytes`
> > >
> > > > backlogQuotaLimitTime
> > > > should be `backlogQuotaTimeSeconds`
> > >
> > > > So you need to rename the metric.
> > > > "pulsar_storage_backlog_quota_count" -->
> > > > `pulsar_storage_backlog_eviction_count`
> > >
> > > > the topic's existing subscription.
> > > > "subscription" --> "subscription*s*"
> > >
> > > > Number of backlog quota happends.
> > > > Number of times backlog evictions happened due to exceeding backlog
> > quota
> > > > (either time or size).
> > >
> > > Accepted, if there is no more need to change, I'll start the vote next
> > > week.
> > >
> > > Thanks,
> > > Tao Jiuming
> > >
> > >
> > > On Tue, Mar 7, 2023 at 00:02 Asaf Mesika <as...@gmail.com> wrote:
> > >
> > > > >
> > > > > Pulsar has a feature called backlog quota (place link).
> > > >
> > > > You need to place a link :)
> > > >
> > > > Expose pulsar_storage_backlog_quota_count in the topic leve
> > > >
> > > > You already have "pulsar_storage_backlog_size", so why do you need
> this
> > > > metric for?
> > > >
> > > > backlogQuotaLimitSize
> > > >
> > > > should be `backlogQuotaSizeBytes`
> > > >
> > > > backlogQuotaLimitTime
> > > >
> > > > should be `backlogQuotaTimeSeconds`
> > > >
> > > > What about goal no.4? Expose oldest unacknowledged message
> subscription
> > > > name?
> > > >
> > > > IMO, metrics are like API - perhaps indicate the change there as well
> > > >
> > > > > Record the event when dropBacklogForSizeLimit
> > > > > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121>
> > > > > or dropBacklogForTimeLimit
> > > > > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194>
> > > > > is going to be invoked.
> > > >
> > > >
> > > > Oh, now I get it.
> > > > So you need to rename the metric.
> > > > "pulsar_storage_backlog_quota_count" -->
> > > > `pulsar_storage_backlog_eviction_count`
> > > >
> > > >
> > > > > the topic's existing subscription.
> > > >
> > > > "subscription" --> "subscription*s*"
> > > >
> > > > Number of backlog quota happends.
> > > >
> > > > Number of times backlog evictions happened due to exceeding backlog
> > quota
> > > > (either time or size).
> > > >
> > > >
> > > > >    1. Find the backlog subscriptions
> > > > >    After received the alarm, users could request
> > > > >    Topics#getStats(topicName, true/false, true, true)
> > > > >    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > >    to get the topic stats, and find which subscriptions are in backlog.
> > > > >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in the
> > > > >    subscription level, and we will expose backlogQuotaLimitSize and
> > > > >    backlogQuotaLimitTime in the topic level, so users could find which
> > > > >    subscriptions in backlog easily.
> > > > >
> > > >
> > > > I wrote how it should be done, IMO, in a previous email.
> > > >
> > > >
> > > > On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:
> > > >
> > > > > Hi Asaf,
> > > > > I've updated the PIP, PTAL
> > > > >
> > > > > Thanks,
> > > > > Tao Jiuming
> > > > >
> > > > > On Sun, Mar 5, 2023 at 21:00 Asaf Mesika <as...@gmail.com> wrote:
> > > > >
> > > > > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org>
> wrote:
> > > > > >
> > > > > > > > I  think you should fix this explanation:
> > > > > > >
> > > > > > > Thanks! I would like to copy the context you provide to the PIP
> > > > > > motivation,
> > > > > > > your description is more detailed, so developers don't have to
> go
> > > > > through
> > > > > > > the code.
> > > > > > >
> > > > > >
> > > > > > Sure
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > Today the quota is checked periodically, right? So that's how
> > the
> > > > > > > operator
> > > > > > > > knows the cost in terms of I/O is limited.
> > > > > > > > Now you are adding one additional I/O per collection, every 1
> > min
> > > > by
> > > > > > > > default. That's a lot perhaps. How long is the check interval
> > > > today?
> > > > > > >
> > > > > > > Actually, I don't want to introduce additional costs, I thought
> > we
> > > > > > > could cache its result, so that it won't introduce additional
> > > costs.
> > > > > > > It may be that I did not make it clear in the PIP and caused
> this
> > > > > > > misunderstanding, sorry.
> > > > > > >
> > > > > >
> > > > > > Ok, just to verify: You plan to modify the code that runs
> > > periodically
> > > > > the
> > > > > > backlog quota check, so the result will be cached there? This way
> > > when
> > > > > you
> > > > > > pull that information from that code every 1min to expose it as a
> > > > metric
> > > > > it
> > > > > > will have 0 I/O cost?
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > The user today can calculate quota used for size based limit,
> > > since
> > > > > > there
> > > > > > > > are two metrics that are exposed today on a topic level: "
> > > > > > > > pulsar_storage_backlog_quota_limit" and
> > > > > "pulsar_storage_backlog_size".
> > > > > > > You
> > > > > > > > can just divide the two to get a percentage.
> > > > > > > > For the time-based limit, the only metric exposed today is
> > quota
> > > > > > itself ,
> > > > > > > "
> > > > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > > > >
> > > > > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > > > > `pulsar_storage_backlog_quota_limit` and
> > > > > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for your
> > > > > reminder.
> > > > > > >
> > > > > > >
> > > > > > > So, in this condition, we already have the following
> topic-level
> > > > > metrics:
> > > > > > > `pulsar_storage_backlog_size`: The total backlog size of the
> > topics
> > > > of
> > > > > > this
> > > > > > > topic owned by this broker (in bytes).
> > > > > > > `pulsar_storage_backlog_quota_limit`: The total amount of the
> > data
> > > in
> > > > > > this
> > > > > > > topic that limits the backlog quota (bytes).
> > > > > > > `pulsar_storage_backlog_quota_limit_time`: The backlog quota
> > limit
> > > in
> > > > > > > time(seconds). (This metric does not exists in the doc, need to
> > > > > improve)
> > > > > > >
> > > > > > >
> > > > > > > We just need to add a new topic-level metric named
> > > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` that indicates
> > > > > > > the publish time of the earliest message in the backlog.
> > > > > > > Users could then get `pulsar_backlog_size_quota_used_percentage` by
> > > > > > > dividing `pulsar_storage_backlog_size` by
> > > > > > > `pulsar_storage_backlog_quota_limit`
> > > > > > > (`pulsar_storage_backlog_size` / `pulsar_storage_backlog_quota_limit`),
> > > > > > > and `pulsar_backlog_time_quota_used_percentage` by dividing
> > > > > > > `now - pulsar_storage_earliest_msg_publish_time_in_backlog` by
> > > > > > > `pulsar_storage_backlog_quota_limit_time`
> > > > > > > (`(now - pulsar_storage_earliest_msg_publish_time_in_backlog)` /
> > > > > > > `pulsar_storage_backlog_quota_limit_time`).
> > > > > > >
> > > > > >
> > > > > > I think there is a problem with the name
> > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the
> > > > topic-level:
> > > > > > * First, I prefer exposing the age rather than the publish time.
> > > > > > * Second, it's a bit hard to figure out the meaning of the
> earliest
> > > msg
> > > > > in
> > > > > > the backlog.
> > > > > >
> > > > > > Maybe `pulsar_storage_backlog_age_seconds`? In the explanation
> you
> > > can
> > > > > > write: "The age (time passed since it was published) of the
> > earliest
> > > > > > unacknowledged message based on the topic's
> > > > > > existing subscriptions" ?
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > The backlog quota time checker runs periodically, so we can
> cache
> > > its
> > > > > > > result, so it won't lead to much costs.
> > > > > > >
> > > > > > > Pulsar also exposes subscription-level `backlogSize` and
> > > > > > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > > > > if `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> > > > > > > We can also expose `backlogQuotaLimitSize` and `backlogQuotaLimitTime`
> > > > > > > of the topic to PulsarAdmin.
> > > > > > >
> > > > > >
> > > > > > What is the relationship you see between Pulsar exposing
> > > > > > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> > > > > > subscription, to exposing the backlog quota limits in pulsar
> admin?
> > > > > >
> > > > > > Limits can be exposed to Pulsar Admin, since it has 0 cost
> > associated
> > > > > with
> > > > > > it.
> > > > > > I think it's a good idea to do that.
> > > > > > The quota usage can also be exposed to pulsar admin, since we
> pull
> > > that
> > > > > > data from the backlog quota checker cache, so it has 0 cost as
> > well.
> > > > > >
> > > > > > As we said in previous email we can also expose
> > > > > > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > After users receive the backlog alert from their metrics alerting
> > > > > > > system, they get the topic name and can then request Topics#getStats
> > > > > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > > > > to find which subscriptions have a huge backlog.
> > > > > > >
> > > > > > >
> > > > > > I agree users can use PulsarAdmin getStats for topic , with
> > > > > > getEarliestTimeInBacklog=true to find the oldest subscription
> > > > responsible
> > > > > > for exceeding quota, but we can give them that information with 0
> > > cost
> > > > > > since we already have that subscription name cached (we spent the
> > I/O
> > > > to
> > > > > > find out who that subscription is, let's just cache it and
> provide
> > > it).
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Thanks,
> > > > > > > Tao Jiuming
> > > > > > >
> > > > > > > On Wed, Mar 1, 2023 at 23:42 Asaf Mesika <as...@gmail.com> wrote:
> > > > > > >
> > > > > > > > >
> > > > > > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > > > > > > >
> > > > > > > > > : backlogQuotaDefaultLimitBytes and
> > > > backlogQuotaDefaultLimitSecond.
> > > > > > > > > By default, backlog eviction is disabled, and also, there
> is
> > a
> > > > > field
> > > > > > > > named
> > > > > > > > > backlogQuotaMap in TopicPolicies
> > > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > > > > > > >
> > > > > > > > > /NamespaceSpacePolicies
> > > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > > > > > > >
> > > > > > > > assists
> > > > > > > > > in controlling Topic/Namespace level backlog quota.
> > > > > > > > >
> > > > > > > > > If topic backlog reaches the threshold of any item, backlog
> > > > > eviction
> > > > > > > will
> > > > > > > > > be triggered, Pulsar will move subscription's cursor to
> skip
> > > > > > > > unacknowledged
> > > > > > > > > messages.
> > > > > > > > >
> > > > > > > > > Before backlog eviction happens, we don't have a metric to
> > > > monitor
> > > > > > how
> > > > > > > > > long that it can reaches the threshold.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I  think you should fix this explanation:
> > > > > > > >
> > > > > > > > In Pulsar, a subscription maintains a state of message
> > > > acknowledged.
> > > > > A
> > > > > > > > subscription backlog is the set of messages which are
> > > > unacknowledged.
> > > > > > > > A subscription backlog size is the sum of size of
> > unacknowledged
> > > > > > messages
> > > > > > > > (in bytes).
> > > > > > > > A topic can have many subscriptions.
> > > > > > > > A topic backlog is defined as the backlog size of the
> > > subscription
> > > > > > which
> > > > > > > > has the oldest unacknowledged message. Since acknowledged
> > > messages
> > > > > can
> > > > > > be
> > > > > > > > interleaved with unacknowledged messages, calculating the
> exact
> > > > size
> > > > > of
> > > > > > > > that subscription can be expensive as it requires I/O
> > operations
> > > to
> > > > > > read
> > > > > > > > from the messages from the ledgers.
> > > > > > > > For that reason, the topic backlog is actually defined to be
> > the
> > > > > > > estimated
> > > > > > > > backlog size of that subscription. It does so by summarizing
> > the
> > > > size
> > > > > > of
> > > > > > > > all the ledgers, starting from the current active one, up to
> > the
> > > > > ledger
> > > > > > > > which contains the oldest unacknowledged message (There is
> > > > actually a
> > > > > > > > faster way to calculate it, but this is the definition of the
> > > > > > > estimation).
> > > > > > > >
> > > > > > > > A topic backlog age is the age of the oldest unacknowledged
> > > message
> > > > > (in
> > > > > > > any
> > > > > > > > subscription). If that message was written 30 minutes ago,
> its
> > > age
> > > > is
> > > > > > 30
> > > > > > > > minutes.
> > > > > > > >
> > > > > > > > Pulsar has a feature called backlog quota (place link). It
> > allows
> > > > the
> > > > > > > user
> > > > > > > > to define a quota - in effect, a limit - which limits the
> topic
> > > > > > backlog.
> > > > > > > > There are two types of quotas:
> > > > > > > > * Size based: The limit is for the topic backlog size (as we
> > > > defined
> > > > > > > > above).
> > > > > > > > * Time based: The limit is for the topic's backlog age (as we
> > > > defined
> > > > > > > > above).
> > > > > > > >
> > > > > > > > Once a topic backlog exceeds either one of those limits, an
> > > action
> > > > is
> > > > > > > taken
> > > > > > > > upon messages written to the topic:
> > > > > > > > * The producer write is placed on hold for a certain amount
> of
> > > time
> > > > > > > before
> > > > > > > > failing.
> > > > > > > > * The producer write is failed
> > > > > > > > * The subscriptions oldest unacknowledged messages will be
> > > > > acknowledged
> > > > > > > in
> > > > > > > > order until both the topic backlog size or age will fall
> inside
> > > the
> > > > > > limit
> > > > > > > > (quota). The process is called backlog eviction (happens
> every
> > > > > > interval)
> > > > > > > >
> > > > > > > > The quotas can be defined as a default value for any topic,
> by
> > > > using
> > > > > > the
> > > > > > > > following broker configuration keys:
> > > backlogQuotaDefaultLimitBytes
> > > > ,
> > > > > > > > backlogQuotaDefaultLimitSecond. It can also be specified
> > directly
> > > > for
> > > > > > all
> > > > > > > > topics in a given namespace using the namespace policy, or a
> > > > specific
> > > > > > > topic
> > > > > > > > using a topic policy.
> > > > > > > >
> > > > > > > > The user today can calculate quota used for size based limit,
> > > since
> > > > > > there
> > > > > > > > are two metrics that are exposed today on a topic level: "
> > > > > > > > pulsar_storage_backlog_quota_limit" and
> > > > > "pulsar_storage_backlog_size".
> > > > > > > You
> > > > > > > > can just divide the two to get a percentage.
> > > > > > > > For the time-based limit, the only metric exposed today is
> > quota
> > > > > itself
> > > > > > > , "
> > > > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > > > > >
> > > > > > > > ------------
> > > > > > > >
> > > > > > > > I would create two metrics:
> > > > > > > >
> > > > > > > > `pulsar_backlog_size_quota_used_percentage`
> > > > > > > > `pulsar_backlog_time_quota_used_percentage`
> > > > > > > >
> > > > > > > > You would like to know what triggered the alert, hence two.
> > > > > > > > It's not the quota percentage, it's the quota used
> percentage.
> > > > > > > >
> > > > > > > > ----------
> > > > > > > >
> > > > > > > > It checks if the backlog size exceeds the threshold(
> > > > > > > > > backlogQuotaDefaultLimitBytes), and it gets the current
> > backlog
> > > > > size
> > > > > > by
> > > > > > > > > calculating LedgerInfo
> > > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > > > >,
> > > > > > > > > it will not lead to I/O.
> > > > > > > >
> > > > > > > > This is not correct.
> > > > > > > > It checks against the topic / namespace policy, and if it
> > doesn't
> > > > > > exist,
> > > > > > > it
> > > > > > > > falls back on the default configuration key mentioned above.
> > > > > > > >
> > > > > > > > It checks if the backlog time exceeds the threshold(
> > > > > > > > > backlogQuotaDefaultLimitSecond). If
> > > > > preciseTimeBasedBacklogQuotaCheck
> > > > > > > is
> > > > > > > > > set to be true, it will read an entry from Bookkeeper, but
> > the
> > > > > > default
> > > > > > > > > value is false, which means it gets the backlog time by
> > > > calculating
> > > > > > > > > LedgerInfo
> > > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > > > >.
> > > > > > > > > So in general, we don't need to worry about it will lead to
> > > I/O.
> > > > > > > >
> > > > > > > >
> > > > > > > > I'm afraid of that.
> > > > > > > > Today the quota is checked periodically, right? So that's how
> > the
> > > > > > > operator
> > > > > > > > knows the cost in terms of I/O is limited.
> > > > > > > >  Now you are adding one additional I/O per collection, every
> 1
> > > min
> > > > by
> > > > > > > > default. That's a lot perhaps. How long is the check interval
> > > > today?
> > > > > > > >
> > > > > > > > Perhaps in the backlog quota check, you can persist the check
> > > > result,
> > > > > > and
> > > > > > > > use it? Persist the age that is.
> > > > > > > >
> > > > > > > >
> > > > > > > > ------
> > > > > > > >
> > > > > > > > Regarding "slowest_subscription"
> > > > > > > > I think the cost is too high, because the subscriptions will
> > > > > > > > keep alternating, which can generate many unique time series.
> > > > > > > > Since Prometheus (or any other TSDB) flushes only every 2
> > > > > > > > hours, it will cost you too much.
> > > > > > > >
> > > > > > > > I suggest exposing the name via the topic stats. This way
> they
> > > can
> > > > > > issue
> > > > > > > a
> > > > > > > > REST call to grab that subscription name only when the alert
> > > fires.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Asaf
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi Asaf,
> > > > > > > > > I've updated the PIP, PTAL
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Tao Jiuming
> > > > > > > > >
> > > > > > > > > Asaf Mesika <as...@gmail.com> 于2023年2月26日周日 23:03写道:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > > backlogQuotaDefaultLimitBytes and
> > > > > backlogQuotaDefaultLimitSecond,
> > > > > > > if
> > > > > > > > > > > topic backlog reaches the threshold of any item,
> backlog
> > > > > eviction
> > > > > > > > will
> > > > > > > > > be
> > > > > > > > > > > triggered.
> > > > > > > > > >
> > > > > > > > > > This seems like default values, not the actual values.
> Can
> > > you
> > > > > > please
> > > > > > > > > > provide an explanation in the PIP and link to read more:
> > > > > > > > > > 1. Where do you define the backlog quota exactly? What is
> > the
> > > > > > > > granularity
> > > > > > > > > > (subscription?)
> > > > > > > > > > 2.  Is the backlog quota on by default? If so, what are
> the
> > > > > default
> > > > > > > > > values?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > *Notes*
> > > > > > > > > > 1. When the backlog quota limit is defined in Bytes, and
> > you
> > > > wish
> > > > > > to
> > > > > > > > know
> > > > > > > > > > how close a subscription is to its bytes limit, you need
> to
> > > > > > calculate
> > > > > > > > the
> > > > > > > > > > backlog size in bytes. From my understanding, there is an
> > > > > accurate
> > > > > > > > > > calculation (which is costly in terms of I/O) and there
> is
> > an
> > > > > > > estimate
> > > > > > > > of
> > > > > > > > > > it. I presume you would want to use the estimated one, is
> > > that
> > > > > > > correct?
> > > > > > > > > > The backlog quota itself, uses the accurate or the
> > estimated
> > > > when
> > > > > > it
> > > > > > > > > starts
> > > > > > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > > > > > >
> > > > > > > > > > 2. For the backlog limit specifying in time units, there
> is
> > > no
> > > > > > > > estimate,
> > > > > > > > > as
> > > > > > > > > > it must be calculated all the time (earliest
> unacknowledged
> > > > > message
> > > > > > > > > > distance from now). How do you plan to calculate the
> > current
> > > > age
> > > > > of
> > > > > > > the
> > > > > > > > > > earliest message without bearing that I/O cost on each
> > metric
> > > > > > > > > calculation?
> > > > > > > > > >
> > > > > > > > > > 3. In the Goal section, you specify that your goal is to
> > add
> > > a
> > > > > > > > > "proximity"
> > > > > > > > > > metric.
> > > > > > > > > > a) You must define that - what is proximity metric
> exactly?
> > > > What
> > > > > > are
> > > > > > > > its
> > > > > > > > > > units? How are you planning to calculate it?
> > > > > > > > > > b) Proximity is not a good term IMO. I personally have
> > never
> > > > seen
> > > > > > > this
> > > > > > > > > term
> > > > > > > > > > used in software systems, unless it's in the
> aviation/space
> > > > > > industry.
> > > > > > > > > Once
> > > > > > > > > > you explain (a) I hope I can help provide alternative
> > names.
> > > > > > > > > >
> > > > > > > > > > 4. Maybe we should provide the used quota percentage for
> > > > > > > > > > each limit separately, instead of one for both, since it's
> > > > > > > > > > easier to act upon the alert when you know which one
> > > > > > > > > > triggered it.
> > > > > > > > > >
> > > > > > > > > > 5. I didn't understand the "slowest_subscription" label
> > used
> > > > when
> > > > > > > > > > describing the metric label. Can you please provide an
> > > > > explanation?
> > > > > > > > > >
> > > > > > > > > > 6. I suggest writing a "High Level Design" section, and
> add
> > > > > > > everything
> > > > > > > > > you
> > > > > > > > > > need to know for this proposal, so I don't need to read
> the
> > > > > > > > > > implementation details below (code).
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Asaf
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <
> daojun@apache.org>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > I've started a PIP to discuss: PIP-248 Add backlog
> > eviction
> > > > > > metric
> > > > > > > > > > >
> > > > > > > > > > > ### Motivation:
> > > > > > > > > > >
> > > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > > `backlogQuotaDefaultLimitBytes` and
> > > > > > > `backlogQuotaDefaultLimitSecond`,
> > > > > > > > > if
> > > > > > > > > > > topic backlog reaches the threshold of any item,
> backlog
> > > > > eviction
> > > > > > > > will
> > > > > > > > > be
> > > > > > > > > > > triggered.
> > > > > > > > > > >
> > > > > > > > > > > Before backlog eviction happens, we don't have a metric
> > to
> > > > > > monitor
> > > > > > > > how
> > > > > > > > > > long
> > > > > > > > > > > that it can reaches the threshold.
> > > > > > > > > > >
> > > > > > > > > > > We can provide a progress bar metric to tell users some
> > > > topics
> > > > > is
> > > > > > > > about
> > > > > > > > > > to
> > > > > > > > > > > trigger backlog eviction. And users can subscribe the
> > alert
> > > > to
> > > > > > > > schedule
> > > > > > > > > > > consumers.
> > > > > > > > > > >
> > > > > > > > > > > For more details, please read the PIP at
> > > > > > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Tao Jiuming
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
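
As a rough illustration of the idea discussed in this thread (the periodic
backlog quota check stores its result, and metric collection and topic stats
only read the stored value, so they add no I/O), here is a minimal Java
sketch. All class and method names below are hypothetical and are not part of
the Pulsar code base. Treating a non-positive limit as "no quota set" is also
an assumption made for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical immutable result of one periodic backlog quota check. */
final class BacklogQuotaCheckResult {
    final long backlogSizeBytes;         // estimated topic backlog size
    final long oldestMessageAgeSeconds;  // topic backlog age
    final String oldestSubscriptionName; // subscription holding the oldest unacknowledged message

    BacklogQuotaCheckResult(long backlogSizeBytes, long oldestMessageAgeSeconds,
                            String oldestSubscriptionName) {
        this.backlogSizeBytes = backlogSizeBytes;
        this.oldestMessageAgeSeconds = oldestMessageAgeSeconds;
        this.oldestSubscriptionName = oldestSubscriptionName;
    }
}

/** Hypothetical per-topic cache: written by the periodic check, read by metrics and stats. */
final class BacklogQuotaUsageCache {
    private final AtomicReference<BacklogQuotaCheckResult> lastResult = new AtomicReference<>();

    /** Called by the periodic quota check, which has already paid any I/O cost. */
    void record(BacklogQuotaCheckResult result) {
        lastResult.set(result);
    }

    /** Size quota used, in percent. Reads only the cached value. */
    double sizeQuotaUsedPercentage(long limitBytes) {
        BacklogQuotaCheckResult r = lastResult.get();
        return r == null ? 0.0 : percentage(r.backlogSizeBytes, limitBytes);
    }

    /** Time quota used, in percent. Reads only the cached value. */
    double timeQuotaUsedPercentage(long limitSeconds) {
        BacklogQuotaCheckResult r = lastResult.get();
        return r == null ? 0.0 : percentage(r.oldestMessageAgeSeconds, limitSeconds);
    }

    /** Subscription name for topic stats, or null if no check has run yet. */
    String oldestSubscriptionName() {
        BacklogQuotaCheckResult r = lastResult.get();
        return r == null ? null : r.oldestSubscriptionName;
    }

    private static double percentage(long used, long limit) {
        // Assumption: a non-positive limit means the quota is not set.
        if (limit <= 0) {
            return 0.0;
        }
        return 100.0 * used / limit;
    }
}
```

With this shape, the subscription name stays out of the metric labels (avoiding
the time series churn mentioned above) and is only served on demand.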

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by 太上玄元道君 <da...@apache.org>.
I think so. To make sure nothing was missed, please take a look if you have
time.

Thanks,
Tao Jiuming

Asaf Mesika <as...@gmail.com> 于2023年3月9日周四 17:40写道:

> Is the PIP updated with all comments?
>
> On Thu, Mar 9, 2023 at 8:59 AM 太上玄元道君 <da...@apache.org> wrote:
>
> > > backlogQuotaLimitSize
> > > should be `backlogQuotaSizeBytes`
> >
> > > backlogQuotaLimitTime
> > > should be `backlogQuotaTimeSeconds`
> >
> > > So you need to rename the metric.
> > > "pulsar_storage_backlog_quota_count" -->
> > > `pulsar_storage_backlog_eviction_count`
> >
> > > the topic's existing subscription.
> > > "subscription" --> "subscription*s*"
> >
> > > Number of backlog quota happends.
> > > Number of times backlog evictions happened due to exceeding backlog
> quota
> > > (either time or size).
> >
> > Accepted. If no further changes are needed, I'll start the vote next
> > week.
> >
> > Thanks,
> > Tao Jiuming
> >
> >
> > Asaf Mesika <as...@gmail.com> 于2023年3月7日周二 00:02写道:
> >
> > > >
> > > > Pulsar has a feature called backlog quota (place link).
> > >
> > > You need to place a link :)
> > >
> > > Expose pulsar_storage_backlog_quota_count in the topic level
> > >
> > > You already have "pulsar_storage_backlog_size", so why do you need this
> > > metric?
> > >
> > > backlogQuotaLimitSize
> > >
> > > should be `backlogQuotaSizeBytes`
> > >
> > > backlogQuotaLimitTime
> > >
> > > should be `backlogQuotaTimeSeconds`
> > >
> > > What about goal no.4? Expose oldest unacknowledged message subscription
> > > name?
> > >
> > > IMO, metrics are like API - perhaps indicate the change there as well
> > >
> > > Record the event when dropBacklogForSizeLimit
> > > > <
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121
> > > >
> > > >  or dropBacklogForTimeLimit
> > > > <
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194
> > >
> > > is
> > > > going to invoked.
> > >
> > >
> > > Oh, now I get it.
> > > So you need to rename the metric.
> > > "pulsar_storage_backlog_quota_count" -->
> > > `pulsar_storage_backlog_eviction_count`
> > >
> > >
> > > > the topic's existing subscription.
> > >
> > > "subscription" --> "subscription*s*"
> > >
> > > Number of backlog quota happends.
> > >
> > > Number of times backlog evictions happened due to exceeding backlog
> quota
> > > (either time or size).
> > >
> > >
> > > >    1. Find the backlog subscriptions
> > > >    After received the alarm, users could request
> > > Topics#getStats(topicName,
> > > >    true/false, true, true)
> > > >    <
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > >
> > > to
> > > >    get the topic stats, and find which subscriptions are in backlog.
> > > >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in
> > the
> > > >    subscription level, and we will expose backlogQuotaLimitSize and
> > > >    backlogQuotaLimitTime in the topic level, so users could find
> which
> > > >    subscriptions in backlog easily.
> > > >
> > > > I wrote how it should be done IMO in a previous email.
> > >
> > >
> > > On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:
> > >
> > > > Hi Asaf,
> > > > I've updated the PIP, PTAL
> > > >
> > > > Thanks,
> > > > Tao Jiuming
> > > >
> > > > Asaf Mesika <as...@gmail.com> 于2023年3月5日周日 21:00写道:
> > > >
> > > > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:
> > > > >
> > > > > > > I  think you should fix this explanation:
> > > > > >
> > > > > > Thanks! I would like to copy the explanation you provided into the
> > > > > > PIP motivation. Your description is more detailed, so developers
> > > > > > won't have to go through the code.
> > > > > >
> > > > >
> > > > > Sure
> > > > >
> > > > >
> > > > > >
> > > > > > > Today the quota is checked periodically, right? So that's how
> the
> > > > > > operator
> > > > > > > knows the cost in terms of I/O is limited.
> > > > > > > Now you are adding one additional I/O per collection, every 1
> min
> > > by
> > > > > > > default. That's a lot perhaps. How long is the check interval
> > > today?
> > > > > >
> > > > > > Actually, I don't want to introduce additional costs. My idea was
> > > > > > to cache the check result, so that exposing the metric adds no
> > > > > > extra I/O. I may not have made that clear in the PIP, which
> > > > > > caused this misunderstanding. Sorry.
> > > > > >
> > > > >
> > > > > Ok, just to verify: you plan to modify the code that periodically
> > > > > runs the backlog quota check, so that the result is cached there?
> > > > > This way, when you pull that information every 1 min to expose it as
> > > > > a metric, it will have zero I/O cost?
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > > The user today can calculate quota used for size based limit,
> > since
> > > > > there
> > > > > > > are two metrics that are exposed today on a topic level: "
> > > > > > > pulsar_storage_backlog_quota_limit" and
> > > > "pulsar_storage_backlog_size".
> > > > > > You
> > > > > > > can just divide the two to get a percentage.
> > > > > > > For the time-based limit, the only metric exposed today is
> quota
> > > > > itself ,
> > > > > > "
> > > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > > >
> > > > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > > > `pulsar_storage_backlog_quota_limit` and
> > > > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for your
> > > > reminder.
> > > > > >
> > > > > >
> > > > > > So, in this case, we already have the following topic-level
> > > > > > metrics:
> > > > > > `pulsar_storage_backlog_size`: The total backlog size of this
> > > > > > topic owned by this broker (in bytes).
> > > > > > `pulsar_storage_backlog_quota_limit`: The size-based backlog quota
> > > > > > limit of this topic (in bytes).
> > > > > > `pulsar_storage_backlog_quota_limit_time`: The time-based backlog
> > > > > > quota limit (in seconds). (This metric is missing from the docs,
> > > > > > which need to be updated.)
> > > > > >
> > > > > > We just need to add a new topic-level metric named
> > > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` that
> > > > > > indicates the publish time of the earliest message in the backlog.
> > > > > > Users could then get `pulsar_backlog_size_quota_used_percentage`
> > > > > > by dividing `pulsar_storage_backlog_size` by
> > > > > > `pulsar_storage_backlog_quota_limit`
> > > > > > (`pulsar_storage_backlog_size` /
> > > > > > `pulsar_storage_backlog_quota_limit`),
> > > > > > and `pulsar_backlog_time_quota_used_percentage` by dividing
> > > > > > `now - pulsar_storage_earliest_msg_publish_time_in_backlog` by
> > > > > > `pulsar_storage_backlog_quota_limit_time`
> > > > > > (`(now - pulsar_storage_earliest_msg_publish_time_in_backlog)` /
> > > > > > `pulsar_storage_backlog_quota_limit_time`).
> > > > > >
> > > > >
> > > > > I think there is a problem with the name
> > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the
> > > topic-level:
> > > > > * First, I prefer exposing the age rather than the publish time.
> > > > > * Second, it's a bit hard to figure out the meaning of the earliest
> > msg
> > > > in
> > > > > the backlog.
> > > > >
> > > > > Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you
> > can
> > > > > write: "The age (time passed since it was published) of the
> earliest
> > > > > unacknowledged message based on the topic's
> > > > > existing subscriptions" ?
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > The backlog quota time checker runs periodically, so we can cache
> > > > > > its result, and exposing it won't add much cost.
> > > > > >
> > > > > > Pulsar also exposed subscription-level  `backlogSize` and
> > > > > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > > >
> > > > > > if
> > > > > > `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are
> true.
> > > > > > We can also expose `backlogQuotaLimitSize` and
> > > > > > `backlogQuotaLimitTime` of the topic to PulsarAdmin.
> > > > > >
> > > > >
> > > > > What is the relationship you see between Pulsar exposing
> > > > > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> > > > > subscription, to exposing the backlog quota limits in pulsar admin?
> > > > >
> > > > > Limits can be exposed to Pulsar Admin, since it has 0 cost
> associated
> > > > with
> > > > > it.
> > > > > I think it's a good idea to do that.
> > > > > The quota usage can also be exposed to pulsar admin, since we pull
> > that
> > > > > data from the backlog quota checker cache, so it has 0 cost as
> well.
> > > > >
> > > > > As we said in previous email we can also expose
> > > > > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
> > > > >
> > > > >
> > > > > >
> > > > > > After users receive the backlog alert from metrics alerting
> > systems,
> > > > they
> > > > > > can get the topic name, then, they can request Topics#getStats
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > > >
> > > > > > to
> > > > > > get which subscriptions are in the huge backlog.
> > > > > >
> > > > > >
> > > > > I agree users can use PulsarAdmin getStats for topic , with
> > > > > getEarliestTimeInBacklog=true to find the oldest subscription
> > > responsible
> > > > > for exceeding quota, but we can give them that information with 0
> > cost
> > > > > since we already have that subscription name cached (we spent the
> I/O
> > > to
> > > > > find out who that subscription is, let's just cache it and provide
> > it).
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > Thanks,
> > > > > > Tao Jiuming
> > > > > >
> > > > > > Asaf Mesika <as...@gmail.com> 于2023年3月1日周三 23:42写道:
> > > > > >
> > > > > > > >
> > > > > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > > > > > >
> > > > > > > > : backlogQuotaDefaultLimitBytes and
> > > backlogQuotaDefaultLimitSecond.
> > > > > > > > By default, backlog eviction is disabled, and also, there is
> a
> > > > field
> > > > > > > named
> > > > > > > > backlogQuotaMap in TopicPolicies
> > > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > > > > > >
> > > > > > > > /NamespaceSpacePolicies
> > > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > > > > > >
> > > > > > > assists
> > > > > > > > in controlling Topic/Namespace level backlog quota.
> > > > > > > >
> > > > > > > > If topic backlog reaches the threshold of any item, backlog
> > > > eviction
> > > > > > will
> > > > > > > > be triggered, Pulsar will move subscription's cursor to skip
> > > > > > > unacknowledged
> > > > > > > > messages.
> > > > > > > >
> > > > > > > > Before backlog eviction happens, we don't have a metric to
> > > monitor
> > > > > how
> > > > > > > > long that it can reaches the threshold.
> > > > > > > >
> > > > > > >
> > > > > > > I  think you should fix this explanation:
> > > > > > >
> > > > > > > In Pulsar, a subscription maintains a state of message
> > > acknowledged.
> > > > A
> > > > > > > subscription backlog is the set of messages which are
> > > unacknowledged.
> > > > > > > A subscription backlog size is the sum of size of
> unacknowledged
> > > > > messages
> > > > > > > (in bytes).
> > > > > > > A topic can have many subscriptions.
> > > > > > > A topic backlog is defined as the backlog size of the
> > subscription
> > > > > which
> > > > > > > has the oldest unacknowledged message. Since acknowledged
> > messages
> > > > can
> > > > > be
> > > > > > > interleaved with unacknowledged messages, calculating the exact
> > > size
> > > > of
> > > > > > > that subscription can be expensive as it requires I/O
> operations
> > to
> > > > > read
> > > > > > > from the messages from the ledgers.
> > > > > > > For that reason, the topic backlog is actually defined to be
> the
> > > > > > estimated
> > > > > > > backlog size of that subscription. It does so by summarizing
> the
> > > size
> > > > > of
> > > > > > > all the ledgers, starting from the current active one, up to
> the
> > > > ledger
> > > > > > > which contains the oldest unacknowledged message (There is
> > > actually a
> > > > > > > faster way to calculate it, but this is the definition of the
> > > > > > estimation).
> > > > > > >
> > > > > > > A topic backlog age is the age of the oldest unacknowledged
> > message
> > > > (in
> > > > > > any
> > > > > > > subscription). If that message was written 30 minutes ago, its
> > age
> > > is
> > > > > 30
> > > > > > > minutes.
> > > > > > >
> > > > > > > Pulsar has a feature called backlog quota (place link). It
> allows
> > > the
> > > > > > user
> > > > > > > to define a quota - in effect, a limit - which limits the topic
> > > > > backlog.
> > > > > > > There are two types of quotas:
> > > > > > > * Size based: The limit is for the topic backlog size (as we
> > > defined
> > > > > > > above).
> > > > > > > * Time based: The limit is for the topic's backlog age (as we
> > > defined
> > > > > > > above).
> > > > > > >
> > > > > > > Once a topic backlog exceeds either one of those limits, one of
> > > > > > > the following actions is taken:
> > > > > > > * The producer write is placed on hold for a certain amount of
> > > > > > > time before failing.
> > > > > > > * The producer write fails.
> > > > > > > * The subscriptions' oldest unacknowledged messages are
> > > > > > > acknowledged in order until both the topic backlog size and age
> > > > > > > fall within the limit (quota). This process is called backlog
> > > > > > > eviction (it happens at every check interval).
> > > > > > >
> > > > > > > The quotas can be defined as a default value for any topic, by
> > > using
> > > > > the
> > > > > > > following broker configuration keys:
> > backlogQuotaDefaultLimitBytes
> > > ,
> > > > > > > backlogQuotaDefaultLimitSecond. It can also be specified
> directly
> > > for
> > > > > all
> > > > > > > topics in a given namespace using the namespace policy, or a
> > > specific
> > > > > > topic
> > > > > > > using a topic policy.
> > > > > > >
> > > > > > > The user today can calculate quota used for size based limit,
> > since
> > > > > there
> > > > > > > are two metrics that are exposed today on a topic level: "
> > > > > > > pulsar_storage_backlog_quota_limit" and
> > > > "pulsar_storage_backlog_size".
> > > > > > You
> > > > > > > can just divide the two to get a percentage.
> > > > > > > For the time-based limit, the only metric exposed today is
> quota
> > > > itself
> > > > > > , "
> > > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > > > >
> > > > > > > ------------
> > > > > > >
> > > > > > > I would create two metrics:
> > > > > > >
> > > > > > > `pulsar_backlog_size_quota_used_percentage`
> > > > > > > `pulsar_backlog_time_quota_used_percentage`
> > > > > > >
> > > > > > > You would like to know what triggered the alert, hence two.
> > > > > > > It's not the quota percentage, it's the quota used percentage.
> > > > > > >
> > > > > > > ----------
> > > > > > >
> > > > > > > It checks if the backlog size exceeds the threshold(
> > > > > > > > backlogQuotaDefaultLimitBytes), and it gets the current
> backlog
> > > > size
> > > > > by
> > > > > > > > calculating LedgerInfo
> > > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > > >,
> > > > > > > > it will not lead to I/O.
> > > > > > >
> > > > > > > This is not correct.
> > > > > > > It checks against the topic / namespace policy, and if it
> doesn't
> > > > > exist,
> > > > > > it
> > > > > > > falls back on the default configuration key mentioned above.
> > > > > > >
> > > > > > > It checks if the backlog time exceeds the threshold(
> > > > > > > > backlogQuotaDefaultLimitSecond). If
> > > > preciseTimeBasedBacklogQuotaCheck
> > > > > > is
> > > > > > > > set to be true, it will read an entry from Bookkeeper, but
> the
> > > > > default
> > > > > > > > value is false, which means it gets the backlog time by
> > > calculating
> > > > > > > > LedgerInfo
> > > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > > >.
> > > > > > > > So in general, we don't need to worry about it will lead to
> > I/O.
> > > > > > >
> > > > > > >
> > > > > > > I'm afraid of that.
> > > > > > > Today the quota is checked periodically, right? So that's how
> the
> > > > > > operator
> > > > > > > knows the cost in terms of I/O is limited.
> > > > > > >  Now you are adding one additional I/O per collection, every 1
> > min
> > > by
> > > > > > > default. That's a lot perhaps. How long is the check interval
> > > today?
> > > > > > >
> > > > > > > Perhaps in the backlog quota check, you can persist the check
> > > result,
> > > > > and
> > > > > > > use it? Persist the age that is.
> > > > > > >
> > > > > > >
> > > > > > > ------
> > > > > > >
> > > > > > > Regarding "slowest_subscription"
> > > > > > > I think the cost is too high, because the subscriptions will
> > > > > > > keep alternating, which can generate many unique time series.
> > > > > > > Since Prometheus (or any other TSDB) flushes only every 2 hours,
> > > > > > > it will cost you too much.
> > > > > > >
> > > > > > > I suggest exposing the name via the topic stats. This way they
> > can
> > > > > issue
> > > > > > a
> > > > > > > REST call to grab that subscription name only when the alert
> > fires.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Asaf
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org>
> > wrote:
> > > > > > >
> > > > > > > > Hi Asaf,
> > > > > > > > I've updated the PIP, PTAL
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Tao Jiuming
> > > > > > > >
> > > > > > > > Asaf Mesika <as...@gmail.com> 于2023年2月26日周日 23:03写道:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > backlogQuotaDefaultLimitBytes and
> > > > backlogQuotaDefaultLimitSecond,
> > > > > > if
> > > > > > > > > > topic backlog reaches the threshold of any item, backlog
> > > > eviction
> > > > > > > will
> > > > > > > > be
> > > > > > > > > > triggered.
> > > > > > > > >
> > > > > > > > > This seems like default values, not the actual values. Can
> > you
> > > > > please
> > > > > > > > > provide an explanation in the PIP and link to read more:
> > > > > > > > > 1. Where do you define the backlog quota exactly? What is
> the
> > > > > > > granularity
> > > > > > > > > (subscription?)
> > > > > > > > > 2.  Is the backlog quota on by default? If so, what are the
> > > > default
> > > > > > > > values?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > *Notes*
> > > > > > > > > 1. When the backlog quota limit is defined in Bytes, and
> you
> > > wish
> > > > > to
> > > > > > > know
> > > > > > > > > how close a subscription is to its bytes limit, you need to
> > > > > calculate
> > > > > > > the
> > > > > > > > > backlog size in bytes. From my understanding, there is an
> > > > accurate
> > > > > > > > > calculation (which is costly in terms of I/O) and there is
> an
> > > > > > estimate
> > > > > > > of
> > > > > > > > > it. I presume you would want to use the estimated one, is
> > that
> > > > > > correct?
> > > > > > > > > The backlog quota itself, uses the accurate or the
> estimated
> > > when
> > > > > it
> > > > > > > > starts
> > > > > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > > > > >
> > > > > > > > > 2. For the backlog limit specifying in time units, there is
> > no
> > > > > > > estimate,
> > > > > > > > as
> > > > > > > > > it must be calculated all the time (earliest unacknowledged
> > > > message
> > > > > > > > > distance from now). How do you plan to calculate the
> current
> > > age
> > > > of
> > > > > > the
> > > > > > > > > earliest message without bearing that I/O cost on each
> metric
> > > > > > > > calculation?
> > > > > > > > >
> > > > > > > > > 3. In the Goal section, you specify that your goal is to
> add
> > a
> > > > > > > > "proximity"
> > > > > > > > > metric.
> > > > > > > > > a) You must define that - what is proximity metric exactly?
> > > What
> > > > > are
> > > > > > > its
> > > > > > > > > units? How are you planning to calculate it?
> > > > > > > > > b) Proximity is not a good term IMO. I personally have
> never
> > > seen
> > > > > > this
> > > > > > > > term
> > > > > > > > > used in software systems, unless it's in the aviation/space
> > > > > industry.
> > > > > > > > Once
> > > > > > > > > you explain (a) I hope I can help provide alternative
> names.
> > > > > > > > >
> > > > > > > > > 4. Maybe we should provide the used quota percentage for
> > > > > > > > > each limit separately, instead of one for both, since it's
> > > > > > > > > easier to act upon the alert when you know which one
> > > > > > > > > triggered it.
> > > > > > > > >
> > > > > > > > > 5. I didn't understand the "slowest_subscription" label
> used
> > > when
> > > > > > > > > describing the metric label. Can you please provide an
> > > > explanation?
> > > > > > > > >
> > > > > > > > > 6. I suggest writing a "High Level Design" section, and add
> > > > > > everything
> > > > > > > > you
> > > > > > > > > need to know for this proposal, so I don't need to read the
> > > > > > > > > implementation details below (code).
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Asaf
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I've started a PIP to discuss: PIP-248 Add backlog
> eviction
> > > > > metric
> > > > > > > > > >
> > > > > > > > > > ### Motivation:
> > > > > > > > > >
> > > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > > `backlogQuotaDefaultLimitBytes` and
> > > > > > `backlogQuotaDefaultLimitSecond`,
> > > > > > > > if
> > > > > > > > > > topic backlog reaches the threshold of any item, backlog
> > > > eviction
> > > > > > > will
> > > > > > > > be
> > > > > > > > > > triggered.
> > > > > > > > > >
> > > > > > > > > > Before backlog eviction happens, we don't have a metric
> to
> > > > > monitor
> > > > > > > how
> > > > > > > > > long
> > > > > > > > > > that it can reaches the threshold.
> > > > > > > > > >
> > > > > > > > > > We can provide a progress bar metric to tell users some
> > > topics
> > > > is
> > > > > > > about
> > > > > > > > > to
> > > > > > > > > > trigger backlog eviction. And users can subscribe the
> alert
> > > to
> > > > > > > schedule
> > > > > > > > > > consumers.
> > > > > > > > > >
> > > > > > > > > > For more details, please read the PIP at
> > > > > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Tao Jiuming
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by Asaf Mesika <as...@gmail.com>.
Is the PIP updated with all comments?

On Thu, Mar 9, 2023 at 8:59 AM 太上玄元道君 <da...@apache.org> wrote:

> > backlogQuotaLimitSize
> > should be `backlogQuotaSizeBytes`
>
> > backlogQuotaLimitTime
> > should be `backlogQuotaTimeSeconds`
>
> > So you need to rename the metric.
> > "pulsar_storage_backlog_quota_count" -->
> > `pulsar_storage_backlog_eviction_count`
>
> > the topic's existing subscription.
> > "subscription" --> "subscription*s*"
>
> > Number of backlog quota happends.
> > Number of times backlog evictions happened due to exceeding backlog quota
> > (either time or size).
>
> Accepted, if there is no more need to change, I'll start the vote next
> week.
>
> Thanks,
> Tao Jiuming
>
>
> On Tue, Mar 7, 2023 at 00:02 Asaf Mesika <as...@gmail.com> wrote:
>
> > >
> > > Pulsar has a feature called backlog quota (place link).
> >
> > You need to place a link :)
> >
> > Expose pulsar_storage_backlog_quota_count in the topic leve
> >
> > You already have "pulsar_storage_backlog_size", so why do you need this
> > metric for?
> >
> > backlogQuotaLimitSize
> >
> > should be `backlogQuotaSizeBytes`
> >
> > backlogQuotaLimitTime
> >
> > should be `backlogQuotaTimeSeconds`
> >
> > What about goal no.4? Expose oldest unacknowledged message subscription
> > name?
> >
> > IMO, metrics are like API - perhaps indicate the change there as well
> >
> > Record the event when dropBacklogForSizeLimit
> > > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121
> > >
> > >  or dropBacklogForTimeLimit
> > > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194
> >
> > is
> > > going to invoked.
> >
> >
> > Oh, now I get it.
> > So you need to rename the metric.
> > "pulsar_storage_backlog_quota_count" -->
> > `pulsar_storage_backlog_eviction_count`
> >
> >
> > > the topic's existing subscription.
> >
> > "subscription" --> "subscription*s*"
> >
> > Number of backlog quota happends.
> >
> > Number of times backlog evictions happened due to exceeding backlog quota
> > (either time or size).
> >
> >
> > >    1. Find the backlog subscriptions
> > >    After received the alarm, users could request
> > Topics#getStats(topicName,
> > >    true/false, true, true)
> > >    <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> >
> > to
> > >    get the topic stats, and find which subscriptions are in backlog.
> > >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in
> the
> > >    subscription level, and we will expose backlogQuotaLimitSize and
> > >    backlogQuotaLimitTime in the topic level, so users could find which
> > >    subscriptions in backlog easily.
> > >
> > > I wrote how it should be done IMO in a previous email.
> >
> >
> > On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:
> >
> > > Hi Asaf,
> > > I've updated the PIP, PTAL
> > >
> > > Thanks,
> > > Tao Jiuming
> > >
> > > On Sun, Mar 5, 2023 at 21:00 Asaf Mesika <as...@gmail.com> wrote:
> > >
> > > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:
> > > >
> > > > > > I  think you should fix this explanation:
> > > > >
> > > > > Thanks! I would like to copy the context you provide to the PIP
> > > > motivation,
> > > > > your description is more detailed, so developers don't have to go
> > > through
> > > > > the code.
> > > > >
> > > >
> > > > Sure
> > > >
> > > >
> > > > >
> > > > > > Today the quota is checked periodically, right? So that's how the
> > > > > operator
> > > > > > knows the cost in terms of I/O is limited.
> > > > > > Now you are adding one additional I/O per collection, every 1 min
> > by
> > > > > > default. That's a lot perhaps. How long is the check interval
> > today?
> > > > >
> > > > > Actually, I don't want to introduce additional costs, I thought we
> > > > > could cache its result, so that it won't introduce additional
> costs.
> > > > > It may be that I did not make it clear in the PIP and caused this
> > > > > misunderstanding, sorry.
> > > > >
> > > >
> > > > Ok, just to verify: You plan to modify the code that runs
> periodically
> > > the
> > > > backlog quota check, so the result will be cached there? This way
> when
> > > you
> > > > pull that information from that code every 1min to expose it as a
> > metric
> > > it
> > > > will have 0 I/O cost?
> > > >
> > > >
> > > >
> > > > >
> > > > > > The user today can calculate quota used for size based limit,
> since
> > > > there
> > > > > > are two metrics that are exposed today on a topic level: "
> > > > > > pulsar_storage_backlog_quota_limit" and
> > > "pulsar_storage_backlog_size".
> > > > > You
> > > > > > can just divide the two to get a percentage.
> > > > > > For the time-based limit, the only metric exposed today is quota
> > > > itself ,
> > > > > "
> > > > > > pulsar_storage_backlog_quota_limit_time".
> > > > >
> > > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > > `pulsar_storage_backlog_quota_limit` and
> > > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for your
> > > reminder.
> > > > >
> > > > >
> > > > > So, in this condition, we already have the following topic-level
> > > metrics:
> > > > > `pulsar_storage_backlog_size`: The total backlog size of the topics
> > of
> > > > this
> > > > > topic owned by this broker (in bytes).
> > > > > `pulsar_storage_backlog_quota_limit`: The total amount of the data
> in
> > > > this
> > > > > topic that limits the backlog quota (bytes).
> > > > > `pulsar_storage_backlog_quota_limit_time`: The backlog quota limit
> in
> > > > > time(seconds). (This metric does not exists in the doc, need to
> > > improve)
> > > > >
> > > > >
> > > > > We just need to add a new metric named
> > > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the
> > > topic-level
> > > > > that indicates the publish time of the earliest message in the
> > backlog.
> > > > > So users could get `pulsar_backlog_size_quota_used_percentage` by
> > > divide
> > > > > `pulsar_storage_backlog_size ` and
> > > > > `pulsar_storage_backlog_quota_limit`(`pulsar_storage_backlog_size`
> /
> > > > > `pulsar_storage_backlog_quota_limit`),
> > > > > and could get `pulsar_backlog_time_quota_used_percentage` by divide
> > > `now
> > > > -
> > > > > pulsar_storage_earliest_msg_publish_time_in_backlog` and
> > > > > `pulsar_storage_backlog_quota_limit_time` (`now -
> > > > > pulsar_storage_earliest_msg_publish_time_in_backlog` /
> > > > > `pulsar_storage_backlog_quota_limit_time`).
> > > > >
> > > >
> > > > I think there is a problem with the name
> > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the
> > topic-level:
> > > > * First, I prefer exposing the age rather than the publish time.
> > > > * Second, it's a bit hard to figure out the meaning of the earliest
> msg
> > > in
> > > > the backlog.
> > > >
> > > > Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you
> can
> > > > write: "The age (time passed since it was published) of the earliest
> > > > unacknowledged message based on the topic's
> > > > existing subscriptions" ?
> > > >
> > > >
> > > >
> > > > >
> > > > > The backlog quota time checker runs periodically, so we can cache
> its
> > > > > result, so it won't lead to much costs.
> > > > >
> > > > > Pulsar also exposed subscription-level  `backlogSize` and
> > > > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > >
> > > > > if
> > > > > `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> > > > > We can also expose `backlogQuotaLimiteSize` and
> > `backlogQuotaLimitTime`
> > > > of
> > > > > the topic to PulsarAdmin.
> > > > >
> > > >
> > > > What is the relationship you see between Pulsar exposing
> > > > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> > > > subscription, to exposing the backlog quota limits in pulsar admin?
> > > >
> > > > Limits can be exposed to Pulsar Admin, since it has 0 cost associated
> > > with
> > > > it.
> > > > I think it's a good idea to do that.
> > > > The quota usage can also be exposed to pulsar admin, since we pull
> that
> > > > data from the backlog quota checker cache, so it has 0 cost as well.
> > > >
> > > > As we said in previous email we can also expose
> > > > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
> > > >
> > > >
> > > > >
> > > > > After users receive the backlog alert from metrics alerting
> systems,
> > > they
> > > > > can get the topic name, then, they can request Topics#getStats
> > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > > > > >
> > > > > to
> > > > > get which subscriptions are in the huge backlog.
> > > > >
> > > > >
> > > > I agree users can use PulsarAdmin getStats for topic , with
> > > > getEarliestTimeInBacklog=true to find the oldest subscription
> > responsible
> > > > for exceeding quota, but we can give them that information with 0
> cost
> > > > since we already have that subscription name cached (we spent the I/O
> > to
> > > > find out who that subscription is, let's just cache it and provide
> it).
> > > >
> > > >
> > > >
> > > >
> > > > > Thanks,
> > > > > Tao Jiuming
> > > > >
> > > > > > On Wed, Mar 1, 2023 at 23:42 Asaf Mesika <as...@gmail.com> wrote:
> > > > >
> > > > > > >
> > > > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > > > > >
> > > > > > > : backlogQuotaDefaultLimitBytes and
> > backlogQuotaDefaultLimitSecond.
> > > > > > > By default, backlog eviction is disabled, and also, there is a
> > > field
> > > > > > named
> > > > > > > backlogQuotaMap in TopicPolicies
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > > > > >
> > > > > > > /NamespaceSpacePolicies
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > > > > >
> > > > > > assists
> > > > > > > in controlling Topic/Namespace level backlog quota.
> > > > > > >
> > > > > > > If topic backlog reaches the threshold of any item, backlog
> > > eviction
> > > > > will
> > > > > > > be triggered, Pulsar will move subscription's cursor to skip
> > > > > > unacknowledged
> > > > > > > messages.
> > > > > > >
> > > > > > > Before backlog eviction happens, we don't have a metric to
> > monitor
> > > > how
> > > > > > > long that it can reaches the threshold.
> > > > > > >
> > > > > >
> > > > > > I  think you should fix this explanation:
> > > > > >
> > > > > > > In Pulsar, a subscription maintains the acknowledgment state of
> > > > > > > its messages. A subscription backlog is the set of messages
> > > > > > > which are unacknowledged.
> > > > > > > A subscription backlog size is the sum of the sizes of the
> > > > > > > unacknowledged messages (in bytes).
> > > > > > > A topic can have many subscriptions.
> > > > > > > A topic backlog is defined as the backlog size of the
> > > > > > > subscription which has the oldest unacknowledged message. Since
> > > > > > > acknowledged messages can be interleaved with unacknowledged
> > > > > > > messages, calculating the exact backlog size of that
> > > > > > > subscription can be expensive, as it requires I/O operations to
> > > > > > > read the messages from the ledgers.
> > > > > > > For that reason, the topic backlog is actually defined to be the
> > > > > > > estimated backlog size of that subscription. It is computed by
> > > > > > > summing the sizes of all the ledgers, starting from the current
> > > > > > > active one, up to the ledger which contains the oldest
> > > > > > > unacknowledged message (there is actually a faster way to
> > > > > > > calculate it, but this is the definition of the estimation).
> > > > > >
> > > > > > > A topic backlog age is the age of the oldest unacknowledged
> > > > > > > message (in any subscription). If that message was written 30
> > > > > > > minutes ago, its age is 30 minutes.
> > > > > >
> > > > > > > Pulsar has a feature called backlog quota (place link). It
> > > > > > > allows the user to define a quota - in effect, a limit - which
> > > > > > > limits the topic backlog.
> > > > > > > There are two types of quotas:
> > > > > > > * Size based: The limit is for the topic backlog size (as we
> > > > > > > defined above).
> > > > > > > * Time based: The limit is for the topic's backlog age (as we
> > > > > > > defined above).
> > > > > > >
> > > > > > > Once a topic backlog exceeds either one of those limits, an
> > > > > > > action is taken upon messages written to the topic:
> > > > > > > * The producer write is placed on hold for a certain amount of
> > > > > > > time before failing.
> > > > > > > * The producer write is failed.
> > > > > > > * The subscriptions' oldest unacknowledged messages will be
> > > > > > > acknowledged in order until both the topic backlog size and age
> > > > > > > fall inside the limit (quota). This process is called backlog
> > > > > > > eviction (it happens every interval).
> > > > > >
> > > > > > > The quotas can be defined as a default value for any topic, by
> > > > > > > using the following broker configuration keys:
> > > > > > > backlogQuotaDefaultLimitBytes, backlogQuotaDefaultLimitSecond.
> > > > > > > They can also be specified directly for all topics in a given
> > > > > > > namespace using the namespace policy, or for a specific topic
> > > > > > > using a topic policy.
> > > > > >
> > > > > > > The user today can calculate the quota used for the size-based
> > > > > > > limit, since there are two metrics exposed today at the topic
> > > > > > > level: "pulsar_storage_backlog_quota_limit" and
> > > > > > > "pulsar_storage_backlog_size". You can just divide the two to
> > > > > > > get a percentage.
> > > > > > > For the time-based limit, the only metric exposed today is the
> > > > > > > quota itself, "pulsar_storage_backlog_quota_limit_time".
> > > > > >
> > > > > > ------------
> > > > > >
> > > > > > I would create two metrics:
> > > > > >
> > > > > > `pulsar_backlog_size_quota_used_percentage`
> > > > > > `pulsar_backlog_time_quota_used_percentage`
> > > > > >
> > > > > > You would like to know what triggered the alert, hence two.
> > > > > > It's not the quota percentage, it's the quota used percentage.
> > > > > >
> > > > > > ----------
> > > > > >
> > > > > > It checks if the backlog size exceeds the threshold(
> > > > > > > backlogQuotaDefaultLimitBytes), and it gets the current backlog
> > > size
> > > > by
> > > > > > > calculating LedgerInfo
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > >,
> > > > > > > it will not lead to I/O.
> > > > > >
> > > > > > > This is not correct.
> > > > > > > It checks against the topic / namespace policy, and if it
> > > > > > > doesn't exist, it falls back on the default configuration key
> > > > > > > mentioned above.
> > > > > >
> > > > > > It checks if the backlog time exceeds the threshold(
> > > > > > > backlogQuotaDefaultLimitSecond). If
> > > preciseTimeBasedBacklogQuotaCheck
> > > > > is
> > > > > > > set to be true, it will read an entry from Bookkeeper, but the
> > > > default
> > > > > > > value is false, which means it gets the backlog time by
> > calculating
> > > > > > > LedgerInfo
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > > >.
> > > > > > > So in general, we don't need to worry about it will lead to
> I/O.
> > > > > >
> > > > > >
> > > > > > > I'm afraid of that.
> > > > > > > Today the quota is checked periodically, right? So that's how
> > > > > > > the operator knows the cost in terms of I/O is limited.
> > > > > > > Now you are adding one additional I/O per collection, every 1
> > > > > > > min by default. That's a lot perhaps. How long is the check
> > > > > > > interval today?
> > > > > > >
> > > > > > > Perhaps in the backlog quota check, you can persist the check
> > > > > > > result, and use it? Persist the age, that is.
> > > > > >
> > > > > >
> > > > > > ------
> > > > > >
> > > > > > > Regarding "slowest_subscription"
> > > > > > > I think the cost is too high, because the subscriptions will
> > > > > > > keep alternating, which can generate so many unique time
> > > > > > > series. Since Prometheus, or any other TSDB, flushes only every
> > > > > > > 2 hours, it will cost you too much.
> > > > > > >
> > > > > > > I suggest exposing the name via the topic stats. This way they
> > > > > > > can issue a REST call to grab that subscription name only when
> > > > > > > the alert fires.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Asaf
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org>
> wrote:
> > > > > >
> > > > > > > Hi Asaf,
> > > > > > > I've updated the PIP, PTAL
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > Tao Jiuming
> > > > > > >
> > > > > > > > On Sun, Feb 26, 2023 at 23:03 Asaf Mesika <as...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > backlogQuotaDefaultLimitBytes and
> > > backlogQuotaDefaultLimitSecond,
> > > > > if
> > > > > > > > > topic backlog reaches the threshold of any item, backlog
> > > eviction
> > > > > > will
> > > > > > > be
> > > > > > > > > triggered.
> > > > > > > >
> > > > > > > > This seems like default values, not the actual values. Can
> you
> > > > please
> > > > > > > > provide an explanation in the PIP and link to read more:
> > > > > > > > 1. Where do you define the backlog quota exactly? What is the
> > > > > > granularity
> > > > > > > > (subscription?)
> > > > > > > > 2.  Is the backlog quota on by default? If so, what are the
> > > default
> > > > > > > values?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > *Notes*
> > > > > > > > 1. When the backlog quota limit is defined in Bytes, and you
> > wish
> > > > to
> > > > > > know
> > > > > > > > how close a subscription is to its bytes limit, you need to
> > > > calculate
> > > > > > the
> > > > > > > > backlog size in bytes. From my understanding, there is an
> > > accurate
> > > > > > > > calculation (which is costly in terms of I/O) and there is an
> > > > > estimate
> > > > > > of
> > > > > > > > it. I presume you would want to use the estimated one, is
> that
> > > > > correct?
> > > > > > > > The backlog quota itself, uses the accurate or the estimated
> > when
> > > > it
> > > > > > > starts
> > > > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > > > >
> > > > > > > > 2. For the backlog limit specifying in time units, there is
> no
> > > > > > estimate,
> > > > > > > as
> > > > > > > > it must be calculated all the time (earliest unacknowledged
> > > message
> > > > > > > > distance from now). How do you plan to calculate the current
> > age
> > > of
> > > > > the
> > > > > > > > earliest message without bearing that I/O cost on each metric
> > > > > > > calculation?
> > > > > > > >
> > > > > > > > 3. In the Goal section, you specify that your goal is to add
> a
> > > > > > > "proximity"
> > > > > > > > metric.
> > > > > > > > a) You must define that - what is proximity metric exactly?
> > What
> > > > are
> > > > > > its
> > > > > > > > units? How are you planning to calculate it?
> > > > > > > > b) Proximity is not a good term IMO. I personally have never
> > seen
> > > > > this
> > > > > > > term
> > > > > > > > used in software systems, unless it's in the aviation/space
> > > > industry.
> > > > > > > Once
> > > > > > > > you explain (a) I hope I can help provide alternative names.
> > > > > > > >
> > > > > > > > 4. Maybe we should provide the used quota percentage for both
> > > > limits,
> > > > > > > > instead of one per both, since it's easier to act upon the
> > alert
> > > > when
> > > > > > you
> > > > > > > > need which one triggered it.
> > > > > > > >
> > > > > > > > 5. I didn't understand the "slowest_subscription" label used
> > when
> > > > > > > > describing the metric label. Can you please provide an
> > > explanation?
> > > > > > > >
> > > > > > > > 6. I suggest writing a "High Level Design" section, and add
> > > > > everything
> > > > > > > you
> > > > > > > > need to know for this proposal, so I don't need to read the
> > > > > > > > implementation details below (code).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Asaf
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > I've started a PIP to discuss: PIP-248 Add backlog eviction
> > > > metric
> > > > > > > > >
> > > > > > > > > ### Motivation:
> > > > > > > > >
> > > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > > `backlogQuotaDefaultLimitBytes` and
> > > > > `backlogQuotaDefaultLimitSecond`,
> > > > > > > if
> > > > > > > > > topic backlog reaches the threshold of any item, backlog
> > > eviction
> > > > > > will
> > > > > > > be
> > > > > > > > > triggered.
> > > > > > > > >
> > > > > > > > > Before backlog eviction happens, we don't have a metric to
> > > > monitor
> > > > > > how
> > > > > > > > long
> > > > > > > > > that it can reaches the threshold.
> > > > > > > > >
> > > > > > > > > We can provide a progress bar metric to tell users some
> > topics
> > > is
> > > > > > about
> > > > > > > > to
> > > > > > > > > trigger backlog eviction. And users can subscribe the alert
> > to
> > > > > > schedule
> > > > > > > > > consumers.
> > > > > > > > >
> > > > > > > > > For more details, please read the PIP at
> > > > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Tao Jiuming
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by 太上玄元道君 <da...@apache.org>.
> backlogQuotaLimitSize
> should be `backlogQuotaSizeBytes`

> backlogQuotaLimitTime
> should be `backlogQuotaTimeSeconds`

> So you need to rename the metric.
> "pulsar_storage_backlog_quota_count" -->
> `pulsar_storage_backlog_eviction_count`

> the topic's existing subscription.
> "subscription" --> "subscription*s*"

> Number of backlog quota happends.
> Number of times backlog evictions happened due to exceeding backlog quota
> (either time or size).

Accepted. If no further changes are needed, I'll start the vote next week.

Thanks,
Tao Jiuming


On Tue, Mar 7, 2023 at 00:02 Asaf Mesika <as...@gmail.com> wrote:

> >
> > Pulsar has a feature called backlog quota (place link).
>
> You need to place a link :)
>
> Expose pulsar_storage_backlog_quota_count in the topic level
>
> You already have "pulsar_storage_backlog_size", so why do you need this
> metric?
>
> backlogQuotaLimitSize
>
> should be `backlogQuotaSizeBytes`
>
> backlogQuotaLimitTime
>
> should be `backlogQuotaTimeSeconds`
>
> What about goal no.4? Expose oldest unacknowledged message subscription
> name?
>
> IMO, metrics are like API - perhaps indicate the change there as well
>
> > Record the event when dropBacklogForSizeLimit
> > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121>
> > or dropBacklogForTimeLimit
> > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194>
> > is going to be invoked.
>
>
> Oh, now I get it.
> So you need to rename the metric.
> "pulsar_storage_backlog_quota_count" -->
> `pulsar_storage_backlog_eviction_count`
>
>
> > the topic's existing subscription.
>
> "subscription" --> "subscription*s*"
>
> Number of backlog quota happends.
>
> Number of times backlog evictions happened due to exceeding backlog quota
> (either time or size).
>
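
As a rough sketch of what the renamed counter is meant to capture (not the
broker's implementation; the class and method names below are made up for
illustration), the count goes up once per eviction, whichever quota
triggered it:

```python
from collections import Counter


class BacklogEvictionStats:
    """Per-topic count of backlog evictions, in the spirit of the proposed
    `pulsar_storage_backlog_eviction_count` metric."""

    def __init__(self) -> None:
        self._evictions = Counter()

    def record_eviction(self, topic: str) -> None:
        # Called once each time an eviction runs for the topic, whether the
        # size quota or the time quota was exceeded.
        self._evictions[topic] += 1

    def eviction_count(self, topic: str) -> int:
        # Topics that never had an eviction report 0.
        return self._evictions[topic]
```

A counter like this only increases, so an alert would normally look at its
rate of change and not at its absolute value.
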
>
> >    1. Find the backlog subscriptions
> >    After receiving the alarm, users could request
> >    Topics#getStats(topicName, true/false, true, true)
> >    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> >    to get the topic stats, and find which subscriptions are in backlog.
> >    Pulsar exposes backlogSize and earliestMsgPublishTimeInBacklog at the
> >    subscription level, and we will expose backlogQuotaLimitSize and
> >    backlogQuotaLimitTime at the topic level, so users could easily find
> >    which subscriptions are in backlog.
> >
> > I wrote how it should be done IMO in a previous email.
>
>
> On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:
>
> > Hi Asaf,
> > I've updated the PIP, PTAL
> >
> > Thanks,
> > Tao Jiuming
> >
> > On Sun, Mar 5, 2023 at 21:00 Asaf Mesika <as...@gmail.com> wrote:
> >
> > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:
> > >
> > > > > I  think you should fix this explanation:
> > > >
> > > > Thanks! I would like to copy the context you provided into the PIP
> > > > motivation. Your description is more detailed, so developers don't
> > > > have to go through the code.
> > > >
> > >
> > > Sure
> > >
> > >
> > > >
> > > > > Today the quota is checked periodically, right? So that's how the
> > > > > operator knows the cost in terms of I/O is limited.
> > > > > Now you are adding one additional I/O per collection, every 1 min by
> > > > > default. That's a lot perhaps. How long is the check interval today?
> > > >
> > > > Actually, I don't want to introduce additional cost. I thought we
> > > > could cache the check's result, so that no extra I/O is needed.
> > > > I may not have made this clear in the PIP, which caused this
> > > > misunderstanding, sorry.
> > > >
> > >
> > > Ok, just to verify: You plan to modify the code that periodically runs
> > > the backlog quota check, so the result will be cached there? This way,
> > > when you pull that information from that code every 1 min to expose it
> > > as a metric, it will have 0 I/O cost?
> > >
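
The caching idea in this exchange can be sketched as below. It is a
simplified illustration, not the broker's code: `BacklogQuotaChecker`, its
methods and the injected `read_oldest_publish_time` callable are all
hypothetical, and time is passed in explicitly (in seconds) to keep the
sketch deterministic.

```python
from typing import Callable, Optional


class BacklogQuotaChecker:
    """Periodic check that pays the I/O once and caches the result, so
    metric collection can read the backlog age without touching storage."""

    def __init__(self, read_oldest_publish_time: Callable[[], float]) -> None:
        # The injected callable stands for the only operation that may
        # cost I/O (reading the oldest unacknowledged entry).
        self._read_oldest_publish_time = read_oldest_publish_time
        self._cached_publish_time_s: Optional[float] = None

    def run_periodic_check(self) -> None:
        # Runs on the existing backlog quota check interval.
        self._cached_publish_time_s = self._read_oldest_publish_time()

    def backlog_age_seconds(self, now_s: float) -> Optional[float]:
        # Called on every metrics scrape; uses only the cached value.
        if self._cached_publish_time_s is None:
            return None  # no check has run yet
        return max(0.0, now_s - self._cached_publish_time_s)
```

The trade-off is staleness: if the oldest message is acknowledged between
two checks, the reported age stays too high until the next check runs.
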
> > >
> > >
> > > >
> > > > > The user today can calculate the quota used for the size-based
> > > > > limit, since there are two metrics exposed today at the topic level:
> > > > > "pulsar_storage_backlog_quota_limit" and
> > > > > "pulsar_storage_backlog_size". You can just divide the two to get a
> > > > > percentage.
> > > > > For the time-based limit, the only metric exposed today is the quota
> > > > > itself, "pulsar_storage_backlog_quota_limit_time".
> > > >
> > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > `pulsar_storage_backlog_quota_limit` and
> > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for the
> > > > reminder.
> > > >
> > > >
> > > > So, in this case, we already have the following topic-level metrics:
> > > > `pulsar_storage_backlog_size`: The total backlog size of this topic
> > > > owned by this broker (in bytes).
> > > > `pulsar_storage_backlog_quota_limit`: The backlog quota size limit
> > > > for this topic (in bytes).
> > > > `pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in
> > > > time (in seconds). (This metric does not exist in the docs, which
> > > > need to be improved.)
> > > >
> > > >
> > > > We just need to add a new topic-level metric named
> > > > `pulsar_storage_earliest_msg_publish_time_in_backlog` that indicates
> > > > the publish time of the earliest message in the backlog.
> > > > Users could then get `pulsar_backlog_size_quota_used_percentage` by
> > > > dividing `pulsar_storage_backlog_size` by
> > > > `pulsar_storage_backlog_quota_limit`,
> > > > and `pulsar_backlog_time_quota_used_percentage` by dividing
> > > > `now - pulsar_storage_earliest_msg_publish_time_in_backlog` by
> > > > `pulsar_storage_backlog_quota_limit_time`, that is
> > > > `(now - pulsar_storage_earliest_msg_publish_time_in_backlog) /
> > > > pulsar_storage_backlog_quota_limit_time`.
> > > >
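
To make the two divisions concrete, here is a small sketch. The function
names are illustrative, all times are in seconds, and treating a
non-positive limit as "quota disabled" is an assumption of this sketch.

```python
from typing import Optional


def size_quota_used_ratio(backlog_size_bytes: float, quota_limit_bytes: float) -> Optional[float]:
    """pulsar_storage_backlog_size / pulsar_storage_backlog_quota_limit."""
    if quota_limit_bytes <= 0:
        return None  # assumption: non-positive limit means no size quota
    return backlog_size_bytes / quota_limit_bytes


def time_quota_used_ratio(
    now_s: float, earliest_publish_time_s: float, quota_limit_time_s: float
) -> Optional[float]:
    """(now - earliest publish time) / pulsar_storage_backlog_quota_limit_time.

    The subtraction must happen before the division.
    """
    if quota_limit_time_s <= 0:
        return None  # assumption: non-positive limit means no time quota
    backlog_age_s = max(0.0, now_s - earliest_publish_time_s)
    return backlog_age_s / quota_limit_time_s
```

For example, a 50-byte backlog against a 200-byte limit gives 0.25, and a
message published 300 seconds ago against a 600-second limit gives 0.5.
Multiply by 100 to express either value as a percentage.
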
> > >
> > > I think there is a problem with the name
> > > `pulsar_storage_earliest_msg_publish_time_in_backlog` at the topic
> > > level:
> > > * First, I prefer exposing the age rather than the publish time.
> > > * Second, it's a bit hard to figure out the meaning of the earliest msg
> > > in the backlog.
> > >
> > > Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you can
> > > write: "The age (time passed since it was published) of the earliest
> > > unacknowledged message based on the topic's existing subscriptions"?
> > >
> > >
> > >
> > > >
> > > > The backlog quota time checker runs periodically, so we can cache its
> > > > result, so it won't lead to much costs.
> > > >
> > > > Pulsar also exposes subscription-level `backlogSize` and
> > > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > if `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> > > > We can also expose `backlogQuotaLimitSize` and
> > > > `backlogQuotaLimitTime` of the topic to PulsarAdmin.
> > > >
> > >
> > > What is the relationship you see between Pulsar exposing
> > > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> > > subscription, to exposing the backlog quota limits in pulsar admin?
> > >
> > > Limits can be exposed to Pulsar Admin, since it has 0 cost associated
> > with
> > > it.
> > > I think it's a good idea to do that.
> > > The quota usage can also be exposed to pulsar admin, since we pull that
> > > data from the backlog quota checker cache, so it has 0 cost as well.
> > >
> > > As we said in previous email we can also expose
> > > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
> > >
> > >
> > > >
> > > > After users receive the backlog alert from a metrics alerting system,
> > > > they can get the topic name and then request Topics#getStats
> > > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > > to find which subscriptions have a huge backlog.
> > > >
> > > >
> > > I agree users can use PulsarAdmin getStats for the topic, with
> > > getEarliestTimeInBacklog=true, to find the oldest subscription
> > > responsible for exceeding quota, but we can give them that information
> > > with 0 cost since we already have that subscription name cached (we
> > > spent the I/O to find out who that subscription is, let's just cache it
> > > and provide it).
> > >
> > >
> > >
> > >
> > > > Thanks,
> > > > Tao Jiuming
> > > >
> > > > On Wed, Mar 1, 2023 at 11:42 PM Asaf Mesika <as...@gmail.com> wrote:
> > > >
> > > > > >
> > > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > > <
> > > > >
> > > >
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > > > >
> > > > > > : backlogQuotaDefaultLimitBytes and
> backlogQuotaDefaultLimitSecond.
> > > > > > By default, backlog eviction is disabled, and also, there is a
> > field
> > > > > named
> > > > > > backlogQuotaMap in TopicPolicies
> > > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > > > >
> > > > > > /NamespaceSpacePolicies
> > > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > > > >
> > > > > assists
> > > > > > in controlling Topic/Namespace level backlog quota.
> > > > > >
> > > > > > If topic backlog reaches the threshold of any item, backlog
> > eviction
> > > > will
> > > > > > be triggered, Pulsar will move subscription's cursor to skip
> > > > > unacknowledged
> > > > > > messages.
> > > > > >
> > > > > > Before backlog eviction happens, we don't have a metric to
> monitor
> > > how
> > > > > > long that it can reaches the threshold.
> > > > > >
> > > > >
> > > > > I  think you should fix this explanation:
> > > > >
> > > > > In Pulsar, a subscription maintains a state of message
> acknowledged.
> > A
> > > > > subscription backlog is the set of messages which are
> unacknowledged.
> > > > > A subscription backlog size is the sum of size of unacknowledged
> > > messages
> > > > > (in bytes).
> > > > > A topic can have many subscriptions.
> > > > > A topic backlog is defined as the backlog size of the subscription
> > > which
> > > > > has the oldest unacknowledged message. Since acknowledged messages
> > can
> > > be
> > > > > interleaved with unacknowledged messages, calculating the exact
> size
> > of
> > > > > that subscription can be expensive as it requires I/O operations to
> > > read
> > > > > from the messages from the ledgers.
> > > > > For that reason, the topic backlog is actually defined to be the
> > > > estimated
> > > > > backlog size of that subscription. It does so by summarizing the
> size
> > > of
> > > > > all the ledgers, starting from the current active one, up to the
> > ledger
> > > > > which contains the oldest unacknowledged message (There is
> actually a
> > > > > faster way to calculate it, but this is the definition of the
> > > > estimation).
> > > > >
> > > > > A topic backlog age is the age of the oldest unacknowledged message
> > (in
> > > > any
> > > > > subscription). If that message was written 30 minutes ago, its age
> is
> > > 30
> > > > > minutes.
> > > > >
> > > > > Pulsar has a feature called backlog quota (place link). It allows
> the
> > > > user
> > > > > to define a quota - in effect, a limit - which limits the topic
> > > backlog.
> > > > > There are two types of quotas:
> > > > > * Size based: The limit is for the topic backlog size (as we
> defined
> > > > > above).
> > > > > * Time based: The limit is for the topic's backlog age (as we
> defined
> > > > > above).
> > > > >
> > > > > Once a topic backlog exceeds either one of those limits, an action
> is
> > > > taken
> > > > > upon messages written to the topic:
> > > > > * The producer write is placed on hold for a certain amount of time
> > > > before
> > > > > failing.
> > > > > * The producer write is failed
> > > > > * The subscriptions oldest unacknowledged messages will be
> > acknowledged
> > > > in
> > > > > order until both the topic backlog size or age will fall inside the
> > > limit
> > > > > (quota). The process is called backlog eviction (happens every
> > > interval)
> > > > >
> > > > > The quotas can be defined as a default value for any topic, by
> using
> > > the
> > > > > following broker configuration keys: backlogQuotaDefaultLimitBytes
> ,
> > > > > backlogQuotaDefaultLimitSecond. It can also be specified directly
> for
> > > all
> > > > > topics in a given namespace using the namespace policy, or a
> specific
> > > > topic
> > > > > using a topic policy.
> > > > >
> > > > > The user today can calculate quota used for size based limit, since
> > > there
> > > > > are two metrics that are exposed today on a topic level: "
> > > > > pulsar_storage_backlog_quota_limit" and
> > "pulsar_storage_backlog_size".
> > > > You
> > > > > can just divide the two to get a percentage.
> > > > > For the time-based limit, the only metric exposed today is quota
> > itself
> > > > , "
> > > > > pulsar_storage_backlog_quota_limit_time".
> > > > >
> > > > > ------------
> > > > >
> > > > > I would create two metrics:
> > > > >
> > > > > `pulsar_backlog_size_quota_used_percentage`
> > > > > `pulsar_backlog_time_quota_used_percentage`
> > > > >
> > > > > You would like to know what triggered the alert, hence two.
> > > > > It's not the quota percentage, it's the quota used percentage.
> > > > >
> > > > > ----------
> > > > >
> > > > > It checks if the backlog size exceeds the threshold(
> > > > > > backlogQuotaDefaultLimitBytes), and it gets the current backlog
> > size
> > > by
> > > > > > calculating LedgerInfo
> > > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > >,
> > > > > > it will not lead to I/O.
> > > > >
> > > > > This is not correct.
> > > > > It checks against the topic / namespace policy, and if it doesn't
> > > exist,
> > > > it
> > > > > falls back on the default configuration key mentioned above.
> > > > >
> > > > > It checks if the backlog time exceeds the threshold(
> > > > > > backlogQuotaDefaultLimitSecond). If
> > preciseTimeBasedBacklogQuotaCheck
> > > > is
> > > > > > set to be true, it will read an entry from Bookkeeper, but the
> > > default
> > > > > > value is false, which means it gets the backlog time by
> calculating
> > > > > > LedgerInfo
> > > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > > >.
> > > > > > So in general, we don't need to worry about it will lead to I/O.
> > > > >
> > > > >
> > > > > I'm afraid of that.
> > > > > Today the quota is checked periodically, right? So that's how the
> > > > operator
> > > > > knows the cost in terms of I/O is limited.
> > > > >  Now you are adding one additional I/O per collection, every 1 min
> by
> > > > > default. That's a lot perhaps. How long is the check interval
> today?
> > > > >
> > > > > Perhaps in the backlog quota check, you can persist the check
> result,
> > > and
> > > > > use it? Persist the age that is.
> > > > >
> > > > >
> > > > > ------
> > > > >
> > > > > Regarding "slowest_subscription"
> > > > > > I think the cost is too high, because the subscriptions will keep
> > > > > > alternating, which can generate so many unique time series. Since
> > > > > > Prometheus (or any other TSDB) flushes only every 2 hours, it will
> > > > > > cost you too much.
> > > > >
> > > > > I suggest exposing the name via the topic stats. This way they can
> > > issue
> > > > a
> > > > > REST call to grab that subscription name only when the alert fires.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Asaf
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:
> > > > >
> > > > > > Hi Asaf,
> > > > > > I've updated the PIP, PTAL
> > > > > >
> > > > > > > Thanks,
> > > > > > > Tao Jiuming
> > > > > > >
> > > > > > > On Sun, Feb 26, 2023 at 11:03 PM Asaf Mesika <as...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > backlogQuotaDefaultLimitBytes and
> > backlogQuotaDefaultLimitSecond,
> > > > if
> > > > > > > > topic backlog reaches the threshold of any item, backlog
> > eviction
> > > > > will
> > > > > > be
> > > > > > > > triggered.
> > > > > > >
> > > > > > > This seems like default values, not the actual values. Can you
> > > please
> > > > > > > provide an explanation in the PIP and link to read more:
> > > > > > > 1. Where do you define the backlog quota exactly? What is the
> > > > > granularity
> > > > > > > (subscription?)
> > > > > > > 2.  Is the backlog quota on by default? If so, what are the
> > default
> > > > > > values?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > *Notes*
> > > > > > > 1. When the backlog quota limit is defined in Bytes, and you
> wish
> > > to
> > > > > know
> > > > > > > how close a subscription is to its bytes limit, you need to
> > > calculate
> > > > > the
> > > > > > > backlog size in bytes. From my understanding, there is an
> > accurate
> > > > > > > calculation (which is costly in terms of I/O) and there is an
> > > > estimate
> > > > > of
> > > > > > > it. I presume you would want to use the estimated one, is that
> > > > correct?
> > > > > > > The backlog quota itself, uses the accurate or the estimated
> when
> > > it
> > > > > > starts
> > > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > > >
> > > > > > > 2. For the backlog limit specifying in time units, there is no
> > > > > estimate,
> > > > > > as
> > > > > > > it must be calculated all the time (earliest unacknowledged
> > message
> > > > > > > distance from now). How do you plan to calculate the current
> age
> > of
> > > > the
> > > > > > > earliest message without bearing that I/O cost on each metric
> > > > > > calculation?
> > > > > > >
> > > > > > > 3. In the Goal section, you specify that your goal is to add a
> > > > > > "proximity"
> > > > > > > metric.
> > > > > > > a) You must define that - what is proximity metric exactly?
> What
> > > are
> > > > > its
> > > > > > > units? How are you planning to calculate it?
> > > > > > > b) Proximity is not a good term IMO. I personally have never
> seen
> > > > this
> > > > > > term
> > > > > > > used in software systems, unless it's in the aviation/space
> > > industry.
> > > > > > Once
> > > > > > > you explain (a) I hope I can help provide alternative names.
> > > > > > >
> > > > > > > 4. Maybe we should provide the used quota percentage for both
> > > limits,
> > > > > > > instead of one per both, since it's easier to act upon the
> alert
> > > when
> > > > > you
> > > > > > > need which one triggered it.
> > > > > > >
> > > > > > > 5. I didn't understand the "slowest_subscription" label used
> when
> > > > > > > describing the metric label. Can you please provide an
> > explanation?
> > > > > > >
> > > > > > > 6. I suggest writing a "High Level Design" section, and add
> > > > everything
> > > > > > you
> > > > > > > need to know for this proposal, so I don't need to read the
> > > > > > > implementation details below (code).
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Asaf
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org>
> > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I've started a PIP to discuss: PIP-248 Add backlog eviction
> > > metric
> > > > > > > >
> > > > > > > > ### Motivation:
> > > > > > > >
> > > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > > `backlogQuotaDefaultLimitBytes` and
> > > > `backlogQuotaDefaultLimitSecond`,
> > > > > > if
> > > > > > > > topic backlog reaches the threshold of any item, backlog
> > eviction
> > > > > will
> > > > > > be
> > > > > > > > triggered.
> > > > > > > >
> > > > > > > > Before backlog eviction happens, we don't have a metric to
> > > monitor
> > > > > how
> > > > > > > long
> > > > > > > > that it can reaches the threshold.
> > > > > > > >
> > > > > > > > We can provide a progress bar metric to tell users some
> topics
> > is
> > > > > about
> > > > > > > to
> > > > > > > > trigger backlog eviction. And users can subscribe the alert
> to
> > > > > schedule
> > > > > > > > consumers.
> > > > > > > >
> > > > > > > > For more details, please read the PIP at
> > > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Tao Jiuming
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by Asaf Mesika <as...@gmail.com>.
>
> Pulsar has a feature called backlog quota (place link).

You need to place a link :)

> Expose pulsar_storage_backlog_quota_count in the topic level

You already have "pulsar_storage_backlog_size", so why do you need this
metric?

> backlogQuotaLimitSize

should be `backlogQuotaSizeBytes`

> backlogQuotaLimitTime

should be `backlogQuotaTimeSeconds`
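
A minimal sketch of how a consumer of these unit-bearing fields could derive
the "quota used" percentages discussed in this thread. It is written in
Python for brevity and is not Pulsar code: `TopicBacklogStats`,
`quota_used_percentage` and the snake_case field names are hypothetical, and
treating a non-positive limit as "no quota set" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TopicBacklogStats:
    """Hypothetical topic-level stats; only the two quota names come from this thread."""
    backlog_size_bytes: int          # estimated topic backlog size
    backlog_age_seconds: int         # age of the oldest unacknowledged message
    backlog_quota_size_bytes: int    # suggested name: backlogQuotaSizeBytes
    backlog_quota_time_seconds: int  # suggested name: backlogQuotaTimeSeconds


def quota_used_percentage(used: int, limit: int) -> Optional[float]:
    """Return used/limit as a percentage, or None when no quota is set.

    Assumption: a non-positive limit means the quota is disabled.
    """
    if limit <= 0:
        return None
    return 100.0 * used / limit


stats = TopicBacklogStats(
    backlog_size_bytes=750,
    backlog_age_seconds=900,
    backlog_quota_size_bytes=1000,
    backlog_quota_time_seconds=3600,
)
size_pct = quota_used_percentage(stats.backlog_size_bytes, stats.backlog_quota_size_bytes)
time_pct = quota_used_percentage(stats.backlog_age_seconds, stats.backlog_quota_time_seconds)
print(size_pct, time_pct)  # 75.0 25.0
```

With the unit in the name, the caller does not have to guess whether the time
limit is in seconds or milliseconds before dividing.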

What about goal no. 4? Exposing the name of the subscription that holds the
oldest unacknowledged message?

IMO, metrics are like API - perhaps indicate the change there as well

> Record the event when dropBacklogForSizeLimit
> <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121>
> or dropBacklogForTimeLimit
> <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194>
> is going to be invoked.


Oh, now I get it.
So you need to rename the metric.
"pulsar_storage_backlog_quota_count" -->
`pulsar_storage_backlog_eviction_count`


> the topic's existing subscription.

"subscription" --> "subscription*s*"

> Number of backlog quota happends.

Should be: "Number of times backlog eviction happened due to exceeding the
backlog quota (either time or size)."
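
For illustration, a minimal sketch of such an eviction counter under the
renamed metric. It is in Python for brevity and is not Pulsar's
implementation: `BacklogEvictionCounter`, `record_eviction` and `render` are
hypothetical names, and the `topic` label is an assumption about how the
sample would be labeled.

```python
from collections import Counter
from typing import List


class BacklogEvictionCounter:
    """Hypothetical per-topic counter; not Pulsar's actual implementation."""

    def __init__(self) -> None:
        self._evictions: Counter = Counter()

    def record_eviction(self, topic: str) -> None:
        # Would be called at the point where dropBacklogForSizeLimit or
        # dropBacklogForTimeLimit is about to be invoked.
        self._evictions[topic] += 1

    def render(self) -> List[str]:
        # One Prometheus-style sample line per topic, sorted for stable output.
        return [
            f'pulsar_storage_backlog_eviction_count{{topic="{topic}"}} {count}'
            for topic, count in sorted(self._evictions.items())
        ]


counter = BacklogEvictionCounter()
counter.record_eviction("persistent://public/default/orders")
counter.record_eviction("persistent://public/default/orders")
print("\n".join(counter.render()))
```

The counter only ever increases, so an alert would typically be defined on
its rate of change rather than on its absolute value.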


>    1. Find the backlog subscriptions
>    After receiving the alarm, users could request Topics#getStats(topicName,
>    true/false, true, true)
>    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139> to
>    get the topic stats, and find which subscriptions are in backlog.
>    Pulsar exposes backlogSize and earliestMsgPublishTimeInBacklog at the
>    subscription level, and we will expose backlogQuotaLimitSize and
>    backlogQuotaLimitTime at the topic level, so users could easily find
>    which subscriptions are in backlog.

I wrote how it should be done IMO in a previous email.
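
A minimal sketch of the caching idea from that earlier email, assuming the
periodic backlog quota check is the only place that pays the I/O cost and
that metrics and topic stats then read its last result. It is in Python for
brevity and is not Pulsar code: `QuotaCheckResult`, `BacklogQuotaCheckCache`
and all method names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuotaCheckResult:
    """What the periodic check already knows after paying the I/O cost."""
    oldest_publish_time: float     # epoch seconds of the oldest unacked message
    oldest_subscription_name: str  # subscription holding that message
    checked_at: float              # epoch seconds when the check ran


class BacklogQuotaCheckCache:
    """Hypothetical holder for the last quota check result of one topic."""

    def __init__(self) -> None:
        self._last: Optional[QuotaCheckResult] = None

    def on_quota_check(self, oldest_publish_time: float, subscription_name: str,
                       now: Optional[float] = None) -> None:
        # Called by the periodic backlog quota check, the only place doing I/O.
        checked_at = time.time() if now is None else now
        self._last = QuotaCheckResult(oldest_publish_time, subscription_name, checked_at)

    def backlog_age_seconds(self) -> Optional[float]:
        # Age as of the last check; reading it costs no I/O.
        if self._last is None:
            return None
        return self._last.checked_at - self._last.oldest_publish_time

    def oldest_subscription_name(self) -> Optional[str]:
        # Name to surface in topic stats when an alert fires.
        return None if self._last is None else self._last.oldest_subscription_name


cache = BacklogQuotaCheckCache()
cache.on_quota_check(oldest_publish_time=1000.0, subscription_name="sub-a", now=2800.0)
print(cache.backlog_age_seconds(), cache.oldest_subscription_name())  # 1800.0 sub-a
```

The trade-off in this sketch is staleness: the age and the subscription name
are only as fresh as the last quota check, which is the price of not adding
I/O on every metrics collection.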


On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君 <da...@apache.org> wrote:

> Hi Asaf,
> I've updated the PIP, PTAL
>
> Thanks,
> Tao Jiuming
>
> On Sun, Mar 5, 2023 at 9:00 PM Asaf Mesika <as...@gmail.com> wrote:
>
> > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:
> >
> > > > I  think you should fix this explanation:
> > >
> > > Thanks! I would like to copy the context you provided into the PIP
> > > motivation. Your description is more detailed, so developers don't
> > > have to go through the code.
> > >
> >
> > Sure
> >
> >
> > >
> > > > Today the quota is checked periodically, right? So that's how the
> > > > operator knows the cost in terms of I/O is limited.
> > > > Now you are adding one additional I/O per collection, every 1 min by
> > > > default. That's a lot perhaps. How long is the check interval today?
> > >
> > > Actually, I don't want to introduce additional costs, I thought we
> > > could cache its result, so that it won't introduce additional costs.
> > > It may be that I did not make it clear in the PIP and caused this
> > > misunderstanding, sorry.
> > >
> >
> > Ok, just to verify: you plan to modify the code that periodically runs
> > the backlog quota check, so the result will be cached there? This way,
> > when you pull that information every 1 min to expose it as a metric, it
> > will have 0 I/O cost?
> >
> >
> >
> > >
> > > > The user today can calculate quota used for size based limit, since
> > > > there are two metrics that are exposed today on a topic level:
> > > > "pulsar_storage_backlog_quota_limit" and
> > > > "pulsar_storage_backlog_size". You can just divide the two to get a
> > > > percentage.
> > > > For the time-based limit, the only metric exposed today is the quota
> > > > itself, "pulsar_storage_backlog_quota_limit_time".
> > >
> > > I only noticed `pulsar_storage_backlog_size` but missed
> > > `pulsar_storage_backlog_quota_limit` and
> > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for the
> > > reminder.
> > >
> > >
> > > So, in this case, we already have the following topic-level metrics:
> > > `pulsar_storage_backlog_size`: The total backlog size of the topics of
> > > this topic owned by this broker (in bytes).
> > > `pulsar_storage_backlog_quota_limit`: The total amount of the data in
> > > this topic that limits the backlog quota (bytes).
> > > `pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in
> > > time (seconds). (This metric does not exist in the docs; they need to
> > > be improved.)
> > >
> > >
> > > We just need to add a new topic-level metric named
> > > `pulsar_storage_earliest_msg_publish_time_in_backlog`
> > > that indicates the publish time of the earliest message in the backlog.
> > > Users could then get `pulsar_backlog_size_quota_used_percentage` by
> > > dividing `pulsar_storage_backlog_size` by
> > > `pulsar_storage_backlog_quota_limit`
> > > (`pulsar_storage_backlog_size` / `pulsar_storage_backlog_quota_limit`),
> > > and `pulsar_backlog_time_quota_used_percentage` by dividing
> > > `now - pulsar_storage_earliest_msg_publish_time_in_backlog` by
> > > `pulsar_storage_backlog_quota_limit_time`
> > > (`(now - pulsar_storage_earliest_msg_publish_time_in_backlog)` /
> > > `pulsar_storage_backlog_quota_limit_time`).
> > >
> >
> > I think there is a problem with the name
> > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the topic-level:
> > * First, I prefer exposing the age rather than the publish time.
> > * Second, it's a bit hard to figure out the meaning of the earliest msg
> >   in the backlog.
> >
> > Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you can
> > write: "The age (time passed since it was published) of the earliest
> > unacknowledged message based on the topic's
> > existing subscriptions" ?
> >
> >
> >
> > >
> > > The backlog quota time checker runs periodically, so we can cache its
> > > result, so it won't lead to much costs.
> > >
> > > Pulsar also exposes subscription-level `backlogSize` and
> > > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > if `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> > > We can also expose `backlogQuotaLimitSize` and `backlogQuotaLimitTime`
> > > of the topic to PulsarAdmin.
> > >
> >
> > What is the relationship you see between Pulsar exposing
> > subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> > subscription, to exposing the backlog quota limits in pulsar admin?
> >
> > Limits can be exposed to Pulsar Admin, since it has 0 cost associated
> > with it.
> > I think it's a good idea to do that.
> > The quota usage can also be exposed to pulsar admin, since we pull that
> > data from the backlog quota checker cache, so it has 0 cost as well.
> >
> > As we said in a previous email, we can also expose
> > `backlogQuotaTimeOldestBacklogAgeSubscriptionName`.
> >
> >
> > >
> > > After users receive the backlog alert from a metrics alerting system,
> > > they can get the topic name and then request Topics#getStats
> > > <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> > > to find which subscriptions have a huge backlog.
> > >
> > >
> > I agree users can use PulsarAdmin getStats for the topic, with
> > getEarliestTimeInBacklog=true to find the oldest subscription responsible
> > for exceeding quota, but we can give them that information with 0 cost
> > since we already have that subscription name cached (we spent the I/O to
> > find out who that subscription is, let's just cache it and provide it).
> >
> >
> >
> >
> > > Thanks,
> > > Tao Jiuming
> > >
> > > On Wed, Mar 1, 2023 at 11:42 PM Asaf Mesika <as...@gmail.com> wrote:
> > >
> > > > >
> > > > > Pulsar has 2 configurations for the backlog eviction
> > > > > <
> > > >
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > > >
> > > > > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > > > > By default, backlog eviction is disabled, and also, there is a
> field
> > > > named
> > > > > backlogQuotaMap in TopicPolicies
> > > > > <
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > > >
> > > > > /NamespaceSpacePolicies
> > > > > <
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > > >
> > > > assists
> > > > > in controlling Topic/Namespace level backlog quota.
> > > > >
> > > > > If topic backlog reaches the threshold of any item, backlog
> eviction
> > > will
> > > > > be triggered, Pulsar will move subscription's cursor to skip
> > > > unacknowledged
> > > > > messages.
> > > > >
> > > > > Before backlog eviction happens, we don't have a metric to monitor
> > how
> > > > > long that it can reaches the threshold.
> > > > >
> > > >
> > > > I  think you should fix this explanation:
> > > >
> > > > In Pulsar, a subscription maintains a state of message acknowledged.
> A
> > > > subscription backlog is the set of messages which are unacknowledged.
> > > > A subscription backlog size is the sum of size of unacknowledged
> > messages
> > > > (in bytes).
> > > > A topic can have many subscriptions.
> > > > A topic backlog is defined as the backlog size of the subscription
> > which
> > > > has the oldest unacknowledged message. Since acknowledged messages
> can
> > be
> > > > interleaved with unacknowledged messages, calculating the exact size
> of
> > > > that subscription can be expensive as it requires I/O operations to
> > read
> > > > from the messages from the ledgers.
> > > > For that reason, the topic backlog is actually defined to be the
> > > estimated
> > > > backlog size of that subscription. It does so by summarizing the size
> > of
> > > > all the ledgers, starting from the current active one, up to the
> ledger
> > > > which contains the oldest unacknowledged message (There is actually a
> > > > faster way to calculate it, but this is the definition of the
> > > estimation).
> > > >
> > > > A topic backlog age is the age of the oldest unacknowledged message
> (in
> > > any
> > > > subscription). If that message was written 30 minutes ago, its age is
> > 30
> > > > minutes.
> > > >
> > > > Pulsar has a feature called backlog quota (place link). It allows the
> > > user
> > > > to define a quota - in effect, a limit - which limits the topic
> > backlog.
> > > > There are two types of quotas:
> > > > * Size based: The limit is for the topic backlog size (as we defined
> > > > above).
> > > > * Time based: The limit is for the topic's backlog age (as we defined
> > > > above).
> > > >
> > > > Once a topic backlog exceeds either one of those limits, an action is
> > > taken
> > > > upon messages written to the topic:
> > > > * The producer write is placed on hold for a certain amount of time
> > > before
> > > > failing.
> > > > * The producer write is failed
> > > > * The subscriptions oldest unacknowledged messages will be
> acknowledged
> > > in
> > > > order until both the topic backlog size or age will fall inside the
> > limit
> > > > (quota). The process is called backlog eviction (happens every
> > interval)
> > > >
> > > > The quotas can be defined as a default value for any topic, by using
> > the
> > > > following broker configuration keys: backlogQuotaDefaultLimitBytes ,
> > > > backlogQuotaDefaultLimitSecond. It can also be specified directly for
> > all
> > > > topics in a given namespace using the namespace policy, or a specific
> > > topic
> > > > using a topic policy.
> > > >
> > > > The user today can calculate quota used for size based limit, since
> > there
> > > > are two metrics that are exposed today on a topic level: "
> > > > pulsar_storage_backlog_quota_limit" and
> "pulsar_storage_backlog_size".
> > > You
> > > > can just divide the two to get a percentage.
> > > > For the time-based limit, the only metric exposed today is quota
> itself
> > > , "
> > > > pulsar_storage_backlog_quota_limit_time".
> > > >
> > > > ------------
> > > >
> > > > I would create two metrics:
> > > >
> > > > `pulsar_backlog_size_quota_used_percentage`
> > > > `pulsar_backlog_time_quota_used_percentage`
> > > >
> > > > You would like to know what triggered the alert, hence two.
> > > > It's not the quota percentage, it's the quota used percentage.
> > > >
> > > > ----------
> > > >
> > > > It checks if the backlog size exceeds the threshold(
> > > > > backlogQuotaDefaultLimitBytes), and it gets the current backlog
> size
> > by
> > > > > calculating LedgerInfo
> > > > > <
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > >,
> > > > > it will not lead to I/O.
> > > >
> > > > This is not correct.
> > > > It checks against the topic / namespace policy, and if it doesn't
> > exist,
> > > it
> > > > falls back on the default configuration key mentioned above.
> > > >
> > > > It checks if the backlog time exceeds the threshold(
> > > > > backlogQuotaDefaultLimitSecond). If
> preciseTimeBasedBacklogQuotaCheck
> > > is
> > > > > set to be true, it will read an entry from Bookkeeper, but the
> > default
> > > > > value is false, which means it gets the backlog time by calculating
> > > > > LedgerInfo
> > > > > <
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > > >.
> > > > > So in general, we don't need to worry about it will lead to I/O.
> > > >
> > > >
> > > > I'm afraid of that.
> > > > Today the quota is checked periodically, right? So that's how the
> > > operator
> > > > knows the cost in terms of I/O is limited.
> > > >  Now you are adding one additional I/O per collection, every 1 min by
> > > > default. That's a lot perhaps. How long is the check interval today?
> > > >
> > > > Perhaps in the backlog quota check, you can persist the check result,
> > and
> > > > use it? Persist the age that is.
> > > >
> > > >
> > > > ------
> > > >
> > > > Regarding "slowest_subscription"
> > > > I think the cost is too high, because the subscriptions will keep
> > > > alternating, which can generate so many unique time series. Since
> > > > Prometheus (or any other TSDB) flushes only every 2 hours, it will
> > > > cost you too much.
> > > >
> > > > I suggest exposing the name via the topic stats. This way they can
> > issue
> > > a
> > > > REST call to grab that subscription name only when the alert fires.
> > > >
> > > > Thanks,
> > > >
> > > > Asaf
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:
> > > >
> > > > > Hi Asaf,
> > > > > I've updated the PIP, PTAL
> > > > >
> > > > > Thanks,
> > > > > Tao Jiuming
> > > > >
> > > > > On Sun, Feb 26, 2023 at 11:03 PM Asaf Mesika <as...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > backlogQuotaDefaultLimitBytes and
> backlogQuotaDefaultLimitSecond,
> > > if
> > > > > > > topic backlog reaches the threshold of any item, backlog
> eviction
> > > > will
> > > > > be
> > > > > > > triggered.
> > > > > >
> > > > > > This seems like default values, not the actual values. Can you
> > please
> > > > > > provide an explanation in the PIP and link to read more:
> > > > > > 1. Where do you define the backlog quota exactly? What is the
> > > > granularity
> > > > > > (subscription?)
> > > > > > 2.  Is the backlog quota on by default? If so, what are the
> default
> > > > > values?
> > > > > >
> > > > > >
> > > > > >
> > > > > > *Notes*
> > > > > > 1. When the backlog quota limit is defined in Bytes, and you wish
> > to
> > > > know
> > > > > > how close a subscription is to its bytes limit, you need to
> > calculate
> > > > the
> > > > > > backlog size in bytes. From my understanding, there is an
> accurate
> > > > > > calculation (which is costly in terms of I/O) and there is an
> > > estimate
> > > > of
> > > > > > it. I presume you would want to use the estimated one, is that
> > > correct?
> > > > > > The backlog quota itself, uses the accurate or the estimated when
> > it
> > > > > starts
> > > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > > >
> > > > > > 2. For the backlog limit specifying in time units, there is no
> > > > estimate,
> > > > > as
> > > > > > it must be calculated all the time (earliest unacknowledged
> message
> > > > > > distance from now). How do you plan to calculate the current age
> of
> > > the
> > > > > > earliest message without bearing that I/O cost on each metric
> > > > > calculation?
> > > > > >
> > > > > > 3. In the Goal section, you specify that your goal is to add a
> > > > > "proximity"
> > > > > > metric.
> > > > > > a) You must define that - what is proximity metric exactly? What
> > are
> > > > its
> > > > > > units? How are you planning to calculate it?
> > > > > > b) Proximity is not a good term IMO. I personally have never seen
> > > this
> > > > > term
> > > > > > used in software systems, unless it's in the aviation/space
> > industry.
> > > > > Once
> > > > > > you explain (a) I hope I can help provide alternative names.
> > > > > >
> > > > > > 4. Maybe we should provide the used quota percentage for both
> > limits,
> > > > > > instead of one per both, since it's easier to act upon the alert
> > when
> > > > you
> > > > > > need which one triggered it.
> > > > > >
> > > > > > 5. I didn't understand the "slowest_subscription" label used when
> > > > > > describing the metric label. Can you please provide an
> explanation?
> > > > > >
> > > > > > 6. I suggest writing a "High Level Design" section, and add
> > > everything
> > > > > you
> > > > > > need to know for this proposal, so I don't need to read the
> > > > > > implementation details below (code).
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Asaf
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org>
> wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I've started a PIP to discuss: PIP-248 Add backlog eviction
> > metric
> > > > > > >
> > > > > > > ### Motivation:
> > > > > > >
> > > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > > `backlogQuotaDefaultLimitBytes` and
> > > `backlogQuotaDefaultLimitSecond`,
> > > > > if
> > > > > > > topic backlog reaches the threshold of any item, backlog
> eviction
> > > > will
> > > > > be
> > > > > > > triggered.
> > > > > > >
> > > > > > > Before backlog eviction happens, we don't have a metric to
> > monitor
> > > > how
> > > > > > long
> > > > > > > that it can reaches the threshold.
> > > > > > >
> > > > > > > We can provide a progress bar metric to tell users some topics
> is
> > > > about
> > > > > > to
> > > > > > > trigger backlog eviction. And users can subscribe the alert to
> > > > schedule
> > > > > > > consumers.
> > > > > > >
> > > > > > > For more details, please read the PIP at
> > > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Tao Jiuming
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by 太上玄元道君 <da...@apache.org>.
Hi Asaf,
I've updated the PIP, PTAL

Thanks,
Tao Jiuming

Asaf Mesika <as...@gmail.com> 于2023年3月5日周日 21:00写道:

> On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:
>
> > > I  think you should fix this explanation:
> >
> > Thanks! I would like to copy the context you provide to the PIP
> motivation,
> > your description is more detailed, so developers don't have to go through
> > the code.
> >
>
> Sure
>
>
> >
> > > Today the quota is checked periodically, right? So that's how the
> > operator
> > > knows the cost in terms of I/O is limited.
> > > Now you are adding one additional I/O per collection, every 1 min by
> > > default. That's a lot perhaps. How long is the check interval today?
> >
> > Actually, I don't want to introduce additional costs, I thought we
> > could cache its result, so that it won't introduce additional costs.
> > It may be that I did not make it clear in the PIP and caused this
> > misunderstanding, sorry.
> >
>
> Ok, just to verify: You plan to modify the code that runs periodically the
> backlog quota check, so the result will be cached there? This way when you
> pull that information from that code every 1min to expose it as a metric it
> will have 0 I/O cost?
>
>
>
> >
> > > The user today can calculate quota used for size based limit, since
> there
> > > are two metrics that are exposed today on a topic level: "
> > > pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size".
> > You
> > > can just divide the two to get a percentage.
> > > For the time-based limit, the only metric exposed today is quota
> itself ,
> > "
> > > pulsar_storage_backlog_quota_limit_time".
> >
> > I only noticed `pulsar_storage_backlog_size` but missed
> > `pulsar_storage_backlog_quota_limit` and
> > `pulsar_storage_backlog_quota_limit_time`. Many thanks for your reminder.
> >
> >
> > So, in this condition, we already have the following topic-level metrics:
> > `pulsar_storage_backlog_size`: The total backlog size of the topics of
> this
> > topic owned by this broker (in bytes).
> > `pulsar_storage_backlog_quota_limit`: The total amount of the data in
> this
> > topic that limits the backlog quota (bytes).
> > `pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in
> > time(seconds). (This metric does not exists in the doc, need to improve)
> >
> >
> > We just need to add a new metric named
> > `pulsar_storage_earliest_msg_publish_time_in_backlog` in the topic-level
> > that indicates the publish time of the earliest message in the backlog.
> > So users could get `pulsar_backlog_size_quota_used_percentage` by divide
> > `pulsar_storage_backlog_size ` and
> > `pulsar_storage_backlog_quota_limit`(`pulsar_storage_backlog_size` /
> > `pulsar_storage_backlog_quota_limit`),
> > and could get `pulsar_backlog_time_quota_used_percentage` by divide `now
> -
> > pulsar_storage_earliest_msg_publish_time_in_backlog` and
> > `pulsar_storage_backlog_quota_limit_time` (`now -
> > pulsar_storage_earliest_msg_publish_time_in_backlog` /
> > `pulsar_storage_backlog_quota_limit_time`).
> >
>
> I think there is a problem with the name
> `pulsar_storage_earliest_msg_publish_time_in_backlog` in the topic-level:
> * First, I prefer exposing the age rather than the publish time.
> * Second, it's a bit hard to figure out the meaning of the earliest msg in
> the backlog.
>
> Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you can
> write: "The age (time passed since it was published) of the earliest
> unacknowledged message based on the topic's
> existing subscriptions" ?
>
>
>
> >
> > The backlog quota time checker runs periodically, so we can cache its
> > result, so it won't lead to much costs.
> >
> > Pulsar also exposed subscription-level  `backlogSize` and
> > `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > >
> > if
> > `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> > We can also expose `backlogQuotaLimiteSize` and `backlogQuotaLimitTime`
> of
> > the topic to PulsarAdmin.
> >
>
> What is the relationship you see between Pulsar exposing
> subscriptionBacklogSize and earliestMsgPublishTimeInBacklog in
> subscription, to exposing the backlog quota limits in pulsar admin?
>
> Limits can be exposed to Pulsar Admin, since it has 0 cost associated with
> it.
> I think it's a good idea to do that.
> The quota usage can also be exposed to pulsar admin, since we pull that
> data from the backlog quota checker cache, so it has 0 cost as well.
>
> As we said in previous email we can also expose
> `backlogQuotaTimeOldestBacklogAgeSubscriptionName`
>
>
> >
> > After users receive the backlog alert from metrics alerting systems, they
> > can get the topic name, then, they can request Topics#getStats
> > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> > >
> > to
> > get which subscriptions are in the huge backlog.
> >
> >
> I agree users can use PulsarAdmin getStats for topic , with
> getEarliestTimeInBacklog=true to find the oldest subscription responsible
> for exceeding quota, but we can give them that information with 0 cost
> since we already have that subscription name cached (we spent the I/O to
> find out who that subscription is, let's just cache it and provide it).
>
>
>
>
> > Thanks,
> > Tao Jiuming
> >
> > Asaf Mesika <as...@gmail.com> 于2023年3月1日周三 23:42写道:
> >
> > > >
> > > > Pulsar has 2 configurations for the backlog eviction
> > > > <
> > >
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > > >
> > > > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > > > By default, backlog eviction is disabled, and also, there is a field
> > > named
> > > > backlogQuotaMap in TopicPolicies
> > > > <
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > > >
> > > > /NamespaceSpacePolicies
> > > > <
> > >
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> > >
> > > assists
> > > > in controlling Topic/Namespace level backlog quota.
> > > >
> > > > If topic backlog reaches the threshold of any item, backlog eviction
> > will
> > > > be triggered, Pulsar will move subscription's cursor to skip
> > > unacknowledged
> > > > messages.
> > > >
> > > > Before backlog eviction happens, we don't have a metric to monitor
> how
> > > > long that it can reaches the threshold.
> > > >
> > >
> > > I  think you should fix this explanation:
> > >
> > > In Pulsar, a subscription maintains a state of message acknowledged. A
> > > subscription backlog is the set of messages which are unacknowledged.
> > > A subscription backlog size is the sum of size of unacknowledged
> messages
> > > (in bytes).
> > > A topic can have many subscriptions.
> > > A topic backlog is defined as the backlog size of the subscription
> which
> > > has the oldest unacknowledged message. Since acknowledged messages can
> be
> > > interleaved with unacknowledged messages, calculating the exact size of
> > > that subscription can be expensive as it requires I/O operations to
> read
> > > from the messages from the ledgers.
> > > For that reason, the topic backlog is actually defined to be the
> > estimated
> > > backlog size of that subscription. It does so by summarizing the size
> of
> > > all the ledgers, starting from the current active one, up to the ledger
> > > which contains the oldest unacknowledged message (There is actually a
> > > faster way to calculate it, but this is the definition of the
> > estimation).
> > >
> > > A topic backlog age is the age of the oldest unacknowledged message (in
> > any
> > > subscription). If that message was written 30 minutes ago, its age is
> 30
> > > minutes.
> > >
> > > Pulsar has a feature called backlog quota (place link). It allows the
> > user
> > > to define a quota - in effect, a limit - which limits the topic
> backlog.
> > > There are two types of quotas:
> > > * Size based: The limit is for the topic backlog size (as we defined
> > > above).
> > > * Time based: The limit is for the topic's backlog age (as we defined
> > > above).
> > >
> > > Once a topic backlog exceeds either one of those limits, an action is
> > taken
> > > upon messages written to the topic:
> > > * The producer write is placed on hold for a certain amount of time
> > before
> > > failing.
> > > * The producer write is failed
> > > * The subscriptions oldest unacknowledged messages will be acknowledged
> > in
> > > order until both the topic backlog size or age will fall inside the
> limit
> > > (quota). The process is called backlog eviction (happens every
> interval)
> > >
> > > The quotas can be defined as a default value for any topic, by using
> the
> > > following broker configuration keys: backlogQuotaDefaultLimitBytes ,
> > > backlogQuotaDefaultLimitSecond. It can also be specified directly for
> all
> > > topics in a given namespace using the namespace policy, or a specific
> > topic
> > > using a topic policy.
> > >
> > > The user today can calculate quota used for size based limit, since
> there
> > > are two metrics that are exposed today on a topic level: "
> > > pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size".
> > You
> > > can just divide the two to get a percentage.
> > > For the time-based limit, the only metric exposed today is quota itself
> > , "
> > > pulsar_storage_backlog_quota_limit_time".
> > >
> > > ------------
> > >
> > > I would create two metrics:
> > >
> > > `pulsar_backlog_size_quota_used_percentage`
> > > `pulsar_backlog_time_quota_used_percentage`
> > >
> > > You would like to know what triggered the alert, hence two.
> > > It's not the quota percentage, it's the quota used percentage.
> > >
> > > ----------
> > >
> > > It checks if the backlog size exceeds the threshold(
> > > > backlogQuotaDefaultLimitBytes), and it gets the current backlog size
> by
> > > > calculating LedgerInfo
> > > > <
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > >,
> > > > it will not lead to I/O.
> > >
> > > This is not correct.
> > > It checks against the topic / namespace policy, and if it doesn't
> exist,
> > it
> > > falls back on the default configuration key mentioned above.
> > >
> > > It checks if the backlog time exceeds the threshold(
> > > > backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck
> > is
> > > > set to be true, it will read an entry from Bookkeeper, but the
> default
> > > > value is false, which means it gets the backlog time by calculating
> > > > LedgerInfo
> > > > <
> > >
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > > >.
> > > > So in general, we don't need to worry about it will lead to I/O.
> > >
> > >
> > > I'm afraid of that.
> > > Today the quota is checked periodically, right? So that's how the
> > operator
> > > knows the cost in terms of I/O is limited.
> > >  Now you are adding one additional I/O per collection, every 1 min by
> > > default. That's a lot perhaps. How long is the check interval today?
> > >
> > > Perhaps in the backlog quota check, you can persist the check result,
> and
> > > use it? Persist the age that is.
> > >
> > >
> > > ------
> > >
> > > Regarding "slowest_subscription"
> > > I think the cost is too high, because the subscriptions will keep
> > > alternating, which can generate so many unique time series. Since
> > > Prometheus flush only every 2 hours, or any there TSDB, it will cost
> you
> > > too much.
> > >
> > > I suggest exposing the name via the topic stats. This way they can
> issue
> > a
> > > REST call to grab that subscription name only when the alert fires.
> > >
> > > Thanks,
> > >
> > > Asaf
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:
> > >
> > > > Hi Asaf,
> > > > I've updated the PIP, PTAL
> > > >
> > > > Thank,
> > > > Tao Jiuming
> > > >
> > > > Asaf Mesika <as...@gmail.com> 于2023年2月26日周日 23:03写道:
> > > >
> > > > > Hi,
> > > > >
> > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond,
> > if
> > > > > > topic backlog reaches the threshold of any item, backlog eviction
> > > will
> > > > be
> > > > > > triggered.
> > > > >
> > > > > This seems like default values, not the actual values. Can you
> please
> > > > > provide an explanation in the PIP and link to read more:
> > > > > 1. Where do you define the backlog quota exactly? What is the
> > > granularity
> > > > > (subscription?)
> > > > > 2.  Is the backlog quota on by default? If so, what are the default
> > > > values?
> > > > >
> > > > >
> > > > >
> > > > > *Notes*
> > > > > 1. When the backlog quota limit is defined in Bytes, and you wish
> to
> > > know
> > > > > how close a subscription is to its bytes limit, you need to
> calculate
> > > the
> > > > > backlog size in bytes. From my understanding, there is an accurate
> > > > > calculation (which is costly in terms of I/O) and there is an
> > estimate
> > > of
> > > > > it. I presume you would want to use the estimated one, is that
> > correct?
> > > > > The backlog quota itself, uses the accurate or the estimated when
> it
> > > > starts
> > > > > evicting entries (i.e. marking them as acknowledged)?
> > > > >
> > > > > 2. For the backlog limit specifying in time units, there is no
> > > estimate,
> > > > as
> > > > > it must be calculated all the time (earliest unacknowledged message
> > > > > distance from now). How do you plan to calculate the current age of
> > the
> > > > > earliest message without bearing that I/O cost on each metric
> > > > calculation?
> > > > >
> > > > > 3. In the Goal section, you specify that your goal is to add a
> > > > "proximity"
> > > > > metric.
> > > > > a) You must define that - what is proximity metric exactly? What
> are
> > > its
> > > > > units? How are you planning to calculate it?
> > > > > b) Proximity is not a good term IMO. I personally have never seen
> > this
> > > > term
> > > > > used in software systems, unless it's in the aviation/space
> industry.
> > > > Once
> > > > > you explain (a) I hope I can help provide alternative names.
> > > > >
> > > > > 4. Maybe we should provide the used quota percentage for both
> limits,
> > > > > instead of one per both, since it's easier to act upon the alert
> when
> > > you
> > > > > need which one triggered it.
> > > > >
> > > > > 5. I didn't understand the "slowest_subscription" label used when
> > > > > describing the metric label. Can you please provide an explanation?
> > > > >
> > > > > 6. I suggest writing a "High Level Design" section, and add
> > everything
> > > > you
> > > > > need to know for this proposal, so I don't need to read the
> > > > > implementation details below (code).
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Asaf
> > > > >
> > > > >
> > > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I've started a PIP to discuss: PIP-248 Add backlog eviction
> metric
> > > > > >
> > > > > > ### Motivation:
> > > > > >
> > > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > > `backlogQuotaDefaultLimitBytes` and
> > `backlogQuotaDefaultLimitSecond`,
> > > > if
> > > > > > topic backlog reaches the threshold of any item, backlog eviction
> > > will
> > > > be
> > > > > > triggered.
> > > > > >
> > > > > > Before backlog eviction happens, we don't have a metric to
> monitor
> > > how
> > > > > long
> > > > > > that it can reaches the threshold.
> > > > > >
> > > > > > We can provide a progress bar metric to tell users some topics is
> > > about
> > > > > to
> > > > > > trigger backlog eviction. And users can subscribe the alert to
> > > schedule
> > > > > > consumers.
> > > > > >
> > > > > > For more details, please read the PIP at
> > > > > > https://github.com/apache/pulsar/issues/19601
> > > > > >
> > > > > > Thanks,
> > > > > > Tao Jiuming
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by Asaf Mesika <as...@gmail.com>.
On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君 <da...@apache.org> wrote:

> > I  think you should fix this explanation:
>
> Thanks! I would like to copy the context you provide to the PIP motivation,
> your description is more detailed, so developers don't have to go through
> the code.
>

Sure


>
> > Today the quota is checked periodically, right? So that's how the
> operator
> > knows the cost in terms of I/O is limited.
> > Now you are adding one additional I/O per collection, every 1 min by
> > default. That's a lot perhaps. How long is the check interval today?
>
> Actually, I don't want to introduce additional costs, I thought we
> could cache its result, so that it won't introduce additional costs.
> It may be that I did not make it clear in the PIP and caused this
> misunderstanding, sorry.
>

OK, just to verify: you plan to modify the code that periodically runs the
backlog quota check so that its result is cached there? That way, when you
pull that information every 1 min to expose it as a metric, it will have
zero I/O cost?
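
To make sure we mean the same thing, here is a minimal sketch of that idea.
It is Python for brevity, not Pulsar code, and every name in it
(`CachedBacklogQuotaCheck`, `run_periodic_check`, `backlog_age_seconds`) is
hypothetical. The only assumption is that the periodic check already learns
the publish time of the oldest unacknowledged message:

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class QuotaCheckResult:
    """Snapshot produced by one run of the periodic backlog quota check."""
    oldest_publish_time: float  # epoch seconds of the oldest unacknowledged message
    checked_at: float           # epoch seconds at which the check ran


class CachedBacklogQuotaCheck:
    """Hypothetical holder: the periodic check writes, metric collection only reads."""

    def __init__(self, read_oldest_publish_time: Callable[[], float]):
        # The callable stands in for the (possibly costly) ledger/entry read.
        self._read = read_oldest_publish_time
        self._last: Optional[QuotaCheckResult] = None

    def run_periodic_check(self, now: Optional[float] = None) -> QuotaCheckResult:
        """Called by the quota checker on its own interval; the only place I/O happens."""
        now = time.time() if now is None else now
        self._last = QuotaCheckResult(self._read(), now)
        return self._last

    def backlog_age_seconds(self, now: Optional[float] = None) -> Optional[float]:
        """Called on every metrics scrape; reads the cached snapshot, no I/O."""
        if self._last is None:
            return None  # no check has run yet
        now = time.time() if now is None else now
        return max(0.0, now - self._last.oldest_publish_time)
```

Note the trade-off: between two checks the reported age keeps growing from
the cached publish time, so it can be stale if that message was acknowledged
in the meantime. It is only as fresh as the check interval.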



>
> > The user today can calculate quota used for size based limit, since there
> > are two metrics that are exposed today on a topic level: "
> > pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size".
> You
> > can just divide the two to get a percentage.
> > For the time-based limit, the only metric exposed today is quota itself ,
> "
> > pulsar_storage_backlog_quota_limit_time".
>
> I only noticed `pulsar_storage_backlog_size` but missed
> `pulsar_storage_backlog_quota_limit` and
> `pulsar_storage_backlog_quota_limit_time`. Many thanks for your reminder.
>
>
> So, in this condition, we already have the following topic-level metrics:
> `pulsar_storage_backlog_size`: The total backlog size of the topics of this
> topic owned by this broker (in bytes).
> `pulsar_storage_backlog_quota_limit`: The total amount of the data in this
> topic that limits the backlog quota (bytes).
> `pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in
> time(seconds). (This metric does not exists in the doc, need to improve)
>
>
> We just need to add a new metric named
> `pulsar_storage_earliest_msg_publish_time_in_backlog` in the topic-level
> that indicates the publish time of the earliest message in the backlog.
> So users could get `pulsar_backlog_size_quota_used_percentage` by divide
> `pulsar_storage_backlog_size ` and
> `pulsar_storage_backlog_quota_limit`(`pulsar_storage_backlog_size` /
> `pulsar_storage_backlog_quota_limit`),
> and could get `pulsar_backlog_time_quota_used_percentage` by divide `now -
> pulsar_storage_earliest_msg_publish_time_in_backlog` and
> `pulsar_storage_backlog_quota_limit_time` (`now -
> pulsar_storage_earliest_msg_publish_time_in_backlog` /
> `pulsar_storage_backlog_quota_limit_time`).
>

I think there are two problems with the topic-level metric name
`pulsar_storage_earliest_msg_publish_time_in_backlog`:
* First, I prefer exposing the age rather than the publish time.
* Second, it's a bit hard to figure out what "the earliest msg in the
backlog" means.

Maybe `pulsar_storage_backlog_age_seconds`? In the explanation you can
write: "The age (time passed since it was published) of the earliest
unacknowledged message across the topic's existing subscriptions".
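
With an age metric, both "quota used" percentages become the same simple
division. A minimal sketch in Python (the helper name is hypothetical, and
treating a non-positive limit as "no quota configured" is my assumption):

```python
from typing import Optional


def quota_used_percentage(used: float, limit: float) -> Optional[float]:
    """Return used/limit as a percentage, or None when no quota is configured.

    Assumption: a non-positive limit means the quota is disabled.
    """
    if limit <= 0:
        return None
    return 100.0 * used / limit


# Size quota: pulsar_storage_backlog_size / pulsar_storage_backlog_quota_limit
size_used = quota_used_percentage(used=512.0, limit=1024.0)

# Time quota: pulsar_storage_backlog_age_seconds (proposed name) /
#             pulsar_storage_backlog_quota_limit_time
time_used = quota_used_percentage(used=1800.0, limit=3600.0)
```

Exposing the age directly means the user does not have to compute
`now - publish_time` in the query language before dividing.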



>
> The backlog quota time checker runs periodically, so we can cache its
> result, so it won't lead to much costs.
>
> Pulsar also exposed subscription-level  `backlogSize` and
> `earliestMsgPublishTimeInBacklog` in Pulsar-Admin
> <
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> >
> if
> `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
> We can also expose `backlogQuotaLimiteSize` and `backlogQuotaLimitTime` of
> the topic to PulsarAdmin.
>

What relationship do you see between Pulsar exposing
subscriptionBacklogSize and earliestMsgPublishTimeInBacklog at the
subscription level and exposing the backlog quota limits in Pulsar Admin?

The limits can be exposed in Pulsar Admin, since doing so has zero cost.
I think it's a good idea to do that.
The quota usage can also be exposed in Pulsar Admin: we pull that data from
the backlog quota checker's cache, so it has zero cost as well.

As we said in the previous email, we can also expose
`backlogQuotaTimeOldestBacklogAgeSubscriptionName`.


>
> After users receive the backlog alert from metrics alerting systems, they
> can get the topic name, then, they can request Topics#getStats
> <
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139
> >
> to
> get which subscriptions are in the huge backlog.
>
>
I agree users can call PulsarAdmin getStats for the topic with
getEarliestTimeInBacklog=true to find the subscription responsible for
exceeding the quota (the one with the oldest backlog), but we can give them
that information at zero cost: the quota check already spends the I/O to
find out which subscription that is, so let's just cache its name and
provide it.
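
A minimal sketch of what I mean, again in Python and with hypothetical
names. It assumes the check already has, per subscription, the publish time
of its oldest unacknowledged message; the result is what would be cached and
returned in the topic stats:

```python
from typing import Dict, Optional, Tuple


def oldest_backlog_subscription(
    oldest_unacked_publish_time: Dict[str, float],
) -> Optional[Tuple[str, float]]:
    """Return (subscription name, publish time) for the oldest unacknowledged message.

    The mapping is subscription name -> publish time (epoch seconds) of that
    subscription's oldest unacknowledged message. Subscriptions without a
    backlog are simply absent from the mapping.
    """
    if not oldest_unacked_publish_time:
        return None  # no subscription has a backlog
    name = min(oldest_unacked_publish_time, key=oldest_unacked_publish_time.get)
    return name, oldest_unacked_publish_time[name]
```

Keeping the name in the stats rather than in a metric label also avoids
creating a new time series every time a different subscription becomes the
slowest one.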




> Thanks,
> Tao Jiuming
>
> Asaf Mesika <as...@gmail.com> 于2023年3月1日周三 23:42写道:
>
> > >
> > > Pulsar has 2 configurations for the backlog eviction
> > > <
> >
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> > >
> > > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > > By default, backlog eviction is disabled, and also, there is a field
> > named
> > > backlogQuotaMap in TopicPolicies
> > > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> > >
> > > /NamespaceSpacePolicies
> > > <
> >
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41
> >
> > assists
> > > in controlling Topic/Namespace level backlog quota.
> > >
> > > If topic backlog reaches the threshold of any item, backlog eviction
> will
> > > be triggered, Pulsar will move subscription's cursor to skip
> > unacknowledged
> > > messages.
> > >
> > > Before backlog eviction happens, we don't have a metric to monitor how
> > > long that it can reaches the threshold.
> > >
> >
> > I  think you should fix this explanation:
> >
> > In Pulsar, a subscription maintains a state of message acknowledged. A
> > subscription backlog is the set of messages which are unacknowledged.
> > A subscription backlog size is the sum of size of unacknowledged messages
> > (in bytes).
> > A topic can have many subscriptions.
> > A topic backlog is defined as the backlog size of the subscription which
> > has the oldest unacknowledged message. Since acknowledged messages can be
> > interleaved with unacknowledged messages, calculating the exact size of
> > that subscription can be expensive as it requires I/O operations to read
> > from the messages from the ledgers.
> > For that reason, the topic backlog is actually defined to be the
> estimated
> > backlog size of that subscription. It does so by summarizing the size of
> > all the ledgers, starting from the current active one, up to the ledger
> > which contains the oldest unacknowledged message (There is actually a
> > faster way to calculate it, but this is the definition of the
> estimation).
> >
> > A topic backlog age is the age of the oldest unacknowledged message (in
> any
> > subscription). If that message was written 30 minutes ago, its age is 30
> > minutes.
> >
> > Pulsar has a feature called backlog quota (place link). It allows the
> user
> > to define a quota - in effect, a limit - which limits the topic backlog.
> > There are two types of quotas:
> > * Size based: The limit is for the topic backlog size (as we defined
> > above).
> > * Time based: The limit is for the topic's backlog age (as we defined
> > above).
> >
> > Once a topic backlog exceeds either one of those limits, an action is
> taken
> > upon messages written to the topic:
> > * The producer write is placed on hold for a certain amount of time
> before
> > failing.
> > * The producer write is failed
> > * The subscriptions oldest unacknowledged messages will be acknowledged
> in
> > order until both the topic backlog size or age will fall inside the limit
> > (quota). The process is called backlog eviction (happens every interval)
> >
> > The quotas can be defined as a default value for any topic, by using the
> > following broker configuration keys: backlogQuotaDefaultLimitBytes ,
> > backlogQuotaDefaultLimitSecond. It can also be specified directly for all
> > topics in a given namespace using the namespace policy, or a specific
> topic
> > using a topic policy.
> >
> > The user today can calculate quota used for size based limit, since there
> > are two metrics that are exposed today on a topic level: "
> > pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size".
> You
> > can just divide the two to get a percentage.
> > For the time-based limit, the only metric exposed today is quota itself
> , "
> > pulsar_storage_backlog_quota_limit_time".
> >
> > ------------
> >
> > I would create two metrics:
> >
> > `pulsar_backlog_size_quota_used_percentage`
> > `pulsar_backlog_time_quota_used_percentage`
> >
> > You would like to know what triggered the alert, hence two.
> > It's not the quota percentage, it's the quota used percentage.
> >
> > ----------
> >
> > It checks if the backlog size exceeds the threshold(
> > > backlogQuotaDefaultLimitBytes), and it gets the current backlog size by
> > > calculating LedgerInfo
> > > <
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > >,
> > > it will not lead to I/O.
> >
> > This is not correct.
> > It checks against the topic / namespace policy, and if it doesn't exist,
> it
> > falls back on the default configuration key mentioned above.
> >
> > It checks if the backlog time exceeds the threshold(
> > > backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck
> is
> > > set to be true, it will read an entry from Bookkeeper, but the default
> > > value is false, which means it gets the backlog time by calculating
> > > LedgerInfo
> > > <
> >
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> > >.
> > > So in general, we don't need to worry about it will lead to I/O.
> >
> >
> > I'm afraid of that.
> > Today the quota is checked periodically, right? So that's how the
> operator
> > knows the cost in terms of I/O is limited.
> >  Now you are adding one additional I/O per collection, every 1 min by
> > default. That's a lot perhaps. How long is the check interval today?
> >
> > Perhaps in the backlog quota check, you can persist the check result, and
> > use it? Persist the age that is.
> >
> >
> > ------
> >
> > Regarding "slowest_subscription"
> > I think the cost is too high, because the subscriptions will keep
> > alternating, which can generate so many unique time series. Since
> > Prometheus flush only every 2 hours, or any there TSDB, it will cost you
> > too much.
> >
> > I suggest exposing the name via the topic stats. This way they can issue
> a
> > REST call to grab that subscription name only when the alert fires.
> >
> > Thanks,
> >
> > Asaf
> >
> >
> >
> >
> >
> > On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:
> >
> > > Hi Asaf,
> > > I've updated the PIP, PTAL
> > >
> > > Thank,
> > > Tao Jiuming
> > >
> > > Asaf Mesika <as...@gmail.com> 于2023年2月26日周日 23:03写道:
> > >
> > > > Hi,
> > > >
> > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond,
> if
> > > > > topic backlog reaches the threshold of any item, backlog eviction
> > will
> > > be
> > > > > triggered.
> > > >
> > > > This seems like default values, not the actual values. Can you please
> > > > provide an explanation in the PIP and link to read more:
> > > > 1. Where do you define the backlog quota exactly? What is the
> > granularity
> > > > (subscription?)
> > > > 2.  Is the backlog quota on by default? If so, what are the default
> > > values?
> > > >
> > > >
> > > >
> > > > *Notes*
> > > > 1. When the backlog quota limit is defined in Bytes, and you wish to
> > know
> > > > how close a subscription is to its bytes limit, you need to calculate
> > the
> > > > backlog size in bytes. From my understanding, there is an accurate
> > > > calculation (which is costly in terms of I/O) and there is an
> estimate
> > of
> > > > it. I presume you would want to use the estimated one, is that
> correct?
> > > > The backlog quota itself, uses the accurate or the estimated when it
> > > starts
> > > > evicting entries (i.e. marking them as acknowledged)?
> > > >
> > > > 2. For the backlog limit specifying in time units, there is no
> > estimate,
> > > as
> > > > it must be calculated all the time (earliest unacknowledged message
> > > > distance from now). How do you plan to calculate the current age of
> the
> > > > earliest message without bearing that I/O cost on each metric
> > > calculation?
> > > >
> > > > 3. In the Goal section, you specify that your goal is to add a
> > > "proximity"
> > > > metric.
> > > > a) You must define that - what is proximity metric exactly? What are
> > its
> > > > units? How are you planning to calculate it?
> > > > b) Proximity is not a good term IMO. I personally have never seen
> this
> > > term
> > > > used in software systems, unless it's in the aviation/space industry.
> > > Once
> > > > you explain (a) I hope I can help provide alternative names.
> > > >
> > > > 4. Maybe we should provide the used quota percentage for both limits,
> > > > instead of one per both, since it's easier to act upon the alert when
> > you
> > > > need which one triggered it.
> > > >
> > > > 5. I didn't understand the "slowest_subscription" label used when
> > > > describing the metric label. Can you please provide an explanation?
> > > >
> > > > 6. I suggest writing a "High Level Design" section, and add
> everything
> > > you
> > > > need to know for this proposal, so I don't need to read the
> > > > implementation details below (code).
> > > >
> > > > Thanks,
> > > >
> > > > Asaf
> > > >
> > > >
> > > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > > > >
> > > > > ### Motivation:
> > > > >
> > > > > Pulsar has 2 configurations for the backlog eviction:
> > > > > `backlogQuotaDefaultLimitBytes` and
> `backlogQuotaDefaultLimitSecond`,
> > > if
> > > > > topic backlog reaches the threshold of any item, backlog eviction
> > will
> > > be
> > > > > triggered.
> > > > >
> > > > > Before backlog eviction happens, we don't have a metric to monitor
> > how
> > > > long
> > > > > that it can reaches the threshold.
> > > > >
> > > > > We can provide a progress bar metric to tell users some topics is
> > about
> > > > to
> > > > > trigger backlog eviction. And users can subscribe the alert to
> > schedule
> > > > > consumers.
> > > > >
> > > > > For more details, please read the PIP at
> > > > > https://github.com/apache/pulsar/issues/19601
> > > > >
> > > > > Thanks,
> > > > > Tao Jiuming
> > > > >
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by 太上玄元道君 <da...@apache.org>.
> I  think you should fix this explanation:

Thanks! I would like to copy the context you provided into the PIP motivation.
Your description is more detailed, so developers won't have to go through
the code.

> Today the quota is checked periodically, right? So that's how the operator
> knows the cost in terms of I/O is limited.
> Now you are adding one additional I/O per collection, every 1 min by
> default. That's a lot perhaps. How long is the check interval today?

Actually, I don't want to introduce additional costs. I thought we could
cache the check result, so that it won't introduce additional costs.
I may not have made that clear in the PIP, which caused this
misunderstanding, sorry.

> The user today can calculate quota used for size based limit, since there
> are two metrics that are exposed today on a topic level:
> "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You
> can just divide the two to get a percentage.
> For the time-based limit, the only metric exposed today is quota itself,
> "pulsar_storage_backlog_quota_limit_time".

I only noticed `pulsar_storage_backlog_size` but missed
`pulsar_storage_backlog_quota_limit` and
`pulsar_storage_backlog_quota_limit_time`. Many thanks for your reminder.


So we already have the following topic-level metrics:
`pulsar_storage_backlog_size`: The total backlog size of this topic owned by
this broker (in bytes).
`pulsar_storage_backlog_quota_limit`: The size-based backlog quota limit for
this topic (in bytes).
`pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in
time (seconds). (This metric does not exist in the docs; they need to be
updated.)


We just need to add a new topic-level metric named
`pulsar_storage_earliest_msg_publish_time_in_backlog`
that indicates the publish time of the earliest message in the backlog.
Users could then get `pulsar_backlog_size_quota_used_percentage` by dividing
`pulsar_storage_backlog_size` by `pulsar_storage_backlog_quota_limit`
(`pulsar_storage_backlog_size` / `pulsar_storage_backlog_quota_limit`),
and `pulsar_backlog_time_quota_used_percentage` by dividing
`now - pulsar_storage_earliest_msg_publish_time_in_backlog` by
`pulsar_storage_backlog_quota_limit_time`
(`(now - pulsar_storage_earliest_msg_publish_time_in_backlog)` /
`pulsar_storage_backlog_quota_limit_time`).
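
To make the two divisions concrete, here is a minimal sketch of the arithmetic. It assumes the proposed `pulsar_storage_earliest_msg_publish_time_in_backlog` is a Unix timestamp in seconds and that a non-positive limit means no quota is configured; both are assumptions for illustration, not confirmed broker behavior, and the function names are hypothetical.

```python
import time


def size_quota_used_pct(backlog_size_bytes, quota_limit_bytes):
    """pulsar_storage_backlog_size / pulsar_storage_backlog_quota_limit, in percent."""
    if quota_limit_bytes <= 0:
        # Assumption: a non-positive limit means no size quota is set.
        return None
    return 100.0 * backlog_size_bytes / quota_limit_bytes


def time_quota_used_pct(earliest_publish_time_s, quota_limit_time_s, now_s=None):
    """(now - earliest publish time) / pulsar_storage_backlog_quota_limit_time, in percent."""
    if quota_limit_time_s <= 0:
        # Assumption: a non-positive limit means no time quota is set.
        return None
    if now_s is None:
        now_s = time.time()
    # Clamp at zero in case of small clock differences.
    age_s = max(0.0, now_s - earliest_publish_time_s)
    return 100.0 * age_s / quota_limit_time_s
```

Note that the subtraction has to happen before the division, which is why the age is computed first.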

The time-based backlog quota checker already runs periodically, so we can
cache its result, and the new metric won't add much cost.
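
The caching idea can be sketched roughly as follows. This is not Pulsar code: `BacklogAgeCache` and `FakeStorage` are hypothetical names, and the sketch only illustrates that the periodic quota check pays for the read once while metric collection reads the stored value.

```python
import threading


class BacklogAgeCache:
    """Holds the last value produced by the periodic backlog quota check."""

    def __init__(self):
        self._lock = threading.Lock()
        self._earliest_publish_time_s = None

    def on_quota_check(self, earliest_publish_time_s):
        # Called by the periodic checker, which already did the (possibly costly) read.
        with self._lock:
            self._earliest_publish_time_s = earliest_publish_time_s

    def read_for_metrics(self):
        # Called on every metric collection; never touches storage.
        with self._lock:
            return self._earliest_publish_time_s


class FakeStorage:
    """Stand-in for the ledger read; counts how often it is hit."""

    def __init__(self, value):
        self.value = value
        self.reads = 0

    def read_earliest_publish_time(self):
        self.reads += 1
        return self.value
```

The trade-off is staleness: the exposed value is at most one check interval old.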

Pulsar also exposes subscription-level `backlogSize` and
`earliestMsgPublishTimeInBacklog` in Pulsar-Admin
<https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
if `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true.
We can also expose `backlogQuotaLimiteSize` and `backlogQuotaLimitTime` of
the topic in PulsarAdmin.

After users receive a backlog alert from their metrics alerting system, they
have the topic name and can then call Topics#getStats
<https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
to find out which subscriptions have a large backlog.
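
As a rough illustration of that last step, the sketch below ranks subscriptions by backlog from an already-fetched stats payload. It assumes the payload is a dict with a `subscriptions` map whose entries carry `backlogSize`; the helper name and the sample shape are made up for the example.

```python
def subscriptions_by_backlog(topic_stats):
    """Return (name, stats) pairs sorted by backlogSize, largest first."""
    subs = topic_stats.get("subscriptions", {})
    return sorted(
        subs.items(),
        key=lambda item: item[1].get("backlogSize", 0),
        reverse=True,
    )


def slowest_subscription(topic_stats):
    """Name of the subscription with the largest backlog, or None if there are none."""
    ranked = subscriptions_by_backlog(topic_stats)
    return ranked[0][0] if ranked else None
```

This keeps the subscription name out of the metric labels and only looks it up when an alert fires.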

Thanks,
Tao Jiuming

Asaf Mesika <as...@gmail.com> wrote on Wed, Mar 1, 2023 at 23:42:

> >
> > Pulsar has 2 configurations for the backlog eviction
> > <
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> >
> > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > By default, backlog eviction is disabled, and also, there is a field
> named
> > backlogQuotaMap in TopicPolicies
> > <
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> >
> > /NamespaceSpacePolicies
> > <
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41>
> assists
> > in controlling Topic/Namespace level backlog quota.
> >
> > If topic backlog reaches the threshold of any item, backlog eviction will
> > be triggered, Pulsar will move subscription's cursor to skip
> unacknowledged
> > messages.
> >
> > Before backlog eviction happens, we don't have a metric to monitor how
> > long that it can reaches the threshold.
> >
>
> I  think you should fix this explanation:
>
> In Pulsar, a subscription maintains a state of message acknowledged. A
> subscription backlog is the set of messages which are unacknowledged.
> A subscription backlog size is the sum of size of unacknowledged messages
> (in bytes).
> A topic can have many subscriptions.
> A topic backlog is defined as the backlog size of the subscription which
> has the oldest unacknowledged message. Since acknowledged messages can be
> interleaved with unacknowledged messages, calculating the exact size of
> that subscription can be expensive as it requires I/O operations to read
> from the messages from the ledgers.
> For that reason, the topic backlog is actually defined to be the estimated
> backlog size of that subscription. It does so by summarizing the size of
> all the ledgers, starting from the current active one, up to the ledger
> which contains the oldest unacknowledged message (There is actually a
> faster way to calculate it, but this is the definition of the estimation).
>
> A topic backlog age is the age of the oldest unacknowledged message (in any
> subscription). If that message was written 30 minutes ago, its age is 30
> minutes.
>
> Pulsar has a feature called backlog quota (place link). It allows the user
> to define a quota - in effect, a limit - which limits the topic backlog.
> There are two types of quotas:
> * Size based: The limit is for the topic backlog size (as we defined
> above).
> * Time based: The limit is for the topic's backlog age (as we defined
> above).
>
> Once a topic backlog exceeds either one of those limits, an action is taken
> upon messages written to the topic:
> * The producer write is placed on hold for a certain amount of time before
> failing.
> * The producer write is failed
> * The subscriptions oldest unacknowledged messages will be acknowledged in
> order until both the topic backlog size or age will fall inside the limit
> (quota). The process is called backlog eviction (happens every interval)
>
> The quotas can be defined as a default value for any topic, by using the
> following broker configuration keys: backlogQuotaDefaultLimitBytes ,
> backlogQuotaDefaultLimitSecond. It can also be specified directly for all
> topics in a given namespace using the namespace policy, or a specific topic
> using a topic policy.
>
> The user today can calculate quota used for size based limit, since there
> are two metrics that are exposed today on a topic level: "
> pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You
> can just divide the two to get a percentage.
> For the time-based limit, the only metric exposed today is quota itself , "
> pulsar_storage_backlog_quota_limit_time".
>
> ------------
>
> I would create two metrics:
>
> `pulsar_backlog_size_quota_used_percentage`
> `pulsar_backlog_time_quota_used_percentage`
>
> You would like to know what triggered the alert, hence two.
> It's not the quota percentage, it's the quota used percentage.
>
> ----------
>
> It checks if the backlog size exceeds the threshold(
> > backlogQuotaDefaultLimitBytes), and it gets the current backlog size by
> > calculating LedgerInfo
> > <
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> >,
> > it will not lead to I/O.
>
> This is not correct.
> It checks against the topic / namespace policy, and if it doesn't exist, it
> falls back on the default configuration key mentioned above.
>
> It checks if the backlog time exceeds the threshold(
> > backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck is
> > set to be true, it will read an entry from Bookkeeper, but the default
> > value is false, which means it gets the backlog time by calculating
> > LedgerInfo
> > <
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> >.
> > So in general, we don't need to worry about it will lead to I/O.
>
>
> I'm afraid of that.
> Today the quota is checked periodically, right? So that's how the operator
> knows the cost in terms of I/O is limited.
>  Now you are adding one additional I/O per collection, every 1 min by
> default. That's a lot perhaps. How long is the check interval today?
>
> Perhaps in the backlog quota check, you can persist the check result, and
> use it? Persist the age that is.
>
>
> ------
>
> Regarding "slowest_subscription"
> I think the cost is too high, because the subscriptions will keep
> alternating, which can generate so many unique time series. Since
> Prometheus flush only every 2 hours, or any there TSDB, it will cost you
> too much.
>
> I suggest exposing the name via the topic stats. This way they can issue a
> REST call to grab that subscription name only when the alert fires.
>
> Thanks,
>
> Asaf
>
>
>
>
>
> On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:
>
> > Hi Asaf,
> > I've updated the PIP, PTAL
> >
> > Thank,
> > Tao Jiuming
> >
> > Asaf Mesika <as...@gmail.com> wrote on Sun, Feb 26, 2023 at 23:03:
> >
> > > Hi,
> > >
> > > Pulsar has 2 configurations for the backlog eviction:
> > > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond, if
> > > > topic backlog reaches the threshold of any item, backlog eviction
> will
> > be
> > > > triggered.
> > >
> > > This seems like default values, not the actual values. Can you please
> > > provide an explanation in the PIP and link to read more:
> > > 1. Where do you define the backlog quota exactly? What is the
> granularity
> > > (subscription?)
> > > 2.  Is the backlog quota on by default? If so, what are the default
> > values?
> > >
> > >
> > >
> > > *Notes*
> > > 1. When the backlog quota limit is defined in Bytes, and you wish to
> know
> > > how close a subscription is to its bytes limit, you need to calculate
> the
> > > backlog size in bytes. From my understanding, there is an accurate
> > > calculation (which is costly in terms of I/O) and there is an estimate
> of
> > > it. I presume you would want to use the estimated one, is that correct?
> > > The backlog quota itself, uses the accurate or the estimated when it
> > starts
> > > evicting entries (i.e. marking them as acknowledged)?
> > >
> > > 2. For the backlog limit specifying in time units, there is no
> estimate,
> > as
> > > it must be calculated all the time (earliest unacknowledged message
> > > distance from now). How do you plan to calculate the current age of the
> > > earliest message without bearing that I/O cost on each metric
> > calculation?
> > >
> > > 3. In the Goal section, you specify that your goal is to add a
> > "proximity"
> > > metric.
> > > a) You must define that - what is proximity metric exactly? What are
> its
> > > units? How are you planning to calculate it?
> > > b) Proximity is not a good term IMO. I personally have never seen this
> > term
> > > used in software systems, unless it's in the aviation/space industry.
> > Once
> > > you explain (a) I hope I can help provide alternative names.
> > >
> > > 4. Maybe we should provide the used quota percentage for both limits,
> > > instead of one per both, since it's easier to act upon the alert when
> you
> > > need which one triggered it.
> > >
> > > 5. I didn't understand the "slowest_subscription" label used when
> > > describing the metric label. Can you please provide an explanation?
> > >
> > > 6. I suggest writing a "High Level Design" section, and add everything
> > you
> > > need to know for this proposal, so I don't need to read the
> > > implementation details below (code).
> > >
> > > Thanks,
> > >
> > > Asaf
> > >
> > >
> > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > > >
> > > > ### Motivation:
> > > >
> > > > Pulsar has 2 configurations for the backlog eviction:
> > > > `backlogQuotaDefaultLimitBytes` and `backlogQuotaDefaultLimitSecond`,
> > if
> > > > topic backlog reaches the threshold of any item, backlog eviction
> will
> > be
> > > > triggered.
> > > >
> > > > Before backlog eviction happens, we don't have a metric to monitor
> how
> > > long
> > > > that it can reaches the threshold.
> > > >
> > > > We can provide a progress bar metric to tell users some topics is
> about
> > > to
> > > > trigger backlog eviction. And users can subscribe the alert to
> schedule
> > > > consumers.
> > > >
> > > > For more details, please read the PIP at
> > > > https://github.com/apache/pulsar/issues/19601
> > > >
> > > > Thanks,
> > > > Tao Jiuming
> > > >
> > >
> >
>

Re: [Discuss] PIP-248: Add backlog eviction metric

Posted by PengHui Li <pe...@apache.org>.
Ah, I forgot this one: "pulsar_storage_backlog_quota_limit".
As Asaf said, users can just divide the two to get a percentage.
I think we don't need to expose more metrics for the size-based backlog
quota. And only exposing the topic-level metrics looks good to me.
Users can get the alert and then check which subscriptions have large
backlogs with Pulsar Admin.

As for the estimated backlog size, that should be OK? The backlog quota
policy is also enforced based on the estimated backlog size.

> I'm afraid of that.
> Today the quota is checked periodically, right? So that's how the operator
> knows the cost in terms of I/O is limited.
> Now you are adding one additional I/O per collection, every 1 min by
> default. That's a lot perhaps. How long is the check interval today?
>
> Perhaps in the backlog quota check, you can persist the check result, and
> use it? Persist the age that is.

I think yes, we don't need to add additional costs here. The broker already
performs the backlog check if the backlog quota is enabled, so we can just
record the last checked value on the topic.

In the same way, we can just expose the time-based lag metric, so that
users can divide the two to get a percentage.

> Regarding "slowest_subscription"
> I think the cost is too high, because the subscriptions will keep
> alternating, which can generate so many unique time series. Since
> Prometheus flush only every 2 hours, or any there TSDB, it will cost you
> too much.
>
> I suggest exposing the name via the topic stats. This way they can issue a
> REST call to grab that subscription name only when the alert fires.

Yes, I totally agree. And we already have the information:
just get the subscription with the max backlog size.

@jiuming I think you'd better copy the context that Asaf provided into the
proposal.
It will help reviewers understand what problems we want to resolve,
and it will give more people the opportunity to join the discussion.

Regards
Penghui

On Wed, Mar 1, 2023 at 11:42 PM Asaf Mesika <as...@gmail.com> wrote:

> >
> > Pulsar has 2 configurations for the backlog eviction
> > <
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> >
> > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > By default, backlog eviction is disabled, and also, there is a field
> named
> > backlogQuotaMap in TopicPolicies
> > <
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> >
> > /NamespaceSpacePolicies
> > <
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41>
> assists
> > in controlling Topic/Namespace level backlog quota.
> >
> > If topic backlog reaches the threshold of any item, backlog eviction will
> > be triggered, Pulsar will move subscription's cursor to skip
> unacknowledged
> > messages.
> >
> > Before backlog eviction happens, we don't have a metric to monitor how
> > long that it can reaches the threshold.
> >
>
> I  think you should fix this explanation:
>
> In Pulsar, a subscription maintains a state of message acknowledged. A
> subscription backlog is the set of messages which are unacknowledged.
> A subscription backlog size is the sum of size of unacknowledged messages
> (in bytes).
> A topic can have many subscriptions.
> A topic backlog is defined as the backlog size of the subscription which
> has the oldest unacknowledged message. Since acknowledged messages can be
> interleaved with unacknowledged messages, calculating the exact size of
> that subscription can be expensive as it requires I/O operations to read
> from the messages from the ledgers.
> For that reason, the topic backlog is actually defined to be the estimated
> backlog size of that subscription. It does so by summarizing the size of
> all the ledgers, starting from the current active one, up to the ledger
> which contains the oldest unacknowledged message (There is actually a
> faster way to calculate it, but this is the definition of the estimation).
>
> A topic backlog age is the age of the oldest unacknowledged message (in any
> subscription). If that message was written 30 minutes ago, its age is 30
> minutes.
>
> Pulsar has a feature called backlog quota (place link). It allows the user
> to define a quota - in effect, a limit - which limits the topic backlog.
> There are two types of quotas:
> * Size based: The limit is for the topic backlog size (as we defined
> above).
> * Time based: The limit is for the topic's backlog age (as we defined
> above).
>
> Once a topic backlog exceeds either one of those limits, an action is taken
> upon messages written to the topic:
> * The producer write is placed on hold for a certain amount of time before
> failing.
> * The producer write is failed
> * The subscriptions oldest unacknowledged messages will be acknowledged in
> order until both the topic backlog size or age will fall inside the limit
> (quota). The process is called backlog eviction (happens every interval)
>
> The quotas can be defined as a default value for any topic, by using the
> following broker configuration keys: backlogQuotaDefaultLimitBytes ,
> backlogQuotaDefaultLimitSecond. It can also be specified directly for all
> topics in a given namespace using the namespace policy, or a specific topic
> using a topic policy.
>
> The user today can calculate quota used for size based limit, since there
> are two metrics that are exposed today on a topic level: "
> pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You
> can just divide the two to get a percentage.
> For the time-based limit, the only metric exposed today is quota itself , "
> pulsar_storage_backlog_quota_limit_time".
>
> ------------
>
> I would create two metrics:
>
> `pulsar_backlog_size_quota_used_percentage`
> `pulsar_backlog_time_quota_used_percentage`
>
> You would like to know what triggered the alert, hence two.
> It's not the quota percentage, it's the quota used percentage.
>
> ----------
>
> It checks if the backlog size exceeds the threshold(
> > backlogQuotaDefaultLimitBytes), and it gets the current backlog size by
> > calculating LedgerInfo
> > <
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> >,
> > it will not lead to I/O.
>
> This is not correct.
> It checks against the topic / namespace policy, and if it doesn't exist, it
> falls back on the default configuration key mentioned above.
>
> It checks if the backlog time exceeds the threshold(
> > backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck is
> > set to be true, it will read an entry from Bookkeeper, but the default
> > value is false, which means it gets the backlog time by calculating
> > LedgerInfo
> > <
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> >.
> > So in general, we don't need to worry about it will lead to I/O.
>
>
> I'm afraid of that.
> Today the quota is checked periodically, right? So that's how the operator
> knows the cost in terms of I/O is limited.
>  Now you are adding one additional I/O per collection, every 1 min by
> default. That's a lot perhaps. How long is the check interval today?
>
> Perhaps in the backlog quota check, you can persist the check result, and
> use it? Persist the age that is.
>
>
> ------
>
> Regarding "slowest_subscription"
> I think the cost is too high, because the subscriptions will keep
> alternating, which can generate so many unique time series. Since
> Prometheus flush only every 2 hours, or any there TSDB, it will cost you
> too much.
>
> I suggest exposing the name via the topic stats. This way they can issue a
> REST call to grab that subscription name only when the alert fires.
>
> Thanks,
>
> Asaf
>
>
>
>
>
> On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <da...@apache.org> wrote:
>
> > Hi Asaf,
> > I've updated the PIP, PTAL
> >
> > Thank,
> > Tao Jiuming
> >
> > Asaf Mesika <as...@gmail.com> wrote on Sun, Feb 26, 2023 at 23:03:
> >
> > > Hi,
> > >
> > > Pulsar has 2 configurations for the backlog eviction:
> > > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond, if
> > > > topic backlog reaches the threshold of any item, backlog eviction
> will
> > be
> > > > triggered.
> > >
> > > This seems like default values, not the actual values. Can you please
> > > provide an explanation in the PIP and link to read more:
> > > 1. Where do you define the backlog quota exactly? What is the
> granularity
> > > (subscription?)
> > > 2.  Is the backlog quota on by default? If so, what are the default
> > values?
> > >
> > >
> > >
> > > *Notes*
> > > 1. When the backlog quota limit is defined in Bytes, and you wish to
> know
> > > how close a subscription is to its bytes limit, you need to calculate
> the
> > > backlog size in bytes. From my understanding, there is an accurate
> > > calculation (which is costly in terms of I/O) and there is an estimate
> of
> > > it. I presume you would want to use the estimated one, is that correct?
> > > The backlog quota itself, uses the accurate or the estimated when it
> > starts
> > > evicting entries (i.e. marking them as acknowledged)?
> > >
> > > 2. For the backlog limit specifying in time units, there is no
> estimate,
> > as
> > > it must be calculated all the time (earliest unacknowledged message
> > > distance from now). How do you plan to calculate the current age of the
> > > earliest message without bearing that I/O cost on each metric
> > calculation?
> > >
> > > 3. In the Goal section, you specify that your goal is to add a
> > "proximity"
> > > metric.
> > > a) You must define that - what is proximity metric exactly? What are
> its
> > > units? How are you planning to calculate it?
> > > b) Proximity is not a good term IMO. I personally have never seen this
> > term
> > > used in software systems, unless it's in the aviation/space industry.
> > Once
> > > you explain (a) I hope I can help provide alternative names.
> > >
> > > 4. Maybe we should provide the used quota percentage for both limits,
> > > instead of one per both, since it's easier to act upon the alert when
> you
> > > need which one triggered it.
> > >
> > > 5. I didn't understand the "slowest_subscription" label used when
> > > describing the metric label. Can you please provide an explanation?
> > >
> > > 6. I suggest writing a "High Level Design" section, and add everything
> > you
> > > need to know for this proposal, so I don't need to read the
> > > implementation details below (code).
> > >
> > > Thanks,
> > >
> > > Asaf
> > >
> > >
> > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <da...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > > >
> > > > ### Motivation:
> > > >
> > > > Pulsar has 2 configurations for the backlog eviction:
> > > > `backlogQuotaDefaultLimitBytes` and `backlogQuotaDefaultLimitSecond`,
> > if
> > > > topic backlog reaches the threshold of any item, backlog eviction
> will
> > be
> > > > triggered.
> > > >
> > > > Before backlog eviction happens, we don't have a metric to monitor
> how
> > > long
> > > > that it can reaches the threshold.
> > > >
> > > > We can provide a progress bar metric to tell users some topics is
> about
> > > to
> > > > trigger backlog eviction. And users can subscribe the alert to
> schedule
> > > > consumers.
> > > >
> > > > For more details, please read the PIP at
> > > > https://github.com/apache/pulsar/issues/19601
> > > >
> > > > Thanks,
> > > > Tao Jiuming
> > > >
> > >
> >
>