Posted to dev@pulsar.apache.org by Enrico Olivelli <eo...@gmail.com> on 2022/07/12 12:57:41 UTC

Tool to calculate accurate backlog of a subscription (considering batching and entry filters)

Hello folks,
there is a long-standing issue about calculating an accurate backlog
in the presence of batch messages [1].
Now that we have PIP-105 (broker-side filters), the problem becomes bigger.

I have sent a patch [2] that adds a feature to "analyze" a
Subscription: it scans the messages not yet dispatched and performs an
accurate calculation:
- count the number of "logical" messages, considering batch messages
- execute the EntryFilter
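
As a rough illustration of the two steps above, here is a minimal Python
sketch. It is not the code from the patch and does not use any Pulsar API:
Entry, ScanResult and analyze_backlog are hypothetical names, and the sketch
assumes the filter decides per entry, so a batch is accepted or rejected as
a whole.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Entry:
    """Hypothetical stand-in for a stored entry (not a Pulsar class)."""
    properties: dict
    num_messages: int = 1  # greater than 1 when the entry is a batch


@dataclass
class ScanResult:
    entries: int = 0
    messages: int = 0
    accepted_messages: int = 0
    rejected_messages: int = 0


def analyze_backlog(entries: Iterable[Entry],
                    entry_filter: Callable[[Entry], bool]) -> ScanResult:
    """Scan the entries not yet dispatched and count logical messages."""
    result = ScanResult()
    for entry in entries:
        result.entries += 1
        # A batch entry contributes all of its messages to the logical count.
        result.messages += entry.num_messages
        # Assumption: the filter decides per entry, not per message.
        if entry_filter(entry):
            result.accepted_messages += entry.num_messages
        else:
            result.rejected_messages += entry.num_messages
    return result


backlog = [
    Entry({"region": "eu"}, num_messages=10),
    Entry({"region": "us"}),
    Entry({"region": "eu"}, num_messages=5),
]
result = analyze_backlog(backlog, lambda e: e.properties.get("region") == "eu")
print(result)
```

In this toy backlog the entry count (3) and the logical message count (16)
already differ, and the filter lowers the number of deliverable messages
further, which is the gap the scan is meant to expose.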

The suggested usage of this feature will be:
- use stats/metrics to monitor the backlog
- if the backlog is bigger than expected, use the new command to
understand what's happening


Thoughts?

Best regards
Enrico


[1] https://github.com/apache/pulsar/issues/10705
[2] https://github.com/apache/pulsar/pull/16545

Re: Tool to calculate accurate backlog of a subscription (considering batching and entry filters)

Posted by Enrico Olivelli <eo...@gmail.com>.
On Wed, Jul 13, 2022 at 04:05, PengHui Li
<pe...@apache.org> wrote:
>
> Oh, I see.
>
> I think we can start with a proposal? I saw the new REST API introduced in
> the PR.
> It's better to follow the proposal process to get explicit approvals.


Sure.
I started this conversation to understand if there is interest.

Let me prepare an official PIP.

Thanks for your feedback
Enrico


Re: Tool to calculate accurate backlog of a subscription (considering batching and entry filters)

Posted by PengHui Li <pe...@apache.org>.
Oh, I see.

I think we can start with a proposal? I saw the new REST API introduced in
the PR.
It's better to follow the proposal process to get explicit approvals.

Thanks,
Penghui

On Tue, Jul 12, 2022 at 10:42 PM Enrico Olivelli <eo...@gmail.com>
wrote:

> On Tue, Jul 12, 2022 at 15:04, PengHui Li
> <pe...@apache.org> wrote:
> >
> > Jiuming pushed a fix earlier, https://github.com/apache/pulsar/pull/14958,
> > which uses the message index to calculate the precise backlog without
> > going through all the messages.
>
> I am sorry, but it does not cover my needs.
> I actually need to run the "EntryFilter" in order to calculate exactly
> which messages are going to be dispatched and which are not.
>
> Think about a topic with tens of different subscriptions, each with a
> different "filter":
> the user wants to know how many messages will actually be delivered to
> a given subscription.
>
> Enrico

Re: Tool to calculate accurate backlog of a subscription (considering batching and entry filters)

Posted by Enrico Olivelli <eo...@gmail.com>.
On Tue, Jul 12, 2022 at 15:04, PengHui Li
<pe...@apache.org> wrote:
>
> Jiuming pushed a fix earlier, https://github.com/apache/pulsar/pull/14958,
> which uses the message index to calculate the precise backlog without
> going through all the messages.

I am sorry, but it does not cover my needs.
I actually need to run the "EntryFilter" in order to calculate exactly
which messages are going to be dispatched and which are not.

Think about a topic with tens of different subscriptions, each with a
different "filter":
the user wants to know how many messages will actually be delivered to
a given subscription.
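
To make the scenario concrete, here is a small Python sketch. It is only a
model and not Pulsar code: the (properties, message count) tuples, the
subscription names and deliverable_per_subscription are all hypothetical,
and for simplicity every subscription is assumed to have the same entries
still pending.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model of an entry: (message properties, messages in the entry).
Entry = Tuple[dict, int]
Filter = Callable[[dict], bool]


def deliverable_per_subscription(backlog: List[Entry],
                                 filters: Dict[str, Filter]) -> Dict[str, int]:
    """For each subscription, count the messages its filter would accept."""
    return {
        name: sum(count for props, count in backlog if accept(props))
        for name, accept in filters.items()
    }


backlog = [({"type": "order"}, 4), ({"type": "refund"}, 1), ({"type": "order"}, 2)]
filters = {
    "orders-sub": lambda p: p.get("type") == "order",
    "refunds-sub": lambda p: p.get("type") == "refund",
    "audit-sub": lambda p: True,  # no filtering: receives everything
}
counts = deliverable_per_subscription(backlog, filters)
print(counts)
```

An index-based count would report the same backlog for all three
subscriptions here, while the number of messages actually delivered differs
for each of them.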

Enrico


Re: Tool to calculate accurate backlog of a subscription (considering batching and entry filters)

Posted by PengHui Li <pe...@apache.org>.
Jiuming pushed a fix earlier, https://github.com/apache/pulsar/pull/14958,
which uses the message index to calculate the precise backlog without
going through all the messages.

Penghui
