You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Asaf Mesika <as...@gmail.com> on 2022/11/07 09:26:06 UTC

Re: [PIP-214][broker]Add broker level metrics statistics and expose to prometheus discussion

I'm reposting my comment from the PR as this requires further discussion to
reach an agreement.

-------

You have decided to re-use the same metric name, but with removing the
namespace dimension. So we would do something like this in case we turn on
topic-level metrics:

pulsar_rate_in{host="brokerA"} - broker-level rate in
pulsar_rate_in{host="brokerA" namespace="ns1" topic="topic10"} - topic
level rate in

This creates duplicates when users will use it.
Say you wanted to see your cluster rate in, you would do something like
sum(pulsar_rate_in), but this sum two things:

   1. Topic level rate in, across all brokers.
   2. Broker level rate in, across all brokers

So, in effect, the rate would be doubled.

The idea is that the metric is divided into one granularity level, be it
(namespace, topic) or (namespace) so that sum by any dimension they wish.
You can't report two granularity levels of any metric under the same name.

Your previous version of adding another metric name and explicitly stating
its broker-level metric pulsar_broker_rate_in makes more sense.

IMO, it must be changed back, unless I'm missing something.

---------


Yang, please note that doing what you suggested


If the user uses the following query, there will be no double rate:
> sum(pulsar_rate_in{namespace=""})



IMO is not a valid user experience for a Pulsar user. People don't do this
stuff (attribute=""), and we don't expect them to know this small detail
when authoring queries.


In theory, this should not be the concern of Pulsar at all - making the
aggregation since it can be done using TSDB like Prometheus recording rules
- but I do understand that emitting so many time series is inherently a
Pulsar issue. I accept this as a workaround for now.

I hope the work researched here
<https://lists.apache.org/thread/comd0o5760fcgc8qn5d0s7bs7c3zs1j3> will
yield a better long-term solution.


Thanks,


Asaf

On Mon, Oct 24, 2022 at 10:06 PM Michael Marshall <mm...@apache.org>
wrote:

> I think this feature is already available out of the box, as Penghui
> suggests. You can get metrics per broker summing them up over the
> `instance` label or the `kubernetes_pod_name` label. These labels are
> added not by the broker, but instead by the prometheus scrape
> definition.
>
> Also, you are correct that there are some metrics that are always 0, e.g.:
>
> pulsar_topics_count{cluster="my-cluster"} 0
>
> The motivation for those metrics is provided here [0], though I'm not
> sure it is good motivation.
>
> Thanks,
> Michael
>
> [0]
> https://github.com/apache/pulsar/blob/6d6665e296e6714801048dee1292e3a07abb5cc1/pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/prometheus/NamespaceStatsAggregator.java#L306-L307
>
> On Sun, Oct 23, 2022 at 8:52 PM PengHui Li <pe...@apache.org> wrote:
> >
> > I support the motivation.
> >
> > When I read the document of Prometheus
> > I see "A label with an empty label value is considered equivalent to a
> > label that does not exist." [0]
> > Can we just keep the value of the topic and namespace label empty?
> > And here is a blog that introduced how to deal with the empty label [1]
> >
> > If this can work. We don't need to add new indicator names.
> > Otherwise, we might need "pulsar_tenant_*" ,"pulsar_namespace_*",
> > "pulsar_cluster_*" in the future.
> >
> > [0] https://prometheus.io/docs/concepts/data_model/
> > [1]
> >
> https://medium.com/pareture/query-results-where-label-is-not-present-in-prometheus-e1176320575d
> >
> > Thanks,
> > Penghui
> >
> > On Sun, Oct 23, 2022 at 6:14 PM Asaf Mesika <as...@gmail.com>
> wrote:
> >
> > > One thing to note: You mentioned a PIP number but I'm not sure this is
> the
> > > right process you follow here for that.
> > >
> > >
> > > On Sun, Oct 23, 2022 at 1:12 PM Asaf Mesika <as...@gmail.com>
> wrote:
> > >
> > > > The suggestion makes sense to me as well. I've also reviewed your PR.
> > > >
> > > > On Sun, Oct 23, 2022 at 8:43 AM Haiting Jiang <
> jianghaiting@gmail.com>
> > > > wrote:
> > > >
> > > >> +1
> > > >> Makes sense to me.
> > > >>
> > > >> Thanks,
> > > >> Haiting
> > > >>
> > > >> On Sat, Oct 22, 2022 at 3:59 AM yang yijun <yy...@gmail.com>
> wrote:
> > > >> >
> > > >> > Hi,I have a suggestion to improve the broker.
> > > >> >
> > > >> > For detailed improvement instructions, please refer to issue:
> > > >> > https://github.com/apache/pulsar/issues/18056
> > > >> >
> > > >> > For detailed source code change, please refer to PR:
> > > >> > https://github.com/apache/pulsar/pull/18116
> > > >>
> > > >
> > >
>

Re: [PIP-214][broker]Add broker level metrics statistics and expose to prometheus discussion

Posted by Michael Marshall <mm...@apache.org>.
I agree with your point, too, Asaf. Thanks for posting it on the mailing list.

- Michael

On Mon, Nov 7, 2022 at 6:38 PM PengHui Li <co...@gmail.com> wrote:
>
> Hi Asaf,
>
> Agree with your point.
> +1 for the new indicator name solution.
>
> Penghui
>
> > On Nov 7, 2022, at 17:26, Asaf Mesika <as...@gmail.com> wrote:
> >
> > I'm reposting my comment from the PR as this requires further discussion to
> > reach an agreement.
> >
> > -------
> >
> > You have decided to re-use the same metric name, but with removing the
> > namespace dimension. So we would do something like this in case we turn on
> > topic-level metrics:
> >
> > pulsar_rate_in{host="brokerA"} - broker-level rate in
> > pulsar_rate_in{host="brokerA" namespace="ns1" topic="topic10"} - topic
> > level rate in
> >
> > This creates duplicates when users will use it.
> > Say you wanted to see your cluster rate in, you would do something like
> > sum(pulsar_rate_in), but this sum two things:
> >
> >   1. Topic level rate in, across all brokers.
> >   2. Broker level rate in, across all brokers
> >
> > So, in effect, the rate would be doubled.
> >
> > The idea is that the metric is divided into one granularity level, be it
> > (namespace, topic) or (namespace) so that sum by any dimension they wish.
> > You can't report two granularity levels of any metric under the same name.
> >
> > Your previous version of adding another metric name and explicitly stating
> > its broker-level metric pulsar_broker_rate_in makes more sense.
> >
> > IMO, it must be changed back, unless I'm missing something.
> >
> > ---------
> >
> >
> > Yang, please note that doing what you suggested
> >
> >
> > If the user uses the following query, there will be no double rate:
> >> sum(pulsar_rate_in{namespace=""})
> >
> >
> >
> > IMO is not a valid user experience for a Pulsar user. People don't do this
> > stuff (attribute=""), and we don't expect them to know this small detail
> > when authoring queries.
> >
> >
> > In theory, this should not be the concern of Pulsar at all - making the
> > aggregation since it can be done using TSDB like Prometheus recording rules
> > - but I do understand that emitting so many time series is inherently a
> > Pulsar issue. I accept this as a workaround for now.
> >
> > I hope the work researched here
> > <https://lists.apache.org/thread/comd0o5760fcgc8qn5d0s7bs7c3zs1j3> will
> > yield a better long-term solution.
> >
> >
> > Thanks,
> >
> >
> > Asaf
> >
> > On Mon, Oct 24, 2022 at 10:06 PM Michael Marshall <mm...@apache.org>
> > wrote:
> >
> >> I think this feature is already available out of the box, as Penghui
> >> suggests. You can get metrics per broker summing them up over the
> >> `instance` label or the `kubernetes_pod_name` label. These labels are
> >> added not by the broker, but instead by the prometheus scrape
> >> definition.
> >>
> >> Also, you are correct that there are some metrics that are always 0, e.g.:
> >>
> >> pulsar_topics_count{cluster="my-cluster"} 0
> >>
> >> The motivation for those metrics is provided here [0], though I'm not
> >> sure it is good motivation.
> >>
> >> Thanks,
> >> Michael
> >>
> >> [0]
> >> https://github.com/apache/pulsar/blob/6d6665e296e6714801048dee1292e3a07abb5cc1/pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/prometheus/NamespaceStatsAggregator.java#L306-L307
> >>
> >> On Sun, Oct 23, 2022 at 8:52 PM PengHui Li <pe...@apache.org> wrote:
> >>>
> >>> I support the motivation.
> >>>
> >>> When I read the document of Prometheus
> >>> I see "A label with an empty label value is considered equivalent to a
> >>> label that does not exist." [0]
> >>> Can we just keep the value of the topic and namespace label empty?
> >>> And here is a blog that introduced how to deal with the empty label [1]
> >>>
> >>> If this can work. We don't need to add new indicator names.
> >>> Otherwise, we might need "pulsar_tenant_*" ,"pulsar_namespace_*",
> >>> "pulsar_cluster_*" in the future.
> >>>
> >>> [0] https://prometheus.io/docs/concepts/data_model/
> >>> [1]
> >>>
> >> https://medium.com/pareture/query-results-where-label-is-not-present-in-prometheus-e1176320575d
> >>>
> >>> Thanks,
> >>> Penghui
> >>>
> >>> On Sun, Oct 23, 2022 at 6:14 PM Asaf Mesika <as...@gmail.com>
> >> wrote:
> >>>
> >>>> One thing to note: You mentioned a PIP number but I'm not sure this is
> >> the
> >>>> right process you follow here for that.
> >>>>
> >>>>
> >>>> On Sun, Oct 23, 2022 at 1:12 PM Asaf Mesika <as...@gmail.com>
> >> wrote:
> >>>>
> >>>>> The suggestion makes sense to me as well. I've also reviewed your PR.
> >>>>>
> >>>>> On Sun, Oct 23, 2022 at 8:43 AM Haiting Jiang <
> >> jianghaiting@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> +1
> >>>>>> Makes sense to me.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Haiting
> >>>>>>
> >>>>>> On Sat, Oct 22, 2022 at 3:59 AM yang yijun <yy...@gmail.com>
> >> wrote:
> >>>>>>>
> >>>>>>> Hi,I have a suggestion to improve the broker.
> >>>>>>>
> >>>>>>> For detailed improvement instructions, please refer to issue:
> >>>>>>> https://github.com/apache/pulsar/issues/18056
> >>>>>>>
> >>>>>>> For detailed source code change, please refer to PR:
> >>>>>>> https://github.com/apache/pulsar/pull/18116
> >>>>>>
> >>>>>
> >>>>
> >>
>

Re: [PIP-214][broker]Add broker level metrics statistics and expose to prometheus discussion

Posted by PengHui Li <co...@gmail.com>.
Hi Asaf,

Agree with your point.
+1 for the new indicator name solution.

Penghui

> On Nov 7, 2022, at 17:26, Asaf Mesika <as...@gmail.com> wrote:
> 
> I'm reposting my comment from the PR as this requires further discussion to
> reach an agreement.
> 
> -------
> 
> You have decided to re-use the same metric name, but with removing the
> namespace dimension. So we would do something like this in case we turn on
> topic-level metrics:
> 
> pulsar_rate_in{host="brokerA"} - broker-level rate in
> pulsar_rate_in{host="brokerA" namespace="ns1" topic="topic10"} - topic
> level rate in
> 
> This creates duplicates when users will use it.
> Say you wanted to see your cluster rate in, you would do something like
> sum(pulsar_rate_in), but this sum two things:
> 
>   1. Topic level rate in, across all brokers.
>   2. Broker level rate in, across all brokers
> 
> So, in effect, the rate would be doubled.
> 
> The idea is that the metric is divided into one granularity level, be it
> (namespace, topic) or (namespace) so that sum by any dimension they wish.
> You can't report two granularity levels of any metric under the same name.
> 
> Your previous version of adding another metric name and explicitly stating
> its broker-level metric pulsar_broker_rate_in makes more sense.
> 
> IMO, it must be changed back, unless I'm missing something.
> 
> ---------
> 
> 
> Yang, please note that doing what you suggested
> 
> 
> If the user uses the following query, there will be no double rate:
>> sum(pulsar_rate_in{namespace=""})
> 
> 
> 
> IMO is not a valid user experience for a Pulsar user. People don't do this
> stuff (attribute=""), and we don't expect them to know this small detail
> when authoring queries.
> 
> 
> In theory, this should not be the concern of Pulsar at all - making the
> aggregation since it can be done using TSDB like Prometheus recording rules
> - but I do understand that emitting so many time series is inherently a
> Pulsar issue. I accept this as a workaround for now.
> 
> I hope the work researched here
> <https://lists.apache.org/thread/comd0o5760fcgc8qn5d0s7bs7c3zs1j3> will
> yield a better long-term solution.
> 
> 
> Thanks,
> 
> 
> Asaf
> 
> On Mon, Oct 24, 2022 at 10:06 PM Michael Marshall <mm...@apache.org>
> wrote:
> 
>> I think this feature is already available out of the box, as Penghui
>> suggests. You can get metrics per broker summing them up over the
>> `instance` label or the `kubernetes_pod_name` label. These labels are
>> added not by the broker, but instead by the prometheus scrape
>> definition.
>> 
>> Also, you are correct that there are some metrics that are always 0, e.g.:
>> 
>> pulsar_topics_count{cluster="my-cluster"} 0
>> 
>> The motivation for those metrics is provided here [0], though I'm not
>> sure it is good motivation.
>> 
>> Thanks,
>> Michael
>> 
>> [0]
>> https://github.com/apache/pulsar/blob/6d6665e296e6714801048dee1292e3a07abb5cc1/pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/prometheus/NamespaceStatsAggregator.java#L306-L307
>> 
>> On Sun, Oct 23, 2022 at 8:52 PM PengHui Li <pe...@apache.org> wrote:
>>> 
>>> I support the motivation.
>>> 
>>> When I read the document of Prometheus
>>> I see "A label with an empty label value is considered equivalent to a
>>> label that does not exist." [0]
>>> Can we just keep the value of the topic and namespace label empty?
>>> And here is a blog that introduced how to deal with the empty label [1]
>>> 
>>> If this can work. We don't need to add new indicator names.
>>> Otherwise, we might need "pulsar_tenant_*" ,"pulsar_namespace_*",
>>> "pulsar_cluster_*" in the future.
>>> 
>>> [0] https://prometheus.io/docs/concepts/data_model/
>>> [1]
>>> 
>> https://medium.com/pareture/query-results-where-label-is-not-present-in-prometheus-e1176320575d
>>> 
>>> Thanks,
>>> Penghui
>>> 
>>> On Sun, Oct 23, 2022 at 6:14 PM Asaf Mesika <as...@gmail.com>
>> wrote:
>>> 
>>>> One thing to note: You mentioned a PIP number but I'm not sure this is
>> the
>>>> right process you follow here for that.
>>>> 
>>>> 
>>>> On Sun, Oct 23, 2022 at 1:12 PM Asaf Mesika <as...@gmail.com>
>> wrote:
>>>> 
>>>>> The suggestion makes sense to me as well. I've also reviewed your PR.
>>>>> 
>>>>> On Sun, Oct 23, 2022 at 8:43 AM Haiting Jiang <
>> jianghaiting@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> +1
>>>>>> Makes sense to me.
>>>>>> 
>>>>>> Thanks,
>>>>>> Haiting
>>>>>> 
>>>>>> On Sat, Oct 22, 2022 at 3:59 AM yang yijun <yy...@gmail.com>
>> wrote:
>>>>>>> 
>>>>>>> Hi,I have a suggestion to improve the broker.
>>>>>>> 
>>>>>>> For detailed improvement instructions, please refer to issue:
>>>>>>> https://github.com/apache/pulsar/issues/18056
>>>>>>> 
>>>>>>> For detailed source code change, please refer to PR:
>>>>>>> https://github.com/apache/pulsar/pull/18116
>>>>>> 
>>>>> 
>>>> 
>>