Posted to dev@bookkeeper.apache.org by Enrico Olivelli <eo...@gmail.com> on 2023/01/09 12:50:17 UTC

Re: BP-60:Change PCBC limitStatsLogging default value to true

I agree with the change.
I have never used those metrics.

Enrico


On Mon, Dec 26, 2022 at 10:29, Wenbing Shen
<ol...@gmail.com> wrote:
>
> Hi BookKeepers, I've changed the default value of limitStatsLogging from
> false to true:
> BP-60 <https://github.com/apache/bookkeeper/issues/3718>
>
> Motivation
>
> We have an efficient online bookie cluster with hundreds of bookie nodes
> deployed on SSD disks.
>
> We deploy the AutoRecovery cluster and the Bookie cluster separately, so
> that each can be managed independently.
>
> I observed that GC in our AutoRecovery cluster is very frequent. After
> investigation, I found that limitStatsLogging for the BookKeeper client's
> PCBC (PerChannelBookieClient) is disabled by default, so a large number of
> per-channel monitoring metrics are generated. Because the bookie cluster
> has many nodes, this metric data occupies a large amount of heap memory.
>
> A single StringWriter object occupies 16MB of memory, and nearly 70 such
> objects are waiting to be collected by the next GC, occupying more than
> 1GB of heap memory.
>
> Proposal
>
> In my experience, these PCBC monitoring metrics have not been useful; at
> least so far, I have not made effective use of them.
>
> If AutoRecovery and the Bookie run in the same process, these large
> objects will affect the performance and stability of the Bookie cluster.
>
> Since I see no value in having these metrics enabled by default, I suggest
> changing the default value of limitStatsLogging to true.
>
> Everyone can still turn it on or off, but with the current default it is
> difficult for users to discover what effect this parameter has. By the time
> their cluster has grown to hundreds or thousands of nodes and they notice
> the problem, they may need to perform a rolling restart of hundreds to
> thousands of bookies.
>
> At the same time, I observed that in Pulsar the various BookKeeper client
> metrics are turned off by default, because they noticeably affect the
> performance of the Pulsar service. That suggests we should make this
> change, especially for the highly redundant metrics created per channel.
>
> Compatibility, Deprecation, and Migration Plan
>
> Clients that rely on PCBC metrics need to pay attention to this upgrade.
> The change does not affect the client's actual functionality, only the
> metrics data, and users can choose to enable the metrics again.
>
>
> What do you think about it?
>
> Best.
> Wenbing
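
As a rough check of the heap figures quoted above (nearly 70 StringWriter objects of 16MB each), here is a minimal Java sketch. The class and method names are hypothetical and are not part of BookKeeper; the code only multiplies the two numbers from the report and converts the result, assuming 1GB = 1024MB.

```java
// Hypothetical helper, not BookKeeper code: a back-of-the-envelope check of
// the heap retained by metric buffers that are waiting for the next GC.
public class StatsHeapEstimate {

    /** Total megabytes retained by `count` buffers of `sizeMb` megabytes each. */
    public static long retainedMb(long count, long sizeMb) {
        return count * sizeMb;
    }

    /** Converts megabytes to gigabytes, assuming 1 GB = 1024 MB. */
    public static double toGb(long mb) {
        return mb / 1024.0;
    }

    public static void main(String[] args) {
        // Figures taken from the report: about 70 objects of 16MB each.
        long mb = retainedMb(70, 16);
        System.out.println(mb + " MB, or " + toGb(mb) + " GB");
    }
}
```

With the reported figures this comes to 1120MB, a little over 1GB, which is consistent with the "1GB+" stated in the proposal.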

Re: BP-60:Change PCBC limitStatsLogging default value to true

Posted by Hang Chen <ch...@apache.org>.
+1

Thanks,
Hang

On Tue, Jan 10, 2023 at 06:15, Andrey Yegorov <an...@datastax.com> wrote:
>
> +1
>
>
> --
> Andrey Yegorov

Re: BP-60:Change PCBC limitStatsLogging default value to true

Posted by Andrey Yegorov <an...@datastax.com>.
+1


-- 
Andrey Yegorov